J Biomed Inform. 2011 Dec;44:S31-S38. doi: 10.1016/j.jbi.2011.04.007. Epub 2011 Apr 29.

Enabling enrichment analysis with the Human Disease Ontology.

Journal of biomedical informatics

Paea LePendu, Mark A Musen, Nigam H Shah

Affiliations

  1. Stanford Center for Biomedical Informatics Research, 251 Campus Drive, Medical School Office Building, Room X215, Mail Code 5479, Stanford University, Stanford, CA 94305-5479, USA. Electronic address: [email protected].
  2. Stanford Center for Biomedical Informatics Research, 251 Campus Drive, Medical School Office Building, Room X215, Mail Code 5479, Stanford University, Stanford, CA 94305-5479, USA.

PMID: 21550421 PMCID: PMC3392036 DOI: 10.1016/j.jbi.2011.04.007

Abstract

Advanced statistical methods used to analyze high-throughput data such as gene-expression assays result in long lists of "significant genes." One way to gain insight into the significance of altered expression levels is to determine whether Gene Ontology (GO) terms associated with a particular biological process, molecular function, or cellular component are over- or under-represented in the set of genes deemed significant. This process, referred to as enrichment analysis, profiles a gene set and is widely used to make sense of the results of high-throughput experiments. Our goal is to develop and apply general enrichment analysis methods to profile other sets of interest, such as patient cohorts from the electronic medical record, using a variety of ontologies including SNOMED CT, MedDRA, RxNorm, and others. Although it is possible to perform enrichment analysis using ontologies other than the GO, a key prerequisite is the availability of a background set of annotations to enable the enrichment calculation. In the case of the GO, this background set is provided by the Gene Ontology Annotations. In the current work, we describe: (i) a general method that uses hand-curated GO annotations as a starting point for creating background datasets for enrichment analysis using other ontologies; and (ii) a gene-disease background annotation set, which enables disease-based enrichment, to demonstrate the feasibility of our method.
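To make the role of the background annotation set concrete, the following is a minimal sketch of an over-representation calculation. It assumes a one-sided hypergeometric test, which is a common choice for enrichment analysis but is not stated in the abstract as the authors' method. The gene names, disease labels, and the function names `hypergeom_upper_tail` and `enrichment` are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n items are drawn without replacement from N,
    of which K carry the term of interest."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment(study_set, background):
    """Score every term seen in the study set against the background.

    background: dict mapping an item (e.g. a gene) to its set of terms.
    This toy structure stands in for a real background annotation set.
    """
    N = len(background)
    # Items without background annotations cannot be scored; drop them.
    study = [item for item in study_set if item in background]
    n = len(study)
    terms = set().union(*(background[item] for item in study))
    scores = {}
    for term in terms:
        K = sum(1 for anns in background.values() if term in anns)
        k = sum(1 for item in study if term in background[item])
        scores[term] = hypergeom_upper_tail(k, K, n, N)
    return scores


# Hypothetical background: six genes annotated with made-up disease labels.
background = {
    "gene1": {"disease_A", "disease_B"},
    "gene2": {"disease_A"},
    "gene3": {"disease_A"},
    "gene4": {"disease_B"},
    "gene5": {"disease_B"},
    "gene6": {"disease_B"},
}

scores = enrichment(["gene1", "gene2", "gene3"], background)
for term, p in sorted(scores.items()):
    print(term, round(p, 4))
```

In this toy example, all three study genes carry `disease_A` while only three of the six background genes do, so that term receives a small p-value. `disease_B` is common in the background and is not flagged. Without a background set, neither `K` nor `N` is defined, so the test cannot be computed at all.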

Copyright © 2011 Elsevier Inc. All rights reserved.

