J Biomed Inform. 2011 Dec;44:S31-S38. doi: 10.1016/j.jbi.2011.04.007. Epub 2011 Apr 29.
Enabling enrichment analysis with the Human Disease Ontology.
Journal of biomedical informatics
Paea LePendu, Mark A Musen, Nigam H Shah
Affiliations
- Stanford Center for Biomedical Informatics Research, 251 Campus Drive, Medical School Office Building, Room X215, Mail Code 5479, Stanford University, Stanford, CA 94305-5479, USA.
PMID: 21550421
PMCID: PMC3392036
Abstract
Advanced statistical methods used to analyze high-throughput data such as gene-expression assays result in long lists of "significant genes." One way to gain insight into the significance of altered expression levels is to determine whether Gene Ontology (GO) terms associated with a particular biological process, molecular function, or cellular component are over- or under-represented in the set of genes deemed significant. This process, referred to as enrichment analysis, profiles a gene set, and is widely used to make sense of the results of high-throughput experiments. Our goal is to develop and apply general enrichment analysis methods to profile other sets of interest, such as patient cohorts from the electronic medical record, using a variety of ontologies including SNOMED CT, MedDRA, RxNorm, and others. Although it is possible to perform enrichment analysis using ontologies other than the GO, a key prerequisite is the availability of a background set of annotations to enable the enrichment calculation. In the case of the GO, this background set is provided by the Gene Ontology Annotations. In the current work, we describe: (i) a general method that uses hand-curated GO annotations as a starting point for creating background datasets for enrichment analysis using other ontologies; and (ii) a gene-disease background annotation set, which enables disease-based enrichment, to demonstrate the feasibility of our method.
Copyright © 2011 Elsevier Inc. All rights reserved.
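The enrichment calculation described in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific statistical test; the one-sided hypergeometric test used here is a common choice for over-representation and is an assumption, as are the function names, the gene-to-terms dictionary format, and the toy identifiers. The background dictionary plays the role of the background annotation set that the abstract identifies as the key prerequisite.

```python
from math import comb


def upper_tail_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N genes, K annotated, n drawn)."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def enrich(study_genes, background):
    """Return {term: (k, K, p)} for every term seen in the study set.

    background: dict mapping gene -> set of ontology terms
    (a hypothetical format chosen for this sketch).
    """
    # Genes absent from the background cannot be tested, so drop them.
    study = set(study_genes) & set(background)
    N, n = len(background), len(study)
    terms = set().union(*(background[g] for g in study)) if study else set()
    results = {}
    for term in terms:
        K = sum(1 for annots in background.values() if term in annots)
        k = sum(1 for g in study if term in background[g])
        results[term] = (k, K, upper_tail_p(k, n, K, N))
    return results


# Toy data: all identifiers are made up for illustration only.
background = {f"gene{i}": set() for i in range(10)}
for g in ("gene0", "gene1", "gene2", "gene3"):
    background[g].add("disease_A")
background["gene9"].add("disease_B")

result = enrich(["gene0", "gene1", "gene2"], background)
# disease_A: 3 of 3 study genes vs 4 of 10 background genes, p = 4/120
print(result["disease_A"])
```

Under-representation would use the lower tail instead, and a real analysis would also need a multiple-testing correction across terms, which this sketch omits.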