Ann Appl Stat. 2011 Sep 01;5(3):1978-2002. doi: 10.1214/11-AOAS463.

INCORPORATING BIOLOGICAL INFORMATION INTO LINEAR MODELS: A BAYESIAN APPROACH TO THE SELECTION OF PATHWAYS AND GENES.

The annals of applied statistics

Francesco C Stingo, Yian A Chen, Mahlet G Tadesse, Marina Vannucci

Affiliations

  1. Department of Statistics, Rice University, Houston, Texas 77005, USA.

PMID: 23667412 PMCID: PMC3650864 DOI: 10.1214/11-AOAS463

Abstract

The vast amount of biological knowledge accumulated over the years has allowed researchers to identify various biochemical interactions and define different families of pathways. There is an increased interest in identifying pathways and pathway elements involved in particular biological processes. Drug discovery efforts, for example, are focused on identifying biomarkers as well as pathways related to a disease. We propose a Bayesian model that addresses this question by incorporating information on pathways and gene networks in the analysis of DNA microarray data. Such information is used to define pathway summaries, specify prior distributions, and structure the MCMC moves to fit the model. We illustrate the method with an application to gene expression data with censored survival outcomes. In addition to identifying markers that would have been missed otherwise and improving prediction accuracy, the integration of existing biological knowledge into the analysis provides a better understanding of underlying molecular processes.

Keywords: Bayesian variable selection; Markov chain Monte Carlo; Markov random field prior; gene expression; pathway selection
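
As an illustration of how a gene network can inform a prior on variable selection, the sketch below evaluates an unnormalized Markov random field log-prior over binary gene-inclusion indicators. The abstract does not state the form of the prior, so the Ising-type expression used here, mu * sum(gamma) + eta * gamma' R gamma, is an assumption, as are the function name `mrf_log_prior`, the parameters `mu` and `eta`, and the toy three-gene network.

```python
import numpy as np


def mrf_log_prior(gamma, adjacency, mu, eta):
    """Unnormalized log-prior of an Ising-type Markov random field.

    gamma     : 0/1 vector, 1 if the gene is included in the model
    adjacency : symmetric 0/1 matrix R, R[i, j] = 1 if genes i and j are linked
    mu        : sparsity parameter (more negative favors fewer genes); assumed
    eta       : smoothness parameter (larger favors linked genes jointly); assumed
    """
    gamma = np.asarray(gamma, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    return mu * gamma.sum() + eta * (gamma @ adjacency @ gamma)


# Hypothetical toy network: gene 0 - gene 1 - gene 2 (a chain).
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])

# Two configurations that each select two genes.
linked = mrf_log_prior([1, 1, 0], adjacency, mu=-2.0, eta=0.5)    # neighbors
unlinked = mrf_log_prior([1, 0, 1], adjacency, mu=-2.0, eta=0.5)  # not neighbors
```

Under these assumed settings the configuration that selects two connected genes receives a higher prior score than the one that selects two unconnected genes, which is the sense in which network information steers selection toward neighboring genes.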
