Front Genet. 2016 Jun 15;7:97. doi: 10.3389/fgene.2016.00097. eCollection 2016.

A Predictive Based Regression Algorithm for Gene Network Selection.

Frontiers in genetics

Stéphane Guerrier, Nabil Mili, Roberto Molinari, Samuel Orso, Marco Avella-Medina, Yanyuan Ma

Affiliations

  1. Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
  2. Research Center for Statistics, Geneva School of Economics and Management, University of Geneva, Geneva, Switzerland.
  3. Department of Statistics, University of South Carolina, Columbia, SC, USA.

PMID: 27379155 PMCID: PMC4908120 DOI: 10.3389/fgene.2016.00097

Abstract

Gene selection has become a common task in most gene expression studies. The objective of such research is often to identify the smallest possible set of genes that can still achieve good predictive performance. To do so, many of the recently proposed classification methods require some form of dimension reduction, ultimately provide a single model as output and, in most cases, rely on the likelihood function to achieve variable selection. We propose a new prediction-based objective function that can be tailored to the requirements of practitioners and can be used to assess and interpret a given problem. Based on cross-validation techniques and the idea of importance sampling, our proposal scans low-dimensional models under the assumption of sparsity and, for each of them, estimates the objective function to assess its predictive power and thereby select among the models. Two applications to cancer data sets and a simulation study show that the proposal compares favorably with competing alternatives such as Elastic Net and Support Vector Machine. Indeed, the proposed method not only selects smaller models with better, or at least comparable, classification errors but also provides a set of selected models instead of a single one, making it possible to construct a network of possible models for a target prediction accuracy level.

Keywords: acute leukemia; biomarker selection; breast cancer; disease classification; genomic networks; model averaging
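
The general idea in the abstract (scan small candidate models, score each by a cross-validated prediction error, and keep every model that reaches the target accuracy rather than a single winner) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: uniform random sampling of gene subsets stands in for the importance-sampling step, a nearest-centroid classifier stands in for the regression model, and all function names, parameters, and the toy data are hypothetical.

```python
import random


def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Assign each test row to the class whose centroid is closest,
    using only the chosen feature (gene) indices."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    preds = []
    for x in test_X:
        dist = {
            label: sum((x[j] - c[k]) ** 2 for k, j in enumerate(features))
            for label, c in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cv_error(X, y, features, k=5):
    """k-fold cross-validated misclassification rate for one candidate model."""
    n = len(X)
    errors = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        preds = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], features)
        errors += sum(p != y[i] for p, i in zip(preds, test_idx))
    return errors / n


def select_models(X, y, max_size=2, n_draws=200, tolerance=0.0, seed=0):
    """Score randomly drawn small models and return every model whose
    cross-validated error is within `tolerance` of the best one found.
    Assumption: uniform sampling replaces the paper's importance sampling."""
    rng = random.Random(seed)
    p = len(X[0])
    scored = {}
    for _ in range(n_draws):
        size = rng.randint(1, max_size)
        features = tuple(sorted(rng.sample(range(p), size)))
        if features not in scored:
            scored[features] = cv_error(X, y, features)
    best = min(scored.values())
    kept = [f for f, e in scored.items() if e <= best + tolerance]
    return sorted(kept, key=lambda f: (len(f), f)), best


# Toy data (hypothetical): 40 samples, 6 "genes"; only gene 0 carries signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[5 * label + data_rng.uniform(-1, 1)] +
     [data_rng.uniform(-1, 1) for _ in range(5)] for label in y]

models, best = select_models(X, y)
print("best CV error:", best)
print("selected models:", models)
```

Returning the whole set of near-optimal models, rather than only the minimizer, mirrors the abstract's point that several small gene sets may reach the same target accuracy; the `tolerance` argument controls how far from the best error a model may be and still be kept.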
