Front Genet. 2016 Feb 16;7:15. doi: 10.3389/fgene.2016.00015. eCollection 2016.

Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics.

Frontiers in genetics

Dominic Holland, Yunpeng Wang, Wesley K Thompson, Andrew Schork, Chi-Hua Chen, Min-Tzu Lo, Aree Witoelar, Thomas Werge, Michael O'Donovan, Ole A Andreassen, Anders M Dale

Affiliations

  1. Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA.
  2. Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA; NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway.
  3. Department of Psychiatry, University of California San Diego, La Jolla, CA, USA.
  4. Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Cognitive Sciences, University of California San Diego, La Jolla, CA, USA.
  5. Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Radiology, University of California San Diego, La Jolla, CA, USA.
  6. NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway.
  7. Institute of Biological Psychiatry, MHC, Sct. Hans Hospital and University of Copenhagen, Copenhagen, Denmark.
  8. MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK.
  9. Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA; Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Radiology, University of California San Diego, La Jolla, CA, USA.

PMID: 26909100 PMCID: PMC4754432 DOI: 10.3389/fgene.2016.00015

Abstract

Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large numbers of SNPs. The complexity of the datasets, however, poses a significant challenge to maximizing their utility. This is reflected in a need for better understanding the landscape of z-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities, relying only on summary statistics from GWAS substudies, and a scheme allowing for direct empirical validation. We show that modeling z-scores as a mixture of Gaussians is conceptually appropriate, in particular taking into account ubiquitous non-null effects that are likely in the datasets due to weak linkage disequilibrium with causal SNPs. The four-parameter model allows for estimating the degree of polygenicity of the phenotype and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N = 82,315) and putamen volume (N = 12,596), with approximately 9.3 million SNP z-scores in both cases. We show that, over a broad range of z-scores and sample sizes, the model accurately predicts expectation estimates of true effect sizes and replication probabilities in multistage GWAS designs. We assess the degree to which effect sizes are over-estimated when based on linear-regression association coefficients. 
We estimate the polygenicity of schizophrenia to be 0.037 and that of putamen volume to be 0.001, while the respective sample sizes required to approach fully explaining the chip heritability are 10^6 and 10^5. The model can be extended to incorporate prior knowledge such as pleiotropy and SNP annotation. The current findings suggest that the model is applicable to a broad array of complex phenotypes and will enhance understanding of their genetic architectures.

Keywords: GWAS; Gaussian mixture model; SNP discovery; effect size; heritability; putamen; schizophrenia
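
The mixture-of-Gaussians idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' four-parameter model, whose parameterization is not given in the abstract. It assumes a two-component mixture in which an observed z-score is z = delta + noise, with noise ~ N(0, 1) and the true effect delta equal to zero with probability 1 - pi1 and drawn from N(0, sigma2) with probability pi1. The names pi1, sigma2, posterior_nonnull, and expected_effect are hypothetical and chosen for illustration only.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean Gaussian with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def posterior_nonnull(z, pi1, sigma2):
    """P(SNP is non-null | z) under the assumed two-component mixture.

    Marginally, null z-scores follow N(0, 1) and non-null z-scores
    follow N(0, 1 + sigma2), mixed with weights (1 - pi1) and pi1.
    """
    null = (1.0 - pi1) * normal_pdf(z, 1.0)
    nonnull = pi1 * normal_pdf(z, 1.0 + sigma2)
    return nonnull / (null + nonnull)


def expected_effect(z, pi1, sigma2):
    """Posterior mean E[delta | z] of the true effect, on the z-score scale.

    For a non-null SNP the posterior mean is the usual Gaussian shrinkage
    sigma2 / (1 + sigma2) * z; null SNPs contribute zero.
    """
    shrink = sigma2 / (1.0 + sigma2)
    return posterior_nonnull(z, pi1, sigma2) * shrink * z


if __name__ == "__main__":
    # pi1 = 0.037 borrows the schizophrenia polygenicity estimate quoted in
    # the abstract; sigma2 = 4.0 is an arbitrary illustrative value.
    for z in (2.0, 4.0, 6.0):
        print(z, expected_effect(z, pi1=0.037, sigma2=4.0))
```

Under these assumptions the posterior mean is always smaller in magnitude than the observed z-score, which is one way to see why effect sizes taken directly from association statistics tend to be over-estimated.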

