Front Genet. 2014 Apr 29;5:95. doi: 10.3389/fgene.2014.00095. eCollection 2014.

Using previously genotyped controls in genome-wide association studies (GWAS): application to the Stroke Genetics Network (SiGN).

Frontiers in genetics

Braxton D Mitchell, Myriam Fornage, Patrick F McArdle, Yu-Ching Cheng, Sara L Pulit, Quenna Wong, Tushar Dave, Stephen R Williams, Roderick Corriveau, Katrina Gwinn, Kimberly Doheny, Cathy C Laurie, Stephen S Rich, Paul I W de Bakker,

Affiliations

  1. Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA; Veterans Administration Medical Center, Baltimore, MD, USA.
  2. Department of Medicine, University of Texas Health Science Center, Houston, TX, USA.
  3. Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA.
  4. Department of Medical Genetics, University Medical Center Utrecht, Utrecht, Netherlands.
  5. Department of Biostatistics, University of Washington, Seattle, WA, USA.
  6. School of Medicine, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA; School of Medicine, Cardiovascular Research Center, University of Virginia, Charlottesville, VA, USA.
  7. National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA.
  8. Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  9. School of Medicine, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA.
  10. Department of Medical Genetics, University Medical Center Utrecht, Utrecht, Netherlands; Department of Epidemiology, University Medical Center Utrecht, Utrecht, Netherlands.

PMID: 24808905 PMCID: PMC4010766 DOI: 10.3389/fgene.2014.00095

Abstract

Genome-wide association studies (GWAS) are widely applied to identify susceptibility loci for a variety of diseases using genotyping arrays that interrogate known polymorphisms throughout the genome. A particular strength of GWAS is that it is unbiased with respect to specific genomic elements (e.g., coding or regulatory regions of genes), and it has revealed important associations that would never have been suspected based on prior knowledge or assumptions. To date, the SNPs found to be associated with complex human traits tend to have small effect sizes, so very large sample sizes are required to achieve adequate statistical power. To address this need, a number of efficient strategies have emerged for conducting GWAS, including combining results across multiple studies using meta-analysis, collecting cases through electronic health records, and using as controls samples from other studies that have already been genotyped and made publicly available (e.g., through deposition of de-identified data into dbGaP or EGA). In certain scenarios, it may be attractive to use already genotyped controls and divert resources to standardized collection, phenotyping, and genotyping of cases only. This strategy, however, requires that careful attention be paid to the choice of "public controls" and to the comparability of genetic data between cases and the public controls, to ensure that any allele frequency differences observed between groups are attributable to locus-specific effects rather than to a systematic bias due to poor matching (population stratification) or differential genotype calling (batch effects). The goal of this paper is to describe some of the potential pitfalls in using previously genotyped control data. We focus on considerations related to the choice of control groups, the use of different genotyping platforms, and approaches to deal with population stratification when cases and controls are genotyped on different platforms.
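
As an illustration of the allele frequency comparison the abstract refers to, the following is a minimal sketch of a single-SNP allelic chi-square test between cases and controls. It is not the authors' method: the function names and the genotype counts are hypothetical, and the test is a generic 2x2 Pearson chi-square with one degree of freedom. A small p-value here shows only that allele frequencies differ between the groups; it cannot say whether the difference reflects a locus-specific effect, population stratification, or a batch effect, which is the concern raised above.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts (AA, AB, BB) into allele counts (A, B)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_chi_square(case_genotypes, control_genotypes):
    """Pearson chi-square test (1 df) on the 2x2 table of allele counts.

    Each argument is a hypothetical (n_AA, n_AB, n_BB) tuple for one SNP.
    Returns (chi_square_statistic, p_value).
    """
    a_case, b_case = allele_counts(*case_genotypes)
    a_ctrl, b_ctrl = allele_counts(*control_genotypes)
    total = a_case + b_case + a_ctrl + b_ctrl

    # Product of the four marginal totals of the 2x2 table.
    denominator = (
        (a_case + b_case)
        * (a_ctrl + b_ctrl)
        * (a_case + a_ctrl)
        * (b_case + b_ctrl)
    )
    if denominator == 0:
        # Monomorphic SNP or an empty group: no evidence of a difference.
        return 0.0, 1.0

    chi2 = total * (a_case * b_ctrl - a_ctrl * b_case) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts: cases genotyped in-house, controls taken from a
    # public repository and possibly genotyped on a different platform.
    cases = (360, 480, 160)
    public_controls = (250, 500, 250)
    chi2, p = allelic_chi_square(cases, public_controls)
    print(f"chi-square = {chi2:.2f}, p = {p:.3g}")
```

In practice such a test would be run only after quality control and adjustment for ancestry, since the statistic alone does not distinguish a true association from a systematic difference between the case and control data sets.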

Keywords: case-control study; genetic association study; genome-wide association study; population stratification; power
