Hum Hered. 2012;73(1):18-25. doi: 10.1159/000334084. Epub 2011 Dec 30.

Performance of genotype imputations using data from the 1000 Genomes Project.

Human heredity

Yun Ju Sung, Lihua Wang, Tuomo Rankinen, Claude Bouchard, D C Rao

Affiliations

  1. Division of Biostatistics, School of Medicine, Washington University in St. Louis, Mo. 63110, USA.

PMID: 22212296 PMCID: PMC3322630 DOI: 10.1159/000334084

Abstract

Genotype imputation based on 1000 Genomes (1KG) Project data has the advantage of imputing many more SNPs than imputation based on HapMap data. It also provides an opportunity to discover associations with relatively rare variants. Recent investigations are increasingly using 1KG data for genotype imputation, but only limited evaluations of the performance of this approach are available. In this paper, we empirically evaluated imputation performance using 1KG data by comparing the results with those obtained using the widely used HapMap Phase II data. We used three reference panels: two CEU panels of 120 haplotypes each, one from HapMap II and one from 1KG data (June 2010 release), and the EUR panel of 566 haplotypes, also from 1KG data (August 2010 release). The study data comprised 324,607 autosomal Illumina SNPs genotyped in 501 individuals of European ancestry. Our most important finding was that both 1KG reference panels provided much higher imputation yield than the HapMap II panel: more than twice as many SNPs were successfully imputed (6.7 million vs. 2.5 million). Our second most important finding was that accuracy using both 1KG panels was high and almost identical to accuracy using the HapMap II panel. Furthermore, after removing SNPs with MACH Rsq <0.3, accuracy for both rare and low-frequency SNPs was very high and almost identical to accuracy for common SNPs. We found that imputation using the 1KG-EUR panel had advantages in successfully imputing rare, low-frequency and common variants. Our findings suggest that 1KG-based imputation can increase the opportunity to discover significant associations for SNPs across the allele frequency spectrum. Because the 1KG Project is still underway, we expect that later versions will provide even better imputation performance.
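
As an illustration of the quality filter mentioned above, the following minimal sketch counts SNPs that pass the MACH Rsq threshold of 0.3, grouped by allele frequency class. It is not the authors' pipeline. The function names, the (snp_id, maf, rsq) record layout, and the frequency cut-offs (rare below 1%, low frequency from 1% to below 5%, common at 5% or above) are assumptions made for the example; the abstract states only the Rsq threshold.

```python
from collections import Counter

RSQ_THRESHOLD = 0.3  # from the abstract: SNPs with MACH Rsq < 0.3 are removed


def maf_class(maf):
    """Assign a frequency class. The cut-offs are assumed, not from the abstract."""
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low frequency"
    return "common"


def imputation_yield(snps, threshold=RSQ_THRESHOLD):
    """Count SNPs that pass the Rsq filter, grouped by frequency class.

    `snps` is an iterable of (snp_id, maf, rsq) tuples (a hypothetical layout).
    """
    kept = Counter()
    for _snp_id, maf, rsq in snps:
        if rsq >= threshold:
            kept[maf_class(maf)] += 1
    return kept


# Made-up records for illustration only (not data from the study).
example = [
    ("snp1", 0.004, 0.82),
    ("snp2", 0.030, 0.25),  # dropped: Rsq below the threshold
    ("snp3", 0.200, 0.95),
    ("snp4", 0.020, 0.41),
]
counts = imputation_yield(example)
print(dict(counts))
```

Running the same count once per reference panel would give a yield comparison of the kind reported here, broken down by frequency class.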

Copyright © 2011 S. Karger AG, Basel.
