BMC Proc. 2007;1:S24. doi: 10.1186/1753-6561-1-s1-s24. Epub 2007 Dec 18.

Efficiency of multiple imputation to test for association in the presence of missing data.

BMC proceedings

Pascal Croiseau, Claire Bardel, Emmanuelle Génin

Affiliations

  1. Université Paris-Sud, UMR-S535, Villejuif, 94817 France.

PMID: 18466521 PMCID: PMC2367517 DOI: 10.1186/1753-6561-1-s1-s24

Abstract

Missing data are an important problem in association studies, particularly with high-density single-nucleotide polymorphism (SNP) maps, because the probability that at least one genotype is missing increases dramatically with the number of markers. One possible strategy is simply to ignore the missing data and use only the complete observations, thereby accepting a substantial decrease in sample size. Using Genetic Analysis Workshop 15 simulated data from which we removed some genotypes to generate different levels of missing data, we show that this strategy can lead not only to an important loss of power to detect association but also to false conclusions regarding the most likely susceptibility site when another marker is in linkage disequilibrium with the disease susceptibility site. We propose a multiple imputation approach to deal with missing data in case-parent trios and evaluate its performance on the same simulated data. We found that our multiple imputation approach has high power to detect association with the susceptibility site even with a large amount of missing data, and that it can identify the susceptibility sites among a set of sites in linkage disequilibrium.
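
The effect of marker density on complete-case analysis can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes that every genotype is missing independently with the same probability, and the function names, the 5% missing rate, and the 500-trio sample size are hypothetical values chosen only for illustration. Under that assumption, the probability that at least one of n genotypes is missing is 1 - (1 - p)^n, and a case-parent trio typed at m markers contributes n = 3m genotypes.

```python
def prob_any_missing(p_missing: float, n_genotypes: int) -> float:
    """Probability that at least one of n_genotypes is missing,
    assuming each genotype is missing independently with rate p_missing."""
    return 1.0 - (1.0 - p_missing) ** n_genotypes


def expected_complete_trios(n_trios: int, p_missing: float, n_markers: int) -> float:
    """Expected number of trios with no missing genotype at any marker.

    A trio has 3 individuals, so it carries 3 * n_markers genotypes
    (illustrative assumption: same independent missing rate for all).
    """
    p_complete = (1.0 - p_missing) ** (3 * n_markers)
    return n_trios * p_complete


if __name__ == "__main__":
    # Hypothetical settings: 500 trios, 5% of genotypes missing.
    for n_markers in (1, 5, 10, 20):
        p_any = prob_any_missing(0.05, 3 * n_markers)
        kept = expected_complete_trios(500, 0.05, n_markers)
        print(f"markers={n_markers:2d}  P(trio incomplete)={p_any:.3f}  "
              f"expected complete trios={kept:.1f}")
```

Running the script prints, for each marker count, the chance that a trio is incomplete and the expected number of trios a complete-case analysis would retain, which shows how quickly the usable sample shrinks as markers are added.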

Publication Types