Front Microbiol. 2017 Nov 15;8:2224. doi: 10.3389/fmicb.2017.02224. eCollection 2017.

Microbiome Datasets Are Compositional: And This Is Not Optional.

Frontiers in microbiology

Gregory B Gloor, Jean M Macklaim, Vera Pawlowsky-Glahn, Juan J Egozcue

Affiliations

  1. Department of Biochemistry, University of Western Ontario, London, ON, Canada.
  2. Departments of Computer Science, Applied Mathematics, and Statistics, Universitat de Girona, Girona, Spain.
  3. Department of Applied Mathematics, Universitat Politècnica de Catalunya, Barcelona, Spain.

PMID: 29187837 PMCID: PMC5695134 DOI: 10.3389/fmicb.2017.02224

Abstract

Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and are being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and to point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.
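
The claim that HTS totals are arbitrary can be illustrated with a minimal sketch. It is not taken from the paper: the read counts are hypothetical, and the centered log-ratio (clr) transform used here is one common compositional tool that the abstract itself does not name. The sketch shows that multiplying every count by the same sequencing-depth factor changes neither the proportions nor the clr values, so the total carries no information about the community.

```python
import math

def closure(counts):
    """Rescale counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts):
    """Centered log-ratio: log of each part minus the mean log of all parts.

    Assumes every count is strictly positive; zeros need separate handling.
    """
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical read counts for three taxa in one sample (assumed values).
shallow = [10, 30, 60]               # 100 reads in total
deep = [c * 50 for c in shallow]     # same community, 5000 reads in total

same_props = all(
    math.isclose(a, b) for a, b in zip(closure(shallow), closure(deep))
)
same_clr = all(
    math.isclose(a, b, abs_tol=1e-12) for a, b in zip(clr(shallow), clr(deep))
)
print(same_props, same_clr)
```

Both comparisons hold for any positive scaling factor, which is why only the ratios between taxa, and not the raw counts, can be interpreted.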

Keywords: Bayesian estimation; compositional data; correlation; count normalization; high-throughput sequencing; microbiota; relative abundance
