PeerJ. 2016 Mar 29;4:e1839. doi: 10.7717/peerj.1839. eCollection 2016.

Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies.

Tom O Delmont, A Murat Eren

Affiliations

  1. Department of Medicine, University of Chicago, Chicago, IL, United States.
  2. Department of Medicine, University of Chicago, Chicago, IL, United States; Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, MA, United States.

PMID: 27069789 PMCID: PMC4824900 DOI: 10.7717/peerj.1839

Abstract

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequately curating assembly results against potential contamination from non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini, and created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds, we could identify and characterize multiple near-complete bacterial genomes from the raw assembly, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and that most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today's microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.

Keywords: Assembly; Contamination; Curation; Genomics; HGT; Visualization
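
The abstract names k-mer frequencies as one of the signals used to separate contaminant scaffolds from the target genome. As a rough illustration of that signal only, the sketch below computes a normalized k-mer frequency profile for a single sequence. It is not the authors' pipeline: the function name `kmer_frequencies`, the default `k=4`, and the choice to skip k-mers containing ambiguous bases are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Hypothetical helper for illustration; k-mers containing any
    character other than A, C, G, or T (for example N) are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all 4**k possible k-mers so every profile has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return {km: 0.0 for km in kmers}
    return {km: counts[km] / total for km in kmers}
```

Scaffolds from the same genome tend to have similar profiles, so comparing such vectors across scaffolds, together with their coverage values, is one plausible way to group them; the clustering step itself is left out of this sketch.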
