
Appl Plant Sci. 2016 Jul 12;4(7). doi: 10.3732/apps.1600016. eCollection 2016 Jul.

HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment.

Applications in plant sciences

Matthew G Johnson, Elliot M Gardner, Yang Liu, Rafael Medina, Bernard Goffinet, A Jonathan Shaw, Nyree J C Zerega, Norman J Wickett

Affiliations

  1. Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA.
  2. Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA; Plant Biology and Conservation, Northwestern University, 2205 Tech Drive, Evanston, Illinois 60208 USA.
  3. Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Storrs, Connecticut 06269 USA.
  4. Department of Biology, Duke University, Box 90338, Durham, North Carolina 27708 USA.

PMID: 27437175 PMCID: PMC4948903 DOI: 10.3732/apps.1600016

Abstract

PREMISE OF THE STUDY: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae).

METHODS AND RESULTS: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus.
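The three phases named above can be illustrated with a toy sketch. This is not HybPiper's code or its command-line interface. Every function name here (`map_reads`, `assemble`, `extract_target`, `run_pipeline`) and the k-mer size are hypothetical, and the simple k-mer matching and greedy overlap merging only stand in for the external aligners and assemblers a real pipeline would call. The sketch shows one plausible shape for the workflow: reads are sorted into per-gene bins, and then each gene is assembled and extracted independently, which is what makes parallel execution straightforward.

```python
from concurrent.futures import ThreadPoolExecutor

K = 8  # k-mer size for the toy matching steps (an assumption, not a HybPiper setting)


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def map_reads(reads, targets):
    """Phase 1 (toy): bin each read under every target gene it shares a k-mer with."""
    index = {gene: kmers(seq) for gene, seq in targets.items()}
    bins = {gene: [] for gene in targets}
    for read in reads:
        read_kmers = kmers(read)
        for gene, gene_kmers in index.items():
            if read_kmers & gene_kmers:
                bins[gene].append(read)
    return bins


def merge(a, b, min_overlap=K):
    """Join b onto a when a suffix of a equals a prefix of b; otherwise return None."""
    for size in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:size]):
            return a + b[size:]
    return None


def assemble(reads):
    """Phase 2 (toy): greedy overlap merging until no pair of contigs can be joined."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    merged = True
    while merged:
        merged = False
        for i in range(len(contigs)):
            for j in range(len(contigs)):
                if i == j:
                    continue
                joined = merge(contigs[i], contigs[j])
                if joined is not None:
                    contigs = [c for n, c in enumerate(contigs) if n not in (i, j)]
                    contigs.append(joined)
                    merged = True
                    break
            if merged:
                break
    return contigs


def extract_target(contigs, target):
    """Phase 3 (toy): keep the contig sharing the most k-mers with the target."""
    if not contigs:
        return None
    target_kmers = kmers(target)
    return max(contigs, key=lambda c: len(kmers(c) & target_kmers))


def run_pipeline(reads, targets, workers=2):
    """Bin reads once, then assemble and extract each gene in parallel."""
    bins = map_reads(reads, targets)

    def process(gene):
        return gene, extract_target(assemble(bins[gene]), targets[gene])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process, targets))
```

Calling `run_pipeline(reads, targets)` with a list of read strings and a dict mapping gene names to target sequences returns a dict with one recovered sequence per gene, or `None` for a gene that attracted no reads. Threads are used only to keep the sketch short; a pipeline that launches external programs per gene would more likely parallelize across processes.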

CONCLUSIONS: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.

Keywords: Hyb-Seq; bioinformatics; phylogenomics; sequence assembly

Publication Types