
Front Genet. 2015 May 19;6:172. doi: 10.3389/fgene.2015.00172. eCollection 2015.

FAST: FAST Analysis of Sequences Toolbox.

Frontiers in genetics

Travis J Lawrence, Kyle T Kauffman, Katherine C H Amrine, Dana L Carper, Raymond S Lee, Peter J Becich, Claudia J Canales, David H Ardell

Affiliations

  1. Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA.
  2. Molecular Cell Biology Unit, School of Natural Sciences, University of California, Merced, Merced, CA, USA.
  3. Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA; Department of Viticulture and Enology, University of California, Davis, Davis, CA, USA.
  4. School of Engineering, University of California, Merced, Merced, CA, USA.
  5. Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA; Molecular Cell Biology Unit, School of Natural Sciences, University of California, Merced, Merced, CA, USA.

PMID: 26042145 PMCID: PMC4437040 DOI: 10.3389/fgene.2015.00172

Abstract

FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
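The record-oriented filtering idea described in the abstract can be illustrated with a short sketch. This is not FAST's implementation (FAST is written in Perl on BioPerl) and does not reproduce any FAST command-line options; it is a minimal Python illustration, under the assumption that a grep-like tool selects whole Multi-FastA records by matching a regular expression against the description line and that a simple analytic such as GC content is then calculated per record. The function names `read_fasta`, `grep_records`, and `gc_content`, and the example sequences, are hypothetical.

```python
import re
from typing import Iterable, Iterator, Tuple

# A record is (description line without the leading '>', sequence).
Record = Tuple[str, str]


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from Multi-FastA text, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def grep_records(records: Iterable[Record], pattern: str,
                 invert: bool = False) -> Iterator[Record]:
    """Keep whole records whose description matches (or, inverted, does not match)."""
    regex = re.compile(pattern)
    for header, seq in records:
        if bool(regex.search(header)) != invert:
            yield header, seq


def gc_content(seq: str) -> float:
    """Fraction of G and C characters in the sequence (0.0 if empty)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Hypothetical example data, not taken from the paper.
EXAMPLE = """>seq1 Homo sapiens tRNA-Ala
GGGGAATTAGCTC
AAGCGG
>seq2 Mus musculus tRNA-Gly
ATATATAT
>seq3 Homo sapiens tRNA-Gly
GCGC
"""


def demo(pattern: str = r"Homo sapiens", invert: bool = False):
    """Filter the example records, then report (identifier, GC content)."""
    records = read_fasta(EXAMPLE.splitlines())
    kept = grep_records(records, pattern, invert=invert)
    return [(header.split()[0], gc_content(seq)) for header, seq in kept]


if __name__ == "__main__":
    for identifier, gc in demo():
        print(f"{identifier}\t{gc:.2f}")
```

Because each step consumes and produces a stream of records, the steps compose in the same way shell pipelines do, which is the design the abstract attributes to the GNU Textutils model.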

Keywords: BioPerl; MultiFASTA; NCBI taxonomy; Unix philosophy; bioinformatic workflow; open source; pipeline; regular expression

