Front Genet. 2015 May 19;6:172. doi: 10.3389/fgene.2015.00172. eCollection 2015.
FAST: FAST Analysis of Sequences Toolbox.
Frontiers in genetics
Travis J Lawrence, Kyle T Kauffman, Katherine C H Amrine, Dana L Carper, Raymond S Lee, Peter J Becich, Claudia J Canales, David H Ardell
Affiliations
- Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA.
- Molecular Cell Biology Unit, School of Natural Sciences, University of California, Merced, Merced, CA, USA.
- Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA; Department of Viticulture and Enology, University of California, Davis, Davis, CA, USA.
- School of Engineering, University of California, Merced, Merced, CA, USA.
- Quantitative and Systems Biology Program, University of California, Merced, Merced, CA, USA; Molecular Cell Biology Unit, School of Natural Sciences, University of California, Merced, Merced, CA, USA.
PMID: 26042145
PMCID: PMC4437040 DOI: 10.3389/fgene.2015.00172
Abstract
FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
Keywords: BioPerl; MultiFASTA; NCBI taxonomy; Unix philosophy; bioinformatic workflow; open source; pipeline; regular expression
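The record-level filtering and index-range selection described in the abstract can be illustrated with a minimal Python sketch. This is not FAST's Perl/BioPerl implementation and does not reproduce the command-line interface of fasgrep or fascut; the function names (parse_fasta, grep_records, cut_records) and the example sequences are hypothetical, chosen only to show the idea of composing grep-like and cut-like steps over Multi-FastA records.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# A record is (description line without ">", concatenated sequence).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (description, sequence) pairs from Multi-FastA text lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def grep_records(records: Iterable[Record], pattern: str) -> Iterator[Record]:
    """Keep records whose description matches a regular expression (grep-like)."""
    rx = re.compile(pattern)
    return (rec for rec in records if rx.search(rec[0]))


def cut_records(records: Iterable[Record], start: int, end: int) -> Iterator[Record]:
    """Keep sequence positions start..end, 1-based and inclusive (cut-like)."""
    return ((desc, seq[start - 1:end]) for desc, seq in records)


# Made-up example data, for illustration only.
EXAMPLE = """>seq1 Homo sapiens tRNA
ACGTACGT
ACGT
>seq2 Mus musculus rRNA
GGGGCCCC
>seq3 Homo sapiens rRNA
TTTTAAAA
"""

# Compose the steps the way a shell pipeline would chain two tools.
result = list(
    cut_records(grep_records(parse_fasta(EXAMPLE.splitlines()), r"Homo sapiens"), 1, 4)
)
for desc, seq in result:
    print(f">{desc}\n{seq}")
```

Each step consumes and produces the same record stream, which is what allows small single-purpose tools to be chained into a compact workflow.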