IEEE Int Workshop Genomic Signal Process Stat. 2012 Dec;2012:70-73. doi: 10.1109/GENSIPS.2012.6507729.

Improving the Flexibility of RNA-Seq Data Analysis Pipelines.

IEEE International Workshop on Genomic Signal Processing and Statistics: [proceedings]

John H Phan, Po-Yen Wu, May D Wang

Affiliations

  1. Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA.
  2. Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
  3. Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA.

PMID: 27536420 PMCID: PMC4985025 DOI: 10.1109/GENSIPS.2012.6507729

Abstract

Accurate quantification of gene or isoform expression with RNA-Seq depends on complete knowledge of the transcriptome. Because a complete genomic annotation does not yet exist, novel isoform discovery is an important component of the RNA-Seq quantification process. Thus, a typical RNA-Seq pipeline includes a transcriptome mapping step to quantify known genes and isoforms, and a reference genome mapping step to discover new genes and isoforms. Several tools implement this approach, but are limited in that they force the use of a single mapping algorithm at both the transcriptome and reference genome mapping stages. The choice of mapping algorithm could affect quantification accuracy on a per-dataset basis. Thus, we describe a method that enables the merging of transcriptome and reference genome mapping stages provided that they conform to the standard SAM/BAM format. This procedure could potentially improve the accuracy of gene or isoform quantification by increasing flexibility when selecting RNA-Seq data analysis pipelines. We demonstrate an example of a flexible RNA-Seq pipeline by assessing its potential for novel isoform discovery and by validating its quantification performance using qRT-PCR.
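The merging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes that both mapping stages emit SAM text records, and it relies on the SAM convention that the second column is the FLAG field, whose 0x4 bit marks an unmapped read. The function names `is_mapped` and `merge_stages`, the "transcriptome alignment wins, genome alignment fills in the rest" policy, and the reference name `tx1` are all hypothetical. Converting transcriptome coordinates to genome coordinates, which a real pipeline would likely need, is left out.

```python
def is_mapped(sam_line: str) -> bool:
    """Return True when the FLAG field (column 2) lacks the 0x4 'unmapped' bit."""
    flag = int(sam_line.split("\t")[1])
    return not (flag & 0x4)


def merge_stages(transcriptome_sam, genome_sam):
    """Merge records from two mapping stages (hypothetical policy, see lead-in).

    Reads mapped at the transcriptome stage keep that alignment; reads that
    were not mapped there fall back to their genome-stage alignment.
    """
    merged = []
    mapped_names = set()

    for line in transcriptome_sam:
        if line.startswith("@"):  # skip SAM header lines
            continue
        if is_mapped(line):
            merged.append(line)
            mapped_names.add(line.split("\t")[0])

    for line in genome_sam:
        if line.startswith("@"):
            continue
        name = line.split("\t")[0]
        if name not in mapped_names and is_mapped(line):
            merged.append(line)

    return merged


# Toy records: r1 maps to a known transcript, r2 maps only to the genome
# (a spliced alignment), r3 maps nowhere.
transcriptome_stage = [
    "@HD\tVN:1.0",
    "r1\t0\ttx1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
genome_stage = [
    "r1\t0\tchr1\t5000\t255\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr2\t700\t255\t20M100N30M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

merged = merge_stages(transcriptome_stage, genome_stage)
for record in merged:
    print("\t".join(record.split("\t")[:4]))
```

Because the sketch only inspects standard SAM fields, the two input lists could come from different mapping algorithms, which is the flexibility the abstract argues for.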

