Silence. 2011 Feb 28;2(1):2. doi: 10.1186/1758-907X-2-2.

Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments.

Kevin P McCormick, Matthew R Willmann, Blake C Meyers

Affiliations

  1. Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA.
  2. Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA.

PMID: 21356093 PMCID: PMC3055805 DOI: 10.1186/1758-907X-2-2

Abstract

Prior to the advent of deep sequencing methods, small RNA (sRNA) discovery depended on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNAs. Large-scale, next-generation sequencing has greatly expanded knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including the choice of sequencing platform, inherent biases that affect sRNA measurements, and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations.
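As a rough illustration of what normalization means here, the sketch below scales raw read counts so that libraries of different sequencing depth can be compared. It uses simple total-count scaling (reads per million), which is only one possible option and is not taken from the review itself; the function name `reads_per_million`, the sequence labels and the toy counts are all hypothetical.

```python
from collections import Counter


def reads_per_million(counts):
    """Scale raw read counts so that the library sums to one million.

    Assumption: plain total-count scaling, chosen for illustration only.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


# Hypothetical toy libraries: sequence label -> raw read count.
# library_b was sequenced ten times more deeply than library_a.
library_a = Counter({"sRNA_1": 50, "sRNA_2": 150})
library_b = Counter({"sRNA_1": 400, "sRNA_2": 1600})

rpm_a = reads_per_million(library_a)
rpm_b = reads_per_million(library_b)

for seq in sorted(library_a):
    print(seq, rpm_a[seq], rpm_b[seq])
```

After scaling, the two libraries are on a common scale, so a difference in the values for a given sequence reflects its share of each library rather than sequencing depth alone. Total-count scaling can be distorted by a few highly abundant sequences, which is one reason alternative normalization schemes exist.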
