BMC Bioinformatics. 2018 Oct 10;19(1):368. doi: 10.1186/s12859-018-2341-9.

Genome-wide analysis of fitness data and its application to improve metabolic models.

Edward Vitkin, Oz Solomon, Sharon Sultan, Zohar Yakhini

Affiliations

  1. Department of Computer Science, Technion, Haifa, Israel.
  2. Faculty of Biotechnology and Food Engineering, Technion, Haifa, Israel.
  3. School of Computer Science, The Interdisciplinary Center, Herzliya, Israel.
  4. School of Computer Science, The Interdisciplinary Center, Herzliya, Israel.
  5. Department of Computer Science, Technion, Haifa, Israel.
  6. School of Computer Science, The Interdisciplinary Center, Herzliya, Israel.

PMID: 30305012 PMCID: PMC6180484 DOI: 10.1186/s12859-018-2341-9

Abstract

BACKGROUND: Synthetic biology and related techniques enable genome-scale, high-throughput investigation of how gene knock-downs/knock-outs and other modifications of genomic sequence affect organism fitness.

RESULTS: We develop statistical and computational pipelines and frameworks for analyzing high-throughput fitness data over a genome-scale set of sequence variants. Analyzing data from a high-throughput knock-down/knock-out bacterial study, we investigate differences in, and determinants of, the effect on fitness in different conditions. Comparing fitness vectors of genes across tens of conditions, we observe that fitness consequences depend strongly on genomic location and more weakly on gene sequence similarity and functional relationships. In analyzing promoter sequences, we identify motifs associated with conditions studied in bacterial media such as Casaminos, D-glucose, Sucrose, and other sugars and amino-acid sources. We also use fitness data to infer genes associated with orphan metabolic reactions in the iJO1366 E. coli metabolic model. To do this, we developed a new computational method that integrates gene fitness and gene expression profiles within a given reaction network neighborhood to associate the reaction with a set of genes that potentially encode the catalyzing proteins. We then apply this approach to predict candidate genes for 107 orphan reactions in iJO1366. Furthermore, we validate our methodology on known reactions using a leave-one-out approach. Specifically, using the top-20 candidates selected based on the combined fitness and expression datasets, we correctly reconstruct 39.7% of the reactions, compared to 33% based on fitness alone, 26% based on expression alone, and 4.02% for a random baseline. Our model improvement results include a novel association of a gene with an orphan cytosine nucleosidation reaction.
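
The abstract does not specify how fitness and expression profiles are combined, so the following is only an illustrative sketch, not the authors' method. It assumes that co-fitness and co-expression are measured as the mean Pearson correlation between a candidate gene and the genes already assigned to neighboring reactions, and that the two scores are merged by a weighted average. The function names, the `weight` parameter, and the toy profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mean_similarity(gene, neighbor_genes, profiles):
    """Mean correlation between one gene and the genes of neighboring reactions."""
    values = [pearson(profiles[gene], profiles[n]) for n in neighbor_genes]
    return sum(values) / len(values)


def rank_candidates(neighbor_genes, candidates, fitness, expression,
                    weight=0.5, top_k=20):
    """Rank candidates by a weighted mix of co-fitness and co-expression.

    Assumption: a simple weighted average; the paper may combine scores differently.
    """
    scores = {}
    for gene in candidates:
        if gene in neighbor_genes:
            continue  # skip genes that already annotate the neighborhood
        cofit = mean_similarity(gene, neighbor_genes, fitness)
        coexp = mean_similarity(gene, neighbor_genes, expression)
        scores[gene] = weight * cofit + (1 - weight) * coexp
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy profiles across four hypothetical conditions (made-up values).
fitness = {
    "nbrA": [0.1, -1.2, 0.3, -0.8],
    "nbrB": [0.0, -1.0, 0.4, -0.9],
    "cand1": [0.2, -1.1, 0.2, -0.7],
    "cand2": [-0.5, 0.6, -0.4, 0.9],
}
expression = {
    "nbrA": [5.0, 7.5, 5.2, 7.0],
    "nbrB": [4.8, 7.9, 5.0, 7.2],
    "cand1": [5.1, 7.4, 5.3, 6.8],
    "cand2": [7.0, 5.0, 6.9, 5.2],
}

ranked = rank_candidates(["nbrA", "nbrB"], ["cand1", "cand2"], fitness, expression)
print(ranked)
```

Under these assumptions, a leave-one-out check would hide the known gene of an annotated reaction, rank all genes against that reaction's neighborhood, and count the reaction as reconstructed when the hidden gene appears among the top-k candidates.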

CONCLUSION: Our pipeline for metabolic modeling shows a clear benefit of using fitness data for predicting genes of orphan reactions. Along with the analysis pipelines we developed, it can be used to analyze similar high-throughput data.

Keywords: Co-expression; Co-fitness; Fitness data; Flux balance analysis (FBA); Metabolic modelling; Orphan reactions
