Open Biol. 2016 Sep;6(9). doi: 10.1098/rsob.160183.

Prediction and validation of protein-protein interactors from genome-wide DNA-binding data using a knowledge-based machine-learning approach.

Open biology

Ashley J Waardenberg, Bernou Homan, Stephanie Mohamed, Richard P Harvey, Romaric Bouveret

Affiliations

  1. Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales 2010, Australia; Children's Medical Research Institute, University of Sydney, Westmead, New South Wales 2145, Australia.
  2. Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales 2010, Australia.
  3. Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales 2010, Australia; St Vincent's Clinical School, University of Sydney, Westmead, New South Wales 2145, Australia; School of Biotechnology and Biomolecular Science, University of New South Wales, Kensington, New South Wales 2052, Australia.
  4. Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales 2010, Australia; St Vincent's Clinical School, University of Sydney, Westmead, New South Wales 2145, Australia.

PMID: 27683156 PMCID: PMC5043580 DOI: 10.1098/rsob.160183

Abstract

The ability to accurately predict the DNA targets and interacting cofactors of transcriptional regulators from genome-wide data can significantly advance our understanding of gene regulatory networks. NKX2-5 is a homeodomain transcription factor that sits high in the cardiac gene regulatory network and is essential for normal heart development. We previously identified genomic targets for NKX2-5 in mouse HL-1 atrial cardiomyocytes using DNA-adenine methyltransferase identification (DamID). Here, we apply machine learning algorithms and propose a knowledge-based feature selection method for predicting NKX2-5 protein : protein interactions based on motif grammar in genome-wide DNA-binding data. We assessed model performance using leave-one-out cross-validation and a completely independent DamID experiment performed with replicates. In addition to identifying previously described NKX2-5-interacting proteins, including GATA, HAND and TBX family members, a number of novel interactors were identified, with direct protein : protein interactions between NKX2-5 and retinoid X receptor (RXR), paired-related homeobox (PRRX) and Ikaros zinc fingers (IKZF) validated using the yeast two-hybrid assay. We also found that the interaction of RXRα with NKX2-5 mutations found in congenital heart disease (Q187H, R189G and R190H) was altered. These findings highlight an intuitive approach to accessing protein-protein interaction information of transcription factors in DNA-binding experiments.
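
The abstract reports that model performance was assessed with leave-one-out cross-validation. As a rough illustration of that evaluation scheme only, the sketch below holds out each sample in turn, trains on the rest, and scores the held-out prediction. It is not the authors' pipeline: the nearest-centroid classifier, the function names and the toy "motif count" features are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def centroid(rows: List[Vector]) -> List[float]:
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_x: List[Vector], train_y: List[int], query: Vector) -> int:
    """Nearest-centroid classifier (a stand-in for any learner)."""
    labels = sorted(set(train_y))
    centroids: Dict[int, List[float]] = {
        lab: centroid([x for x, y in zip(train_x, train_y) if y == lab])
        for lab in labels
    }
    return min(labels, key=lambda lab: sq_dist(centroids[lab], query))


def loocv_accuracy(xs: List[Vector], ys: List[int]) -> float:
    """Leave-one-out CV: each sample is the test set exactly once."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # all samples except the i-th
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical toy data: two made-up motif counts per genomic region,
# label 1 = bound region, label 0 = background region.
features = [[5, 0], [4, 1], [6, 1], [0, 5], [1, 4], [1, 6]]
labels = [1, 1, 1, 0, 0, 0]

accuracy = loocv_accuracy(features, labels)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Leave-one-out is a common choice when the number of labelled samples is small, because every sample contributes to both training and testing; an independent replicate experiment, as the abstract describes, gives a further check that does not reuse the training data.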

© 2016 The Authors.

Keywords: gene regulatory networks; machine learning; protein–protein interactions; transcription factors

Publication Types