
J Cheminform. 2020 May 29;12(1):39. doi: 10.1186/s13321-020-00443-6.

QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping.

Journal of cheminformatics

C Škuta, I Cortés-Ciriano, W Dehaen, P Kříž, G J P van Westen, I V Tetko, A Bender, D Svozil

Affiliations

  1. CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech Republic.
  2. Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
  3. CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic.
  4. Department of Mathematics, Faculty of Chemical Engineering, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic.
  5. Computational Drug Discovery, Drug Discovery and Safety, LACDR, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands.
  6. Helmholtz Zentrum Muenchen - German Research Center for Environmental Health (GmbH) and BIGCHEM GmbH, Ingolstaedter Landstrasse 1, 85764, Neuherberg, Germany.
  7. CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech Republic.
  8. CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic.

PMID: 33431038 PMCID: PMC7260783 DOI: 10.1186/s13321-020-00443-6

Abstract

An affinity fingerprint is a vector consisting of a compound's affinity or potency against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented, and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024-bit Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~0.65 and ~0.70 for similarity searching, depending on the data set, and ~0.85 for classification) and EF5 (~4.67 and ~5.82 for similarity searching, depending on the data set, and ~2.10 for classification), comparable to those of the Morgan2 fingerprint (similarity searching AUC of ~0.57 and ~0.66 and EF5 of ~4.09 and ~6.41, depending on the data set; classification AUC of ~0.87 and EF5 of ~2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping: it retrieves 1146 of the 1749 existing scaffolds, whereas the Morgan2 fingerprint retrieves only 864.
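
The two retrieval metrics quoted above can be illustrated with a minimal Python sketch. It assumes the commonly used definitions: EF5 as the hit rate among the top 5% of a ranked list divided by the hit rate in the whole list, and AUC as the probability that a randomly chosen active is ranked above a randomly chosen inactive. The abstract does not state the exact formulas used in the paper, and the function names `enrichment_factor` and `roc_auc` are hypothetical.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.05) -> float:
    """Enrichment factor at the top `fraction` of the ranked list.

    Assumed definition: (actives in top / size of top) / (all actives / N).
    `labels` holds 1 for an active compound and 0 for an inactive one.
    """
    n = len(scores)
    n_top = max(1, round(n * fraction))  # keep at least one compound
    # Rank compounds from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    return (hits_top / n_top) / (total_actives / n)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC computed by pairwise comparison; ties count as one half."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    inactives = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not inactives:
        raise ValueError("both actives and inactives are required")
    wins = sum(1.0 if a > i else 0.5 if a == i else 0.0
               for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


if __name__ == "__main__":
    # Toy data (not from the paper): 20 compounds, 2 of them active.
    toy_scores = [float(s) for s in range(20, 0, -1)]
    toy_labels = [1 if i in (0, 10) else 0 for i in range(20)]
    print("EF5:", enrichment_factor(toy_scores, toy_labels))
    print("AUC:", roc_auc(toy_scores, toy_labels))
```

Under these definitions, an EF5 of 1 corresponds to random ranking and an AUC of 0.5 to chance-level discrimination, which is the baseline against which the values above can be read.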

Keywords: Affinity fingerprint; Bioactivity modeling; Biological fingerprint; QSAR; Scaffold hopping; Similarity searching

