J Cheminform. 2019 Aug 08;11(1):54. doi: 10.1186/s13321-019-0376-1.

Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability.

Journal of cheminformatics

Oliver Laufkötter, Noé Sturm, Jürgen Bajorath, Hongming Chen, Ola Engkvist

Affiliations

  1. Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.
  2. Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany.
  3. Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.
  4. Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany.
  5. Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.

PMID: 31396716 PMCID: PMC6686534 DOI: 10.1186/s13321-019-0376-1

Abstract

This study aims to improve on existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types. This type of descriptor would be applied in an iterative screening scenario for more targeted compound set selection. The HTSFPs were generated from HTS data obtained from PubChem and combined with an ECFP4 structural fingerprint. The resulting bioactivity-structure hybrid (BaSH) fingerprint was benchmarked against the individual ECFP4 and HTSFP fingerprints. Performance was evaluated via retrospective analysis of a subset of the PubChem HTS data. The results showed that the BaSH fingerprint improves both predictive performance and scaffold hopping capability. The BaSH fingerprint identified unique compounds compared to both the ECFP4 and the HTSFP fingerprints, indicating synergistic effects between the two. A feature importance analysis showed that a small subset of the HTSFP features contributes most to the overall performance of the BaSH fingerprint. Because the structural fingerprint supports the prediction, this hybrid approach allows activity prediction even for compounds with only sparse HTSFPs.
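
The abstract does not say how the two fingerprints are combined. As a minimal sketch, assuming the hybrid is a simple concatenation of a binary structural bit vector with a per-assay activity vector, the idea could look as follows. The assay names, the toy 8-bit structural vector, and the choice to encode untested assays as 0 are all illustrative assumptions, not details taken from the study. Note that encoding untested assays as 0 makes them indistinguishable from inactive results.

```python
from typing import Dict, List, Sequence


def build_htsfp(outcomes: Dict[str, int], assay_order: Sequence[str],
                missing: int = 0) -> List[int]:
    """Return one value per assay, in a fixed assay order.

    `outcomes` maps assay name -> 1 (active) or 0 (inactive).
    Assays the compound was never tested in get `missing`
    (an assumption made here for illustration only).
    """
    return [outcomes.get(assay, missing) for assay in assay_order]


def coverage(outcomes: Dict[str, int], assay_order: Sequence[str]) -> float:
    """Fraction of assays with a recorded outcome; low values mean a sparse HTSFP."""
    tested = sum(1 for assay in assay_order if assay in outcomes)
    return tested / len(assay_order)


def build_hybrid(structural_bits: Sequence[int], htsfp: Sequence[int]) -> List[int]:
    """Concatenate structural bits and HTS-derived values into one feature vector."""
    return list(structural_bits) + list(htsfp)


# Hypothetical assay panel and toy compound (not real PubChem data).
ASSAYS = ["assay_A", "assay_B", "assay_C", "assay_D"]
toy_structural_bits = [1, 0, 0, 1, 0, 1, 0, 0]  # stand-in for an ECFP4 bit vector
toy_outcomes = {"assay_B": 1}                   # tested in one assay only

htsfp = build_htsfp(toy_outcomes, ASSAYS)
hybrid = build_hybrid(toy_structural_bits, htsfp)

print(hybrid)
print(coverage(toy_outcomes, ASSAYS))
```

In this toy case the compound has outcomes for only one of four assays, so its HTSFP part is sparse, but the structural part of the hybrid vector still carries information a model could use, which is the supporting effect the abstract describes.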

Keywords: Activity prediction; Circular fingerprints; ECFP; HTSFP; High throughput screening; Machine learning; Random forest; Scaffold hopping
