Pharmaceutics. 2021 May 26;13(6). doi: 10.3390/pharmaceutics13060794.

Biomedical Text Link Prediction for Drug Discovery: A Case Study with COVID-19.

Kevin McCoy, Sateesh Gudapati, Lawrence He, Elaina Horlander, David Kartchner, Soham Kulkarni, Nidhi Mehra, Jayant Prakash, Helena Thenot, Sri Vivek Vanga, Abigail Wagner, Brandon White, Cassie S Mitchell

Affiliations

  1. Laboratory for Pathology Dynamics, Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA.
  2. Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA.
  3. Computer Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA.
  4. Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332, USA.
  5. Institute for Machine Learning, Georgia Institute of Technology, Atlanta, GA 30332, USA.

PMID: 34073456 PMCID: PMC8230210 DOI: 10.3390/pharmaceutics13060794

Abstract

Link prediction in artificial intelligence is used to identify missing links or to derive future relationships that can occur in complex networks. A link prediction model was developed using the complex heterogeneous biomedical knowledge graph, SemNet, to predict missing links in biomedical literature for drug discovery. A web application visualized knowledge graph embeddings and link prediction results using TransE-, ComplEx-, and RotatE-based methods. The link prediction model achieved up to 0.44 hits@10 on the entity prediction tasks. The recent outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, served as a case study to demonstrate the efficacy of link prediction modeling for drug discovery. The link prediction algorithm guided identification and ranking of repurposed drug candidates for SARS-CoV-2, primarily by text mining biomedical literature on previous coronaviruses, including SARS and Middle East respiratory syndrome (MERS). Repurposed drugs included potential primary SARS-CoV-2 treatments, adjunctive therapies, and therapeutics to treat side effects. The link prediction accuracy for nodes ranked highly for SARS coronavirus was 0.875, as calculated by human-in-the-loop validation on existing COVID-19-specific data sets. Drug classes predicted as highly ranked include anti-inflammatories, nucleoside analogs, protease inhibitors, antimalarials, envelope proteins, and glycoproteins. Examples of highly ranked predicted links to SARS-CoV-2 include human leukocyte interferon, recombinant interferon-gamma, cyclosporine, antiviral therapy, zidovudine, chloroquine, vaccination, methotrexate, artemisinin, alkaloids, glycyrrhizic acid, quinine, flavonoids, amprenavir, suramin, complement system proteins, fluoroquinolones, bone marrow transplantation, albuterol, ciprofloxacin, quinolone antibacterial agents, and hydroxymethylglutaryl-CoA reductase inhibitors. Approximately 40% of identified drugs, such as edetic acid and biotin, were not previously connected to SARS. In summary, link prediction can effectively suggest repurposed drugs for emergent diseases.
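
To make the hits@10 metric concrete, the following is a minimal sketch of how a TransE-style score can rank candidate tail entities and how hits@k is then computed from those rankings. It is not the authors' SemNet implementation: the function names, the entity and relation names, and the two-dimensional toy embeddings are all hypothetical and chosen only for illustration.

```python
import math

def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between (head + relation) and tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def rank_tails(head, relation, candidates):
    """Return candidate names sorted from most to least plausible tail entity."""
    return sorted(
        candidates,
        key=lambda name: transe_score(head, relation, candidates[name]),
        reverse=True,
    )

def hits_at_k(ranked_lists, true_tails, k=10):
    """Fraction of queries whose true tail appears among the top k ranked candidates."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_tails) if truth in ranked[:k])
    return hits / len(true_tails)

# Hypothetical toy embeddings (assumption: invented values, not learned from SemNet).
disease = [0.0, 0.0]
treated_by = [1.0, 0.1]
drug_embeddings = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.0, 1.0],
    "drug_c": [3.0, 3.0],
}

ranking = rank_tails(disease, treated_by, drug_embeddings)

# Two toy queries that share the same ranking; the true tails are drug_a and drug_c.
score = hits_at_k([ranking, ranking], ["drug_a", "drug_c"], k=2)
print(ranking, score)
```

In this toy setup only one of the two true tails falls within the top two candidates, so hits@2 is 0.5. A reported hits@10 of 0.44 would, under the same definition, mean that the correct entity appeared among the ten highest-ranked candidates in 44% of test queries.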

Keywords: COVID-19; SARS-CoV-2; coronavirus; literature review; machine learning; natural language processing; repurposed drugs; text mining
