Genome Med. 2021 Aug 30;13(1):138. doi: 10.1186/s13073-021-00953-4.

GenTB: A user-friendly genome-based predictor for tuberculosis resistance powered by machine learning.

Genome medicine

Matthias I Gröschel, Martin Owens, Luca Freschi, Roger Vargas, Maximilian G Marin, Jody Phelan, Zamin Iqbal, Avika Dixit, Maha R Farhat

Affiliations

  1. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
  2. Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
  3. Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK.
  4. European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK.
  5. Division of Infectious Diseases, Boston Children's Hospital, Boston, MA, USA.
  6. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
  7. Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, USA.

PMID: 34461978 PMCID: PMC8407037 DOI: 10.1186/s13073-021-00953-4

Abstract

BACKGROUND: Multidrug-resistant Mycobacterium tuberculosis (Mtb) is a significant global public health threat. Genotypic resistance prediction from Mtb DNA sequences offers an alternative to laboratory-based drug-susceptibility testing. User-friendly and accurate resistance prediction tools are needed to enable public health and clinical practitioners to rapidly diagnose resistance and inform treatment regimens.

RESULTS: We present Translational Genomics platform for Tuberculosis (GenTB), a free and open web-based application to predict antibiotic resistance from next-generation sequence data. The user can choose between two predictors, a Random Forest (RF) classifier and a Wide and Deep Neural Network (WDNN), to predict phenotypic resistance to 13 and 10 anti-tuberculosis drugs, respectively. We benchmark GenTB's predictive performance along with leading TB resistance prediction tools (Mykrobe and TB-Profiler) using a ground truth dataset of 20,408 isolates with laboratory-based drug susceptibility data. All four tools reliably predicted resistance to first-line tuberculosis drugs but had varying performance for second-line drugs. The mean sensitivities for GenTB-RF and GenTB-WDNN across the nine shared drugs were 77.6% (95% CI 76.6-78.5%) and 75.4% (95% CI 74.5-76.4%), respectively, marginally higher than the sensitivities of TB-Profiler at 74.4% (95% CI 73.4-75.3%) and Mykrobe at 71.9% (95% CI 70.9-72.9%). The higher sensitivities came at the expense of ≤ 1.5% lower specificity: Mykrobe 97.6% (95% CI 97.5-97.7%), TB-Profiler 96.9% (95% CI 96.7-97.0%), GenTB-WDNN 96.2% (95% CI 96.0-96.4%), and GenTB-RF 96.1% (95% CI 96.0-96.3%). Averaged across the four tools, genotypic resistance sensitivity was 11% and 9% lower for isoniazid and rifampicin, respectively, on isolates sequenced at low depth (< 10× across 95% of the genome), emphasizing the need to quality control input sequence data before prediction. We discuss differences between tools in reporting results to the user, including variants underlying the resistance calls and any novel or indeterminate variants.

CONCLUSIONS: GenTB is an easy-to-use online tool to rapidly and accurately predict resistance to anti-tuberculosis drugs. GenTB can be accessed online at https://gentb.hms.harvard.edu, and the source code is available at https://github.com/farhat-lab/gentb-site.
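The sensitivity and specificity figures above compare each tool's genotypic call with the laboratory phenotype for the same isolate. The following is a minimal sketch of how such figures and a 95% confidence interval might be computed for one drug. The abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the function names and toy data are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(predicted, phenotype):
    """Compare predicted calls with phenotypes (True = resistant).

    Returns two (estimate, (ci_low, ci_high)) tuples: sensitivity, specificity.
    """
    pairs = list(zip(predicted, phenotype))
    tp = sum(1 for p, y in pairs if p and y)          # resistant, called resistant
    fn = sum(1 for p, y in pairs if not p and y)      # resistant, missed
    tn = sum(1 for p, y in pairs if not p and not y)  # susceptible, called susceptible
    fp = sum(1 for p, y in pairs if p and not y)      # susceptible, called resistant
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return (sens, wilson_interval(tp, tp + fn)), (spec, wilson_interval(tn, tn + fp))


# Toy data for a single hypothetical drug: six isolates.
predicted = [True, True, False, False, True, False]
phenotype = [True, True, True, False, False, False]
(sens, sens_ci), (spec, spec_ci) = sensitivity_specificity(predicted, phenotype)
print(f"sensitivity {sens:.3f} (95% CI {sens_ci[0]:.3f}-{sens_ci[1]:.3f})")
print(f"specificity {spec:.3f} (95% CI {spec_ci[0]:.3f}-{spec_ci[1]:.3f})")
```

A mean across drugs, as reported in the abstract, would then be an average of per-drug estimates or a pooled count over all drug-isolate pairs. The abstract does not say which, so that choice is left open here.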

© 2021. The Author(s).

Keywords: Diagnostics; Drug resistance; Drug-susceptibility testing; MDR-TB; Machine learning; Tuberculosis; Whole genome sequencing; XDR-TB
