JMIR Res Protoc. 2017 Aug 29;6(8):e175. doi: 10.2196/resprot.7757.

Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods.

JMIR research protocols

Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy

Affiliations

  1. Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States.
  2. Department of Pediatrics, University of Utah, Salt Lake City, UT, United States.
  3. Division of Neonatology, Department of Pediatrics, University of Washington, Seattle, WA, United States.
  4. Department of Computer Science and Engineering, University of Washington, Seattle, WA, United States.
  5. Homer Warner Research Center, Intermountain Healthcare, Murray, UT, United States.
  6. Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

PMID: 28851678 PMCID: PMC5596298 DOI: 10.2196/resprot.7757

Abstract

BACKGROUND: To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack machine learning expertise to directly use clinical big data, creating a hurdle in realizing value from their data. Health care researchers can work with data scientists with deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a shortage in the United States of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient's weight kept rising in the past year). This process becomes infeasible with limited budgets.
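To make the second kind of choice concrete, the following is a minimal sketch of temporal aggregation: one clinical attribute is summarized by an (operator, period) pair, such as "kept rising" over the past year. It is not taken from Auto-ML. The function names, the sample weights, and the strict-increase definition of "kept rising" are all assumptions made for illustration.

```python
from datetime import date, timedelta


def values_in_period(measurements, as_of, period_days):
    """Return values recorded in the look-back period, oldest first.

    measurements: iterable of (date, value) pairs in any order.
    """
    start = as_of - timedelta(days=period_days)
    in_window = sorted((d, v) for d, v in measurements if start <= d <= as_of)
    return [v for _, v in in_window]


def kept_rising(values):
    """Hypothetical operator: True if each value exceeds the one before it."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))


def aggregate(measurements, as_of, operator, period_days):
    """Apply one (operator, period) pair to a single patient's measurements."""
    # Note: operators such as max() raise on an empty period; a real
    # pipeline would need a rule for patients with no data in the period.
    return operator(values_in_period(measurements, as_of, period_days))


# Made-up weights (kg) for one illustrative patient.
weights_kg = [
    (date(2016, 9, 1), 80.0),
    (date(2017, 1, 15), 82.5),
    (date(2017, 6, 1), 84.0),
    (date(2015, 3, 1), 90.0),
]
as_of = date(2017, 8, 29)

# The same attribute yields different features under different choices.
rising_past_year = aggregate(weights_kg, as_of, kept_rising, 365)
max_past_year = aggregate(weights_kg, as_of, max, 365)
rising_past_3_years = aggregate(weights_kg, as_of, kept_rising, 3 * 365)
```

Each extra operator or period multiplies the number of candidate features, which is one reason that choosing them by hand takes many iterations.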

OBJECTIVE: This study's goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data.

METHODS: This study will allow us to achieve the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes.

RESULTS: We are currently writing Auto-ML's design document. We intend to finish our study by around the year 2022.

CONCLUSIONS: Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes.

©Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.08.2017.

Keywords: automated temporal aggregation; automatic model selection; care management; clinical big data; machine learning

Publication Types