JMIR Res Protoc. 2017 Aug 29;6(8):e175. doi: 10.2196/resprot.7757.
Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods.
JMIR research protocols
Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy
Affiliations
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States.
- Department of Pediatrics, University of Utah, Salt Lake City, UT, United States.
- Division of Neonatology, Department of Pediatrics, University of Washington, Seattle, WA, United States.
- Department of Computer Science and Engineering, University of Washington, Seattle, WA, United States.
- Homer Warner Research Center, Intermountain Healthcare, Murray, UT, United States.
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.
PMID: 28851678
PMCID: PMC5596298
DOI: 10.2196/resprot.7757
Abstract
BACKGROUND: To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (also known as clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack the machine learning expertise needed to use clinical big data directly, which creates a hurdle in realizing value from their data. Health care researchers can work with data scientists who have deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a US shortage of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient's weight kept rising in the past year). This process becomes infeasible with limited budgets.
OBJECTIVE: This study's goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data.
METHODS: In this study, we will do the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes.
RESULTS: We are currently writing Auto-ML's design document. We intend to finish our study by around the year 2022.
CONCLUSIONS: Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes.
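As an illustration of the temporal aggregation choice mentioned in the Background (an operator applied over a period, such as whether a patient's weight kept rising in the past year), the following is a minimal sketch. It is not Auto-ML's implementation: the function names, the operator set, and the sample weights are all hypothetical and chosen only to make the operator/period pairing concrete.

```python
from datetime import date, timedelta


def kept_rising(values):
    """True if there are at least two values and each exceeds the previous one."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))


# Hypothetical operator set for illustration; the abstract does not list
# the operators that Auto-ML actually searches over.
OPERATORS = {
    "max": max,
    "mean": lambda v: sum(v) / len(v),
    "kept_rising": kept_rising,
}


def aggregate(records, as_of, period_days, operator):
    """Apply `operator` to values recorded in the `period_days` before `as_of`.

    records: list of (date, value) pairs for one patient and one attribute.
    Returns None when the window contains no records.
    """
    start = as_of - timedelta(days=period_days)
    # Sort chronologically, then keep values inside the window (start, as_of].
    window = [v for d, v in sorted(records) if start < d <= as_of]
    if not window:
        return None
    return OPERATORS[operator](window)


# Made-up weights (kg) for a single hypothetical patient.
weights_kg = [
    (date(2016, 1, 10), 80.0),
    (date(2016, 9, 1), 78.5),
    (date(2017, 1, 15), 79.0),
    (date(2017, 6, 20), 81.2),
]
as_of = date(2017, 8, 1)

rising_past_year = aggregate(weights_kg, as_of, 365, "kept_rising")
max_past_90_days = aggregate(weights_kg, as_of, 90, "max")
```

Changing either the operator or the period yields a different candidate feature from the same raw attribute; with the sample data, the "kept rising" answer differs between a one-year and a two-year window. Enumerating such combinations by hand is the kind of manual iteration the Background describes.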
©Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.08.2017.
Keywords: automated temporal aggregation; automatic model selection; care management; clinical big data; machine learning