JMIR Med Inform. 2017 Oct 31;5(4):e42. doi: 10.2196/medinform.8531.

Ranking Medical Terms to Support Expansion of Lay Language Resources for Patient Comprehension of Electronic Health Record Notes: Adapted Distant Supervision Approach.

JMIR medical informatics

Jinying Chen, Abhyuday N Jagannatha, Samah J Fodeh, Hong Yu

Affiliations

  1. Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, United States.
  2. School of Computer Science, University of Massachusetts, Amherst, MA, United States.
  3. Yale Center for Medical Informatics, Yale University, New Haven, CT, United States.
  4. Bedford Veterans Affairs Medical Center, Bedford, MA, United States.

PMID: 29089288 PMCID: PMC5686421 DOI: 10.2196/medinform.8531

Abstract

BACKGROUND: Medical terms are a major obstacle for patients trying to comprehend their electronic health record (EHR) notes. Clinical natural language processing (NLP) systems that link EHR terms to lay terms or definitions allow patients to easily access helpful information when reading through their EHR notes, and have been shown to improve patient EHR comprehension. However, high-quality lay language resources for EHR terms are very limited in the public domain. Because expanding and curating such a resource is a costly process, it is beneficial and even necessary to first identify the terms that are important for patient EHR comprehension.

OBJECTIVE: We aimed to develop an NLP system, called adapted distant supervision (ADS), to rank candidate terms mined from EHR corpora. EHR terms ranked high by ADS will be given higher priority for lay language annotation, that is, for the creation of lay definitions for these terms.

METHODS: Adapted distant supervision uses distant supervision from consumer health vocabulary and transfer learning to adapt itself to the problem of ranking EHR terms in the target domain. We investigated 2 state-of-the-art transfer learning algorithms (ie, feature space augmentation and supervised distant supervision) and designed 5 types of learning features for ADS, including distributed word representations learned from large EHR data. To evaluate ADS, we asked domain experts to annotate 6038 candidate terms as important or nonimportant for EHR comprehension. We then randomly divided these data into the target-domain training data (1000 examples) and the evaluation data (5038 examples). We compared ADS with 2 strong baselines, including standard supervised learning, on the evaluation data.
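
As an illustration of feature space augmentation, the sketch below assumes the commonly used scheme in which each feature vector is expanded into three blocks (shared, source-specific, target-specific), so that a single classifier can learn both domain-general and domain-specific weights. The abstract does not give the exact formulation used in ADS; the function name `augment` and the domain labels are hypothetical.

```python
from typing import List


def augment(features: List[float], domain: str) -> List[float]:
    """Expand a feature vector into <shared, source-only, target-only> blocks.

    Assumed scheme (not confirmed by the abstract): source-domain examples
    fill the shared and source blocks; target-domain examples fill the
    shared and target blocks. The unused block is zero-padded.
    """
    shared = list(features)
    zeros = [0.0] * len(features)
    if domain == "source":
        return shared + list(features) + zeros
    if domain == "target":
        return shared + zeros + list(features)
    raise ValueError("domain must be 'source' or 'target'")


# A source-domain example (e.g. a term labeled via the consumer health
# vocabulary) and a target-domain example (an expert-annotated EHR term).
source_vec = augment([1.0, 2.0], "source")
target_vec = augment([1.0, 2.0], "target")
```

Under this assumption, the augmented source and target examples can be pooled and passed to any standard supervised learner.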

RESULTS: The ADS system using feature space augmentation achieved the best average precision, 0.850, on the evaluation set when using 1000 target-domain training examples. The ADS system using supervised distant supervision achieved the best average precision, 0.819, on the evaluation set when using only 100 target-domain training examples. Both ADS systems performed significantly better than the baseline systems (P<.001 for all measures and all conditions). Using a rich set of learning features contributed substantially to ADS's performance.
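
For readers unfamiliar with the metric, the following is a minimal sketch of average precision for a ranked list, assuming the standard non-interpolated definition (the mean of precision values at the rank of each important term). The abstract does not state the exact variant used; the function name and toy data are hypothetical.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated average precision for one ranked list.

    scores: ranking scores, higher means ranked earlier.
    labels: 1 for an important term, 0 for a nonimportant term.
    Ties in score keep their input order (Python's sort is stable).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


# Toy example: important terms land at ranks 1 and 3,
# so AP = (1/1 + 2/3) / 2.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

A value of 1.0 means every important term is ranked above every nonimportant one.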

CONCLUSIONS: ADS can effectively rank terms mined from EHRs. Transfer learning improved ADS's performance even with a small number of target-domain training examples. EHR terms prioritized by ADS were used to expand a lay language resource that supports patient EHR comprehension. The top 10,000 EHR terms ranked by ADS are available upon request.

©Jinying Chen, Abhyuday N Jagannatha, Samah J Fodeh, Hong Yu. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 31.10.2017.

Keywords: electronic health records; information extraction; lexical entry selection; natural language processing; transfer learning
