J Acquir Immune Defic Syndr. 2021 Nov 22; doi: 10.1097/QAI.0000000000002869. Epub 2021 Nov 22.

Machine Learning Estimation of Low-Density Lipoprotein Cholesterol in Women with and without HIV.

Journal of acquired immune deficiency syndromes (1999)

Tony Dong, Mariam N Rana, Chris T Longenecker, Sanjay Rajagopalan, Chang H Kim, Sadeer G Al-Kindi

Affiliations

  1. Department of Internal Medicine, University Hospitals/Case Western Reserve University, Cleveland, OH; Division of Cardiovascular Medicine, University Hospitals/Case Western Reserve University, Cleveland, OH; Department of Medicine, MetroHealth Medical Center/Case Western Reserve University, Cleveland, OH.

PMID: 34813572 DOI: 10.1097/QAI.0000000000002869

Abstract

INTRODUCTION: Low-density lipoprotein cholesterol (LDL-C) is typically estimated from total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG). The Friedewald, Martin-Hopkins, and National Institutes of Health (NIH) equations are widely used but may estimate LDL-C inaccurately in certain patient populations, such as people with HIV. We sought to investigate the utility of machine learning for LDL-C estimation in a large cohort of women with and without HIV.
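
For orientation, the abstract names the estimating equations without stating them. The simplest, the Friedewald equation, is commonly written as LDL-C = TC − HDL-C − TG/5 with all values in mg/dL. The sketch below illustrates only that formula. The function name `friedewald_ldl` and the sample values are illustrative assumptions, not taken from the study, and the Martin-Hopkins and NIH equations are not reproduced here.

```python
def friedewald_ldl(tc, hdl, tg):
    """Estimate LDL-C (mg/dL) with the Friedewald equation.

    tc:  total cholesterol (mg/dL)
    hdl: HDL cholesterol (mg/dL)
    tg:  triglycerides (mg/dL)

    TG/5 approximates VLDL cholesterol; the equation is conventionally
    considered unreliable at high triglyceride levels.
    """
    return tc - hdl - tg / 5.0


# Hypothetical lipid panel, for illustration only.
print(friedewald_ldl(tc=200, hdl=50, tg=150))  # 120.0
```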

METHODS: We identified 7397 direct LDL-C measurements (5219 from participants with HIV, 2127 from uninfected controls, and 51 from seroconverters) in 2414 participants (age 39.4 ± 9.3 years) of the Women's Interagency HIV Study, and estimated LDL-C using the Friedewald, Martin-Hopkins, and NIH equations. We also optimized five machine learning methods (Linear Regression, Random Forest, Gradient Boosting, Support Vector Machine, and Neural Network) using 80% of the data (training set). We compared the performance of each method using root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) in the holdout set (the remaining 20%).
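
The evaluation described above (an 80/20 split followed by RMSE, MAE, and R² on the holdout set) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names, the random seed, and the toy numbers are all hypothetical, and the abstract does not say how the split was drawn.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle rows and return (training 80%, holdout 20%).

    The seed and the simple random shuffle are assumptions; the
    abstract does not describe the splitting procedure.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (mg/dL), not study data: measured vs. estimated LDL-C.
measured = [100.0, 120.0, 140.0]
estimated = [110.0, 120.0, 130.0]
print(rmse(measured, estimated), mae(measured, estimated), r2(measured, estimated))

train, holdout = split_80_20(range(10))
print(len(train), len(holdout))  # 8 2
```

Lower RMSE and MAE and higher R² indicate closer agreement between an estimate and the directly measured LDL-C.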

RESULTS: Support Vector Machine (SVM) outperformed all 3 existing equations and the other machine learning methods, achieving the lowest RMSE and MAE and the highest R² (11.79 mg/dL, 7.98 mg/dL, and 0.87, respectively, compared with 12.45 mg/dL, 9.14 mg/dL, and 0.87 for the Friedewald equation). SVM performance remained superior in subgroups with and without HIV, with non-fasting measurements, with LDL-C <70 mg/dL, and with TG >400 mg/dL.

CONCLUSIONS: In this proof-of-concept study, SVM is a robust method that predicts directly measured LDL-C more accurately than clinically used methods in women with and without HIV. Further studies should explore the utility in broader populations.

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.

Conflict of interest statement

Conflicts of interest: There are no potential conflicts (financial, professional, or personal) to disclose by any of the authors.
