Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients with Heart Failure

Matthew W. Segar; Jennifer L. Hall; Pardeep S. Jhund; Tiffany M. Powell-Wiley; Alanna A. Morris; David Kao; Gregg C. Fonarow; Rosalba Hernandez; Nasrien E. Ibrahim; Christine Rutan; Ann Marie Navar; Laura M. Stevens; Ambarish Pandey

doi:10.1001/jamacardio.2022.1900

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Importance: Traditional models for predicting in-hospital mortality for patients with heart failure (HF) have used logistic regression and do not account for social determinants of health (SDOH).

Objective: To develop and validate novel machine learning (ML) models for HF mortality that incorporate SDOH.

Design, Setting, and Participants: This retrospective study used data from the Get With The Guidelines-Heart Failure (GWTG-HF) registry to identify HF hospitalizations between January 1, 2010, and December 31, 2020. The study included patients with acute decompensated HF who were hospitalized at GWTG-HF participating centers during the study period. Data analysis was performed January 6, 2021, to April 26, 2022. External validation was performed in the hospitalization cohort from the Atherosclerosis Risk in Communities (ARIC) study between 2005 and 2014.

Main Outcomes and Measures: Random forest-based ML approaches were used to develop race-specific and race-agnostic models for predicting in-hospital mortality. Performance was assessed using C index (discrimination), regression slopes for observed vs predicted mortality rates (calibration), and decision curves for prognostic utility.

Results: The training data set included 123634 hospitalized patients with HF who were enrolled in the GWTG-HF registry (mean [SD] age, 71 [13] years; 58356 [47.2%] female individuals; 65278 [52.8%] male individuals). Patients were analyzed in 2 categories: Black (23453 [19.0%]) and non-Black (2121 [2.1%] Asian, 91154 [91.0%] White, and 6906 [6.9%] other race and ethnicity). The ML models demonstrated excellent performance in the internal testing subset (n = 82420; C statistic, 0.81 for Black patients and 0.82 for non-Black patients) and in the real-world-like cohort with less than 50% missingness on covariates (n = 553506; C statistic, 0.74 for Black patients and 0.75 for non-Black patients). In the external validation cohort (ARIC registry; n = 1205 Black patients and 2264 non-Black patients), ML models demonstrated high discrimination and adequate calibration (C statistic, 0.79 and 0.80, respectively). Furthermore, the performance of the ML models was superior to the traditional GWTG-HF risk score model (C index, 0.69 for both race groups) and other rederived logistic regression models using race as a covariate. The performance of the ML models was identical using the race-specific and race-agnostic approaches in the GWTG-HF and external validation cohorts. In the GWTG-HF cohort, the addition of zip code-level SDOH parameters to the ML model with clinical covariates only was associated with better discrimination, prognostic utility (assessed using decision curves), and model reclassification metrics in Black patients (net reclassification improvement, 0.22 [95% CI, 0.14-0.30]; P < .001) but not in non-Black patients.

Conclusions and Relevance: ML models for HF mortality demonstrated superior performance to the traditional and rederived logistic regression models using race as a covariate. The addition of SDOH parameters improved the prognostic utility of prediction models in Black patients but not non-Black patients in the GWTG-HF registry.
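The abstract names three kinds of performance measures: discrimination (C statistic), prognostic utility (decision curves), and reclassification (net reclassification improvement). The following is a minimal illustrative sketch of how such measures can be computed, not the authors' code. The toy outcome vector and the two probability vectors (`clinical`, `with_sdoh`) are invented for illustration, and the category-free form of the NRI is an assumption, since the abstract does not say which NRI variant was used.

```python
from itertools import product

def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as half a pair."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0
                for a, b in product(pos, neg))
    return score / (len(pos) * len(neg))

def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one threshold probability; a decision curve plots
    this value across a range of thresholds."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def continuous_nri(y_true, p_old, p_new):
    """Category-free NRI (assumed variant): net upward movement of
    predicted risk among events plus net downward movement among
    non-events. Assumes both outcome classes are present."""
    def net_up(label):
        idx = [i for i, y in enumerate(y_true) if y == label]
        up = sum(1 for i in idx if p_new[i] > p_old[i])
        down = sum(1 for i in idx if p_new[i] < p_old[i])
        return (up - down) / len(idx)
    return net_up(1) - net_up(0)

# Hypothetical toy data: 1 = in-hospital death, 0 = survived to discharge.
y = [1, 1, 0, 0, 0, 1, 0, 0]
clinical = [0.30, 0.10, 0.20, 0.05, 0.15, 0.25, 0.10, 0.02]
with_sdoh = [0.40, 0.20, 0.15, 0.05, 0.10, 0.35, 0.12, 0.02]

if __name__ == "__main__":
    print("C statistic, clinical only:", c_statistic(y, clinical))
    print("C statistic, clinical + SDOH:", c_statistic(y, with_sdoh))
    print("Net benefit at 20%, clinical only:", net_benefit(y, clinical, 0.2))
    print("Net benefit at 20%, clinical + SDOH:", net_benefit(y, with_sdoh, 0.2))
    print("Continuous NRI:", continuous_nri(y, clinical, with_sdoh))
```

On real registry data the pairwise loop in `c_statistic` would be too slow for cohorts of this size, and a rank-based implementation from an established statistics library would be the more practical choice.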

Original language	English (US)
Pages (from-to)	844-854
Number of pages	11
Journal	JAMA Cardiology
Volume	7
Issue number	8
DOIs	https://doi.org/10.1001/jamacardio.2022.1900
State	Published - Aug 2022

ASJC Scopus subject areas

Cardiology and Cardiovascular Medicine

Cite this

Segar, M. W., Hall, J. L., Jhund, P. S., Powell-Wiley, T. M., Morris, A. A., Kao, D., Fonarow, G. C., Hernandez, R., Ibrahim, N. E., Rutan, C., Navar, A. M., Stevens, L. M., & Pandey, A. (2022). Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients with Heart Failure. JAMA Cardiology, 7(8), 844-854. https://doi.org/10.1001/jamacardio.2022.1900

Segar, MW, Hall, JL, Jhund, PS, Powell-Wiley, TM, Morris, AA, Kao, D, Fonarow, GC, Hernandez, R, Ibrahim, NE, Rutan, C, Navar, AM, Stevens, LM & Pandey, A 2022, 'Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients with Heart Failure', JAMA Cardiology, vol. 7, no. 8, pp. 844-854. https://doi.org/10.1001/jamacardio.2022.1900

@article{2553370b64c3496c8384fc03db68c813,

title = "Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients with Heart Failure",

author = "Segar, {Matthew W.} and Hall, {Jennifer L.} and Jhund, {Pardeep S.} and Powell-Wiley, {Tiffany M.} and Morris, {Alanna A.} and David Kao and Fonarow, {Gregg C.} and Rosalba Hernandez and Ibrahim, {Nasrien E.} and Christine Rutan and Navar, {Ann Marie} and Stevens, {Laura M.} and Ambarish Pandey",

year = "2022",

month = aug,

doi = "10.1001/jamacardio.2022.1900",

language = "English (US)",

volume = "7",

pages = "844--854",

journal = "JAMA Cardiology",

issn = "2380-6583",

publisher = "American Medical Association",

number = "8",

}

TY - JOUR

T1 - Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients with Heart Failure

AU - Segar, Matthew W.

AU - Hall, Jennifer L.

AU - Jhund, Pardeep S.

AU - Powell-Wiley, Tiffany M.

AU - Morris, Alanna A.

AU - Kao, David

AU - Fonarow, Gregg C.

AU - Hernandez, Rosalba

AU - Ibrahim, Nasrien E.

AU - Rutan, Christine

AU - Navar, Ann Marie

AU - Stevens, Laura M.

AU - Pandey, Ambarish

PY - 2022/8

Y1 - 2022/8

UR - http://www.scopus.com/inward/record.url?scp=85133972554&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85133972554&partnerID=8YFLogxK

U2 - 10.1001/jamacardio.2022.1900

DO - 10.1001/jamacardio.2022.1900

M3 - Article

C2 - 35793094

AN - SCOPUS:85133972554

SN - 2380-6583

VL - 7

SP - 844

EP - 854

JO - JAMA Cardiology

JF - JAMA Cardiology

IS - 8

ER -
