Development and Validation of Machine Learning-Based Race-Specific Models to Predict 10-Year Risk of Heart Failure: A Multicohort Analysis

Matthew W. Segar; Byron C. Jaeger; Kershaw V. Patel; Vijay Nambi; Chiadi E. Ndumele; Adolfo Correa; Javed Butler; Alvin Chandra; Colby Ayers; Shreya Rao; Alana A. Lewis; Laura M. Raffield; Carlos J. Rodriguez; Erin D. Michos; Christie M. Ballantyne; Michael E. Hall; Robert J. Mentz; James A. De Lemos; Ambarish Pandey

doi:10.1161/CIRCULATIONAHA.120.053134

Research output: Contribution to journal › Article › peer-review

51 Scopus citations

Abstract

Background: Heart failure (HF) risk and the underlying risk factors vary by race. Traditional models for HF risk prediction treat race as a covariate in risk prediction and do not account for significant parameters such as cardiac biomarkers. Machine learning (ML) may offer advantages over traditional modeling techniques to develop race-specific HF risk prediction models and to elucidate important contributors of HF development across races.

Methods: We performed a retrospective analysis of 4 large, community cohort studies (ARIC [Atherosclerosis Risk in Communities], DHS [Dallas Heart Study], JHS [Jackson Heart Study], and MESA [Multi-Ethnic Study of Atherosclerosis]) with adjudicated HF events. The study included participants who were >40 years of age and free of HF at baseline. Race-specific ML models for HF risk prediction were developed in the JHS cohort (for Black race-specific model) and White adults from ARIC (for White race-specific model). The models included 39 candidate variables across demographic, anthropometric, medical history, laboratory, and electrocardiographic domains. The ML models were externally validated and compared with prior established traditional and non-race-specific ML models in race-specific subgroups of the pooled MESA/DHS cohort and Black participants of ARIC. The Harrell C-index and Greenwood-Nam-D'Agostino χ² tests were used to assess discrimination and calibration, respectively.

Results: The ML models had excellent discrimination in the derivation cohorts for Black (n=4141 in JHS, C-index=0.88) and White (n=7858 in ARIC, C-index=0.89) participants. In the external validation cohorts, the race-specific ML model demonstrated adequate calibration and superior discrimination (Black individuals, C-index=0.80-0.83; White individuals, C-index=0.82) compared with established HF risk models or with non-race-specific ML models derived with race included as a covariate. Among the risk factors, natriuretic peptide levels were the most important predictor of HF risk across both races, followed by troponin levels in Black and ECG-based Cornell voltage in White individuals. Other key predictors of HF risk among Black individuals were glycemic parameters and socioeconomic factors. In contrast, prevalent cardiovascular disease and traditional cardiovascular risk factors were stronger predictors of HF risk in White adults.

Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance compared with traditional HF risk and non-race-specific ML models. This approach identifies distinct race-specific contributors of HF.
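
The abstract reports discrimination as a Harrell C-index. As a rough illustration of what that statistic measures, the following is a minimal sketch in plain Python; it is not the authors' code, and the function name `harrell_c_index`, its arguments, and the toy data are all hypothetical. It assumes right-censored follow-up data and, as a simplification, skips pairs with tied follow-up times.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell C-index (hypothetical helper, not from the paper).

    times       -- observed follow-up time per participant
    events      -- 1 if the HF event was observed, 0 if censored
    risk_scores -- predicted risk; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the participant with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only)
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [0.9, 0.1, 0.5, 0.7]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of comparable pairs, which is the scale on which the reported values (0.80 to 0.89) should be read.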

Original language	English (US)
Pages (from-to)	2370-2383
Number of pages	14
Journal	Circulation
Volume	143
Issue number	24
DOIs	https://doi.org/10.1161/CIRCULATIONAHA.120.053134
State	Published - Jun 15 2021

Keywords

epidemiology
heart failure
machine learning
risk

ASJC Scopus subject areas

Cardiology and Cardiovascular Medicine
Physiology (medical)

Access to Document

10.1161/CIRCULATIONAHA.120.053134

Cite this

Segar, M. W., Jaeger, B. C., Patel, K. V., Nambi, V., Ndumele, C. E., Correa, A., Butler, J., Chandra, A., Ayers, C., Rao, S., Lewis, A. A., Raffield, L. M., Rodriguez, C. J., Michos, E. D., Ballantyne, C. M., Hall, M. E., Mentz, R. J., De Lemos, J. A., & Pandey, A. (2021). Development and Validation of Machine Learning-Based Race-Specific Models to Predict 10-Year Risk of Heart Failure: A Multicohort Analysis. Circulation, 143(24), 2370-2383. https://doi.org/10.1161/CIRCULATIONAHA.120.053134

Segar, MW, Jaeger, BC, Patel, KV, Nambi, V, Ndumele, CE, Correa, A, Butler, J, Chandra, A, Ayers, C, Rao, S, Lewis, AA, Raffield, LM, Rodriguez, CJ, Michos, ED, Ballantyne, CM, Hall, ME, Mentz, RJ, De Lemos, JA & Pandey, A 2021, 'Development and Validation of Machine Learning-Based Race-Specific Models to Predict 10-Year Risk of Heart Failure: A Multicohort Analysis', Circulation, vol. 143, no. 24, pp. 2370-2383. https://doi.org/10.1161/CIRCULATIONAHA.120.053134

@article{ac21bd3741744c01a465d32e1e15f6d4,
title = "Development and Validation of Machine Learning-Based Race-Specific Models to Predict 10-Year Risk of Heart Failure: A Multicohort Analysis",
abstract = "Background: Heart failure (HF) risk and the underlying risk factors vary by race. Traditional models for HF risk prediction treat race as a covariate in risk prediction and do not account for significant parameters such as cardiac biomarkers. Machine learning (ML) may offer advantages over traditional modeling techniques to develop race-specific HF risk prediction models and to elucidate important contributors of HF development across races. Methods: We performed a retrospective analysis of 4 large, community cohort studies (ARIC [Atherosclerosis Risk in Communities], DHS [Dallas Heart Study], JHS [Jackson Heart Study], and MESA [Multi-Ethnic Study of Atherosclerosis]) with adjudicated HF events. The study included participants who were >40 years of age and free of HF at baseline. Race-specific ML models for HF risk prediction were developed in the JHS cohort (for Black race-specific model) and White adults from ARIC (for White race-specific model). The models included 39 candidate variables across demographic, anthropometric, medical history, laboratory, and electrocardiographic domains. The ML models were externally validated and compared with prior established traditional and non-race-specific ML models in race-specific subgroups of the pooled MESA/DHS cohort and Black participants of ARIC. The Harrell C-index and Greenwood-Nam-D'Agostino χ2 tests were used to assess discrimination and calibration, respectively. Results: The ML models had excellent discrimination in the derivation cohorts for Black (n=4141 in JHS, C-index=0.88) and White (n=7858 in ARIC, C-index=0.89) participants. In the external validation cohorts, the race-specific ML model demonstrated adequate calibration and superior discrimination (Black individuals, C-index=0.80-0.83; White individuals, C-index=0.82) compared with established HF risk models or with non-race-specific ML models derived with race included as a covariate. 
Among the risk factors, natriuretic peptide levels were the most important predictor of HF risk across both races, followed by troponin levels in Black and ECG-based Cornell voltage in White individuals. Other key predictors of HF risk among Black individuals were glycemic parameters and socioeconomic factors. In contrast, prevalent cardiovascular disease and traditional cardiovascular risk factors were stronger predictors of HF risk in White adults. Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance compared with traditional HF risk and non-race-specific ML models. This approach identifies distinct race-specific contributors of HF.",

keywords = "epidemiology, heart failure, machine learning, risk",
author = "Segar, {Matthew W.} and Jaeger, {Byron C.} and Patel, {Kershaw V.} and Vijay Nambi and Ndumele, {Chiadi E.} and Adolfo Correa and Javed Butler and Alvin Chandra and Colby Ayers and Shreya Rao and Lewis, {Alana A.} and Raffield, {Laura M.} and Rodriguez, {Carlos J.} and Michos, {Erin D.} and Ballantyne, {Christie M.} and Hall, {Michael E.} and Mentz, {Robert J.} and {De Lemos}, {James A.} and Ambarish Pandey",
note = "Publisher Copyright: {\textcopyright} 2020 American Heart Association, Inc.",
year = "2021",
month = jun,
day = "15",
doi = "10.1161/CIRCULATIONAHA.120.053134",
language = "English (US)",
volume = "143",
pages = "2370--2383",
journal = "Circulation",
issn = "0009-7322",
publisher = "Lippincott Williams and Wilkins",
number = "24",
}

TY - JOUR
T1 - Development and Validation of Machine Learning-Based Race-Specific Models to Predict 10-Year Risk of Heart Failure
T2 - A Multicohort Analysis
AU - Segar, Matthew W.
AU - Jaeger, Byron C.
AU - Patel, Kershaw V.
AU - Nambi, Vijay
AU - Ndumele, Chiadi E.
AU - Correa, Adolfo
AU - Butler, Javed
AU - Chandra, Alvin
AU - Ayers, Colby
AU - Rao, Shreya
AU - Lewis, Alana A.
AU - Raffield, Laura M.
AU - Rodriguez, Carlos J.
AU - Michos, Erin D.
AU - Ballantyne, Christie M.
AU - Hall, Michael E.
AU - Mentz, Robert J.
AU - De Lemos, James A.
AU - Pandey, Ambarish
PY - 2021/6/15
Y1 - 2021/6/15
N2 - Background: Heart failure (HF) risk and the underlying risk factors vary by race. Traditional models for HF risk prediction treat race as a covariate in risk prediction and do not account for significant parameters such as cardiac biomarkers. Machine learning (ML) may offer advantages over traditional modeling techniques to develop race-specific HF risk prediction models and to elucidate important contributors of HF development across races. Methods: We performed a retrospective analysis of 4 large, community cohort studies (ARIC [Atherosclerosis Risk in Communities], DHS [Dallas Heart Study], JHS [Jackson Heart Study], and MESA [Multi-Ethnic Study of Atherosclerosis]) with adjudicated HF events. The study included participants who were >40 years of age and free of HF at baseline. Race-specific ML models for HF risk prediction were developed in the JHS cohort (for Black race-specific model) and White adults from ARIC (for White race-specific model). The models included 39 candidate variables across demographic, anthropometric, medical history, laboratory, and electrocardiographic domains. The ML models were externally validated and compared with prior established traditional and non-race-specific ML models in race-specific subgroups of the pooled MESA/DHS cohort and Black participants of ARIC. The Harrell C-index and Greenwood-Nam-D'Agostino χ2 tests were used to assess discrimination and calibration, respectively. Results: The ML models had excellent discrimination in the derivation cohorts for Black (n=4141 in JHS, C-index=0.88) and White (n=7858 in ARIC, C-index=0.89) participants. In the external validation cohorts, the race-specific ML model demonstrated adequate calibration and superior discrimination (Black individuals, C-index=0.80-0.83; White individuals, C-index=0.82) compared with established HF risk models or with non-race-specific ML models derived with race included as a covariate. 
Among the risk factors, natriuretic peptide levels were the most important predictor of HF risk across both races, followed by troponin levels in Black and ECG-based Cornell voltage in White individuals. Other key predictors of HF risk among Black individuals were glycemic parameters and socioeconomic factors. In contrast, prevalent cardiovascular disease and traditional cardiovascular risk factors were stronger predictors of HF risk in White adults. Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance compared with traditional HF risk and non-race-specific ML models. This approach identifies distinct race-specific contributors of HF.

AB - Background: Heart failure (HF) risk and the underlying risk factors vary by race. Traditional models for HF risk prediction treat race as a covariate in risk prediction and do not account for significant parameters such as cardiac biomarkers. Machine learning (ML) may offer advantages over traditional modeling techniques to develop race-specific HF risk prediction models and to elucidate important contributors of HF development across races. Methods: We performed a retrospective analysis of 4 large, community cohort studies (ARIC [Atherosclerosis Risk in Communities], DHS [Dallas Heart Study], JHS [Jackson Heart Study], and MESA [Multi-Ethnic Study of Atherosclerosis]) with adjudicated HF events. The study included participants who were >40 years of age and free of HF at baseline. Race-specific ML models for HF risk prediction were developed in the JHS cohort (for Black race-specific model) and White adults from ARIC (for White race-specific model). The models included 39 candidate variables across demographic, anthropometric, medical history, laboratory, and electrocardiographic domains. The ML models were externally validated and compared with prior established traditional and non-race-specific ML models in race-specific subgroups of the pooled MESA/DHS cohort and Black participants of ARIC. The Harrell C-index and Greenwood-Nam-D'Agostino χ2 tests were used to assess discrimination and calibration, respectively. Results: The ML models had excellent discrimination in the derivation cohorts for Black (n=4141 in JHS, C-index=0.88) and White (n=7858 in ARIC, C-index=0.89) participants. In the external validation cohorts, the race-specific ML model demonstrated adequate calibration and superior discrimination (Black individuals, C-index=0.80-0.83; White individuals, C-index=0.82) compared with established HF risk models or with non-race-specific ML models derived with race included as a covariate. 
Among the risk factors, natriuretic peptide levels were the most important predictor of HF risk across both races, followed by troponin levels in Black and ECG-based Cornell voltage in White individuals. Other key predictors of HF risk among Black individuals were glycemic parameters and socioeconomic factors. In contrast, prevalent cardiovascular disease and traditional cardiovascular risk factors were stronger predictors of HF risk in White adults. Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance compared with traditional HF risk and non-race-specific ML models. This approach identifies distinct race-specific contributors of HF.

KW - epidemiology
KW - heart failure
KW - machine learning
KW - risk
UR - http://www.scopus.com/inward/record.url?scp=85108107154&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108107154&partnerID=8YFLogxK
U2 - 10.1161/CIRCULATIONAHA.120.053134
DO - 10.1161/CIRCULATIONAHA.120.053134
M3 - Article
C2 - 33845593
AN - SCOPUS:85108107154
SN - 0009-7322
VL - 143
SP - 2370
EP - 2383
JO - Circulation
JF - Circulation
IS - 24
ER -
