Evaluation of an Algorithm for Identifying Ocular Conditions in Electronic Health Record Data

Joshua D. Stein; Moshiur Rahman; Chris Andrews; Joshua R. Ehrlich; Shivani Kamat; Manjool Shah; Erin A. Boese; Maria A. Woodward; Jeff Cowall; Edward H. Trager; Prabha Narayanaswamy; David A. Hanauer

doi:10.1001/jamaophthalmol.2018.7051

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Importance: For research involving big data, researchers must accurately identify patients with ocular diseases or phenotypes of interest. Reliance on administrative billing codes alone for this purpose is limiting.

Objective: To develop a method to accurately identify the presence or absence of ocular conditions of interest using electronic health record (EHR) data.

Design, Setting, and Participants: This study is a retrospective analysis of the EHR data of patients (n = 122 339) in the Sight Outcomes Research Collaborative Ophthalmology Data Repository who received eye care at participating academic medical centers between August 1, 2012, and August 31, 2017. An algorithm that searches structured and unstructured (free-text) EHR data for conditions of interest was developed and then tested to determine how well it could detect the presence or absence of exfoliation syndrome (XFS). The algorithm was trained to search for evidence of XFS among a sample of patients with and without XFS (n = 200) by reviewing International Classification of Diseases, Ninth Revision or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-9 or ICD-10) billing codes, the patient's problem list, and text within the ocular examination section and unstructured (free-text) data in the EHR. The likelihood that each patient had XFS was estimated using logistic least absolute shrinkage and selection operator (LASSO) regression. The EHR data of all patients were run through the algorithm to generate an XFS probability score for each patient. The algorithm was validated with review of EHRs by glaucoma specialists.

Main Outcomes and Measures: Positive predictive value (PPV) and negative predictive value (NPV) of the algorithm were computed as the proportion of patients correctly classified with XFS or without XFS.

Results: This study included 122 339 patients, with a mean (SD) age of 52.4 (25.1) years. Of these patients, 69 002 (56.4%) were female and 99 579 (81.4%) were white. The algorithm assigned a less than 10% probability of XFS for 121 085 patients (99.0%) as well as an XFS probability score of more than 75% for 543 patients (0.4%), more than 90% for 353 patients (0.3%), and more than 99% for 83 patients (0.07%). Validated by glaucoma specialists, the algorithm had a PPV of 95.0% (95% CI, 89.5%-97.7%) and an NPV of 100% (95% CI, 91.2%-100%). When there was ICD-9 or ICD-10 billing code documentation of XFS, in 86% or 96% of the records, respectively, evidence of XFS was also recorded elsewhere in the EHR. Conversely, when there was clinical examination or free-text evidence of XFS, it was documented with ICD-9 codes only approximately 40% of the time and even less often with ICD-10 codes.

Conclusions and Relevance: The algorithm developed, tested, and validated in this study appears to be better at identifying the presence or absence of XFS in EHR data than the conventional approach of assessing only billing codes; such an algorithm may enhance the ability of investigators to use EHR data to study patients with ocular diseases.
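The outcome measures described in the abstract (PPV and NPV as proportions of correctly classified patients, each with a 95% CI) can be illustrated with a minimal Python sketch. The counts below are hypothetical and chosen only for illustration; the abstract does not report the underlying validation counts, and it does not state which interval method the authors used. The Wilson score interval is assumed here as one common choice, and the function names are illustrative.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96).

    Assumption: the abstract does not name its CI method; Wilson is used
    here only as a common, well-behaved choice near 0% and 100%.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


def predictive_values(tp, fp, tn, fn):
    """Return PPV and NPV, each paired with its Wilson interval.

    tp/fp: algorithm-positive patients confirmed / not confirmed on chart review.
    tn/fn: algorithm-negative patients confirmed negative / found positive.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "ppv": (ppv, wilson_interval(tp, tp + fp)),
        "npv": (npv, wilson_interval(tn, tn + fn)),
    }


# Hypothetical chart-review counts (NOT taken from the abstract).
result = predictive_values(tp=114, fp=6, tn=40, fn=0)

for name, (value, (low, high)) in result.items():
    print(f"{name.upper()}: {value:.1%} (95% CI, {low:.1%}-{high:.1%})")
```

Note that an observed NPV of 100% still carries a lower confidence bound below 100%, because the bound depends on how many algorithm-negative records were reviewed.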

Original language	English (US)
Pages (from-to)	491-497
Number of pages	7
Journal	JAMA Ophthalmology
Volume	137
Issue number	5
DOIs	https://doi.org/10.1001/jamaophthalmol.2018.7051
State	Published - May 2019
Externally published	Yes

ASJC Scopus subject areas

Ophthalmology

Cite this

@article{50a7f05cd53b480586fabc4de7c77ceb,

title = "Evaluation of an Algorithm for Identifying Ocular Conditions in Electronic Health Record Data",

abstract = "Importance: For research involving big data, researchers must accurately identify patients with ocular diseases or phenotypes of interest. Reliance on administrative billing codes alone for this purpose is limiting. Objective: To develop a method to accurately identify the presence or absence of ocular conditions of interest using electronic health record (EHR) data. Design, Setting, and Participants: This study is a retrospective analysis of the EHR data of patients (n = 122339) in the Sight Outcomes Research Collaborative Ophthalmology Data Repository who received eye care at participating academic medical centers between August 1, 2012, and August 31, 2017. An algorithm that searches structured and unstructured (free-text) EHR data for conditions of interest was developed and then tested to determine how well it could detect the presence or absence of exfoliation syndrome (XFS). The algorithm was trained to search for evidence of XFS among a sample of patients with and without XFS (n = 200) by reviewing International Classification of Diseases, Ninth Revision or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-9 or ICD-10) billing codes, the patient's problem list, and text within the ocular examination section and unstructured (free-text) data in the EHR. The likelihood that each patient had XFS was estimated using logistic least absolute shrinkage and selection operator (LASSO) regression. The EHR data of all patients were run through the algorithm to generate an XFS probability score for each patient. The algorithm was validated with review of EHRs by glaucoma specialists. Main Outcomes and Measures: Positive predictive value (PPV) and negative predictive value (NPV) of the algorithm were computed as the proportion of patients correctly classified with XFS or without XFS. Results: This study included 122 339 patients, with a mean (SD) age of 52.4 (25.1) years. 
Of these patients, 69 002 (56.4%) were female and 99 579 (81.4%) were white. The algorithm assigned a less than 10% probability of XFS for 121 085 patients (99.0%) as well as an XFS probability score of more than 75% for 543 patients (0.4%), more than 90% for 353 patients (0.3%), and more than 99% for 83 patients (0.07%). Validated by glaucoma specialists, the algorithm had a PPV of 95.0% (95% CI, 89.5%-97.7%) and an NPV of 100% (95% CI, 91.2%-100%). When there was ICD-9 or ICD-10 billing code documentation of XFS, in 86% or 96% of the records, respectively, evidence of XFS was also recorded elsewhere in the EHR. Conversely, when there was clinical examination or free-text evidence of XFS, it was documented with ICD-9 codes only approximately 40% of the time and even less often with ICD-10 codes. Conclusions and Relevance: The algorithm developed, tested, and validated in this study appears to be better at identifying the presence or absence of XFS in EHR data than the conventional approach of assessing only billing codes; such an algorithm may enhance the ability of investigators to use EHR data to study patients with ocular diseases.",

author = "Stein, {Joshua D.} and Moshiur Rahman and Chris Andrews and Ehrlich, {Joshua R.} and Shivani Kamat and Manjool Shah and Boese, {Erin A.} and Woodward, {Maria A.} and Jeff Cowall and Trager, {Edward H.} and Prabha Narayanaswamy and Hanauer, {David A.}",

note = "Funding Information: Funding/Support: This study was supported by grants R01 EY026641 (Dr Stein), K23EY027848 (Dr Ehrlich), and K23EY023596 (Dr Woodward) from the National Institutes of Health; a grant from the Lighthouse Guild; and an unrestricted grant to the University of Michigan Department of Ophthalmology and Visual Sciences from Research to Prevent Blindness. Funding Information: grants from NIH, grants from Lighthouse Guild, and grants from Research to Prevent Blindness during the conduct of the study. Dr Ehrlich reported grants from National Institutes of Health during the conduct of the study; grants from Blue Cross Blue Shield of Michigan Foundation and grants from American Society of Cataract and Refractive Surgery Foundation outside the submitted work. Dr Shah reported personal fees from Allergan and personal fees from Glaukos outside the submitted work. Dr Woodward reported grants from National Eye Institute during the conduct of the study. No other disclosures were reported. Publisher Copyright: {\textcopyright} 2019 American Medical Association. All rights reserved.",

year = "2019",

month = may,

doi = "10.1001/jamaophthalmol.2018.7051",

language = "English (US)",

volume = "137",

pages = "491--497",

journal = "JAMA Ophthalmology",

issn = "2168-6165",

publisher = "American Medical Association",

number = "5",

}

TY - JOUR

T1 - Evaluation of an Algorithm for Identifying Ocular Conditions in Electronic Health Record Data

AU - Stein, Joshua D.

AU - Rahman, Moshiur

AU - Andrews, Chris

AU - Ehrlich, Joshua R.

AU - Kamat, Shivani

AU - Shah, Manjool

AU - Boese, Erin A.

AU - Woodward, Maria A.

AU - Cowall, Jeff

AU - Trager, Edward H.

AU - Narayanaswamy, Prabha

AU - Hanauer, David A.

N1 - Funding Information: Funding/Support: This study was supported by grants R01 EY026641 (Dr Stein), K23EY027848 (Dr Ehrlich), and K23EY023596 (Dr Woodward) from the National Institutes of Health; a grant from the Lighthouse Guild; and an unrestricted grant to the University of Michigan Department of Ophthalmology and Visual Sciences from Research to Prevent Blindness. Funding Information: grants from NIH, grants from Lighthouse Guild, and grants from Research to Prevent Blindness during the conduct of the study. Dr Ehrlich reported grants from National Institutes of Health during the conduct of the study; grants from Blue Cross Blue Shield of Michigan Foundation and grants from American Society of Cataract and Refractive Surgery Foundation outside the submitted work. Dr Shah reported personal fees from Allergan and personal fees from Glaukos outside the submitted work. Dr Woodward reported grants from National Eye Institute during the conduct of the study. No other disclosures were reported. Publisher Copyright: © 2019 American Medical Association. All rights reserved.

PY - 2019/5

Y1 - 2019/5

AB - Importance: For research involving big data, researchers must accurately identify patients with ocular diseases or phenotypes of interest. Reliance on administrative billing codes alone for this purpose is limiting. Objective: To develop a method to accurately identify the presence or absence of ocular conditions of interest using electronic health record (EHR) data. Design, Setting, and Participants: This study is a retrospective analysis of the EHR data of patients (n = 122 339) in the Sight Outcomes Research Collaborative Ophthalmology Data Repository who received eye care at participating academic medical centers between August 1, 2012, and August 31, 2017. An algorithm that searches structured and unstructured (free-text) EHR data for conditions of interest was developed and then tested to determine how well it could detect the presence or absence of exfoliation syndrome (XFS). The algorithm was trained to search for evidence of XFS among a sample of patients with and without XFS (n = 200) by reviewing International Classification of Diseases, Ninth Revision or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-9 or ICD-10) billing codes, the patient's problem list, and text within the ocular examination section and unstructured (free-text) data in the EHR. The likelihood that each patient had XFS was estimated using logistic least absolute shrinkage and selection operator (LASSO) regression. The EHR data of all patients were run through the algorithm to generate an XFS probability score for each patient. The algorithm was validated with review of EHRs by glaucoma specialists. Main Outcomes and Measures: Positive predictive value (PPV) and negative predictive value (NPV) of the algorithm were computed as the proportion of patients correctly classified with XFS or without XFS. Results: This study included 122 339 patients, with a mean (SD) age of 52.4 (25.1) years. Of these patients, 69 002 (56.4%) were female and 99 579 (81.4%) were white. The algorithm assigned a less than 10% probability of XFS for 121 085 patients (99.0%) as well as an XFS probability score of more than 75% for 543 patients (0.4%), more than 90% for 353 patients (0.3%), and more than 99% for 83 patients (0.07%). Validated by glaucoma specialists, the algorithm had a PPV of 95.0% (95% CI, 89.5%-97.7%) and an NPV of 100% (95% CI, 91.2%-100%). When there was ICD-9 or ICD-10 billing code documentation of XFS, in 86% or 96% of the records, respectively, evidence of XFS was also recorded elsewhere in the EHR. Conversely, when there was clinical examination or free-text evidence of XFS, it was documented with ICD-9 codes only approximately 40% of the time and even less often with ICD-10 codes. Conclusions and Relevance: The algorithm developed, tested, and validated in this study appears to be better at identifying the presence or absence of XFS in EHR data than the conventional approach of assessing only billing codes; such an algorithm may enhance the ability of investigators to use EHR data to study patients with ocular diseases.

UR - http://www.scopus.com/inward/record.url?scp=85062045363&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062045363&partnerID=8YFLogxK

U2 - 10.1001/jamaophthalmol.2018.7051

DO - 10.1001/jamaophthalmol.2018.7051

M3 - Article

C2 - 30789656

AN - SCOPUS:85062045363

SN - 2168-6165

VL - 137

SP - 491

EP - 497

JO - JAMA Ophthalmology

JF - JAMA Ophthalmology

IS - 5

ER -
