SNOMED CT concept hierarchies for computable clinical phenotypes from electronic health record data: Comparison of intensional versus extensional value sets

Ling Chu, Vaishnavi Kannan, Mujeeb A. Basit, Diane J. Schaeflein, Adolfo R. Ortuzar, Jimmie F. Glorioso, Joel R. Buchanan, Duwayne L. Willett

Research output: Contribution to journal, Article, peer-review

10 Scopus citations


Background: Defining clinical phenotypes from electronic health record (EHR)-derived data is crucial for clinical decision support, population health endeavors, and translational research. EHR diagnoses now commonly draw from a finely grained clinical terminology, either native SNOMED CT or a vendor-supplied terminology mapped to SNOMED CT concepts as the standard for EHR interoperability. Accordingly, electronic clinical quality measures (eCQMs) increasingly define clinical phenotypes with SNOMED CT value sets. Creating and maintaining list-based value sets is daunting, as is ensuring that their contents accurately represent the clinically intended condition.

Objective: The goal of the research was to compare an intensional (concept hierarchy-based) versus extensional (list-based) value set approach to defining clinical phenotypes using SNOMED CT-encoded data from EHRs by evaluating value set conciseness, time to create, and completeness.

Methods: Starting from published Centers for Medicare and Medicaid Services (CMS) high-priority eCQMs, we selected 10 clinical conditions referenced by those eCQMs. For each, the published SNOMED CT list-based (extensional) value set was downloaded from the Value Set Authority Center (VSAC). Ten corresponding SNOMED CT hierarchy-based intensional value sets for the same conditions were identified within our EHR. From each hierarchy-based intensional value set, an exactly equivalent full extensional value set was derived by enumerating all included descendant SNOMED CT concepts. Comparisons were then made between (1) VSAC-downloaded list-based (extensional) value sets, (2) corresponding hierarchy-based intensional value sets for the same conditions, and (3) derived list-based (extensional) value sets exactly equivalent to the hierarchy-based intensional value sets. Value set conciseness was assessed by the number of SNOMED CT concepts needed for definition. Time to construct the value sets for local use was measured. Value set completeness was assessed by comparing the contents of the downloaded extensional value sets with those of the intensional value sets. Completeness was measured in two ways: for individual SNOMED CT concepts and for the mapped diagnosis clinical terms available for selection within the EHR by clinicians.

Results: The 10 hierarchy-based intensional value sets proved far simpler and faster to construct than the exactly equivalent derived extensional value set lists, requiring a median of 3 versus 78 concepts to define and 5 versus 37 minutes to build. The hierarchy-based intensional value sets also proved more complete: in comparison, the 10 downloaded 2018 extensional value sets contained a median of just 35% of the intensional value sets' SNOMED CT concepts and 65% of mapped EHR clinical terms.

Conclusions: In the EHR era, conditions should preferentially be defined with SNOMED CT concept hierarchy-based (intensional) value sets rather than extensional lists. By doing so, clinical guideline and eCQM authors can more readily engage specialists in vetting condition subtypes to include and exclude, and can streamline broad EHR implementation of condition-specific decision support that promotes guideline adherence for patient benefit.
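The derivation and completeness comparison described in the Methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy hierarchy, the placeholder concept labels (not real SNOMED CT identifiers), the include/exclude shape of the intensional definition, and the function names. It is not the authors' implementation; it only shows how a short hierarchy-based definition can expand into a longer extensional list, and how the share of that list covered by a separately maintained list could be computed.

```python
from collections import deque

# Hypothetical toy is-a hierarchy: parent -> direct children.
# Labels are illustrative placeholders, not real SNOMED CT identifiers.
CHILDREN = {
    "condition": ["subtype_a", "subtype_b", "subtype_c"],
    "subtype_a": ["subtype_a1", "subtype_a2"],
    "subtype_b": ["subtype_b1"],
    "subtype_c": ["subtype_c1"],
}


def descendants_and_self(root, children):
    """Return the root plus all of its descendants (breadth-first walk).

    The `seen` set keeps the walk correct when a concept has several parents.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def expand(include_roots, exclude_roots, children):
    """Derive an extensional concept set from an intensional definition.

    Assumption: an excluded root removes its whole subtree, even for
    concepts that are also reachable from an included root.
    """
    included = set()
    for root in include_roots:
        included |= descendants_and_self(root, children)
    for root in exclude_roots:
        included -= descendants_and_self(root, children)
    return included


def completeness(candidate, reference):
    """Fraction of the reference set's concepts that the candidate contains."""
    if not reference:
        raise ValueError("reference value set is empty")
    return len(candidate & reference) / len(reference)


# Intensional definition: two concepts are enough to define the value set.
include_roots = ["condition"]
exclude_roots = ["subtype_c"]
derived = expand(include_roots, exclude_roots, CHILDREN)

# Hypothetical hand-maintained list, standing in for a downloaded value set.
downloaded = {"condition", "subtype_a", "subtype_b1"}
share = completeness(downloaded, derived)

print(len(include_roots) + len(exclude_roots), "defining concepts")
print(len(derived), "concepts in the derived extensional list")
print(f"{share:.0%} of derived concepts present in the downloaded list")
```

In this toy case the two-concept definition expands to six concepts, and the hand-maintained list covers half of them. The same two quantities (number of defining concepts, and share of concepts covered) correspond to the conciseness and completeness measures reported in the Results.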

Original language: English (US)
Article number: e11487
Journal: JMIR Medical Informatics
Issue number: 1
State: Published - Jan 2019


Keywords

  • Clinical phenotypes
  • Population health
  • Pragmatic clinical study
  • Value sets

ASJC Scopus subject areas

  • Health Informatics
  • Health Information Management

