Statistical Validation Based on Parametric Receiver Operating Characteristic Analysis of Continuous Classification Data

Kelly H. Zou, Simon K. Warfield, Julia R. Fielding, Clare M.C. Tempany, William M. Wells, Michael R. Kaus, Ferenc A. Jolesz, Ron Kikinis

Research output: Contribution to journal › Article › peer-review

26 Scopus citations


Rationale and Objectives. The accuracy of diagnostic tests and imaging segmentation is important in clinical practice because it has a direct impact on therapeutic planning. Statistical validation of classification accuracy was conducted based on parametric receiver operating characteristic analysis and is illustrated with three radiologic examples.

Materials and Methods. Two parametric models were developed for diagnostic or imaging data. Example 1: A semiautomated fractional segmentation algorithm was applied to magnetic resonance imaging of nine cases of brain tumors. The tumor and background pixel data were assumed to have bi-beta distributions. Fractional segmentation was validated against an estimated composite pixel-wise gold standard based on multi-reader manual segmentations. Example 2: Ureteral stone sizes measured on spiral computed tomography in 100 cases were modeled as binormal after a nonlinear transformation, and their predictive value was assessed with respect to the two treatment options received. Example 3: One hundred eighty cases had prostate-specific antigen levels measured in a prospective clinical trial. Radical prostatectomy was performed in all cases to provide a binary gold standard of local and advanced cancer stages. Prostate-specific antigen level was transformed and modeled by binormal distributions. In all examples, areas under the receiver operating characteristic curves were computed.

Results. The areas under the receiver operating characteristic curves were: Example 1: Fractional segmentation of magnetic resonance imaging of brain tumors: meningiomas (0.924-0.984); astrocytomas (0.786-0.986); and other low-grade gliomas (0.896-0.983). Example 2: Ureteral stone size for treatment planning (0.813). Example 3: Prostate-specific antigen for staging prostate cancer (0.768).

Conclusion. All clinical examples yielded fair to excellent accuracy. The validation metric, area under the receiver operating characteristic curve, may be generalized to evaluating the performance of several continuous classifiers related to imaging.
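As a rough illustration of how an area under the curve follows from the two parametric models named in the abstract, the sketch below computes it for a binormal and a bi-beta pair of score distributions. It is not the authors' estimation procedure, which the abstract does not describe. The function names and all parameter values are hypothetical, and the bi-beta integral uses a simple midpoint rule that assumes shape parameters of at least 1.

```python
import math


def binormal_auc(mu0, sigma0, mu1, sigma1):
    """AUC when negatives ~ N(mu0, sigma0^2) and positives ~ N(mu1, sigma1^2).

    Closed form: AUC = Phi((mu1 - mu0) / sqrt(sigma0^2 + sigma1^2)).
    """
    z = (mu1 - mu0) / math.sqrt(sigma0 ** 2 + sigma1 ** 2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def bibeta_auc(a0, b0, a1, b1, n=20000):
    """AUC when negatives ~ Beta(a0, b0) and positives ~ Beta(a1, b1).

    AUC = P(X1 > X0) = integral over (0, 1) of F0(x) * f1(x) dx,
    approximated with a midpoint rule on n equal cells.
    Assumes all shape parameters are >= 1 (no endpoint singularities).
    """
    h = 1.0 / n
    cdf0 = 0.0  # running estimate of F0 at the left edge of the cell
    auc = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        mass0 = beta_pdf(x, a0, b0) * h
        mass1 = beta_pdf(x, a1, b1) * h
        # F0 at the cell midpoint is the left-edge CDF plus half the cell mass.
        auc += (cdf0 + 0.5 * mass0) * mass1
        cdf0 += mass0
    return auc


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the functions.
    print(binormal_auc(0.0, 1.0, 1.0, 1.0))
    print(bibeta_auc(2.0, 5.0, 5.0, 2.0))
```

Identical distributions for the two classes give an area of 0.5 under either model, which is a convenient sanity check before fitted parameters are plugged in.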

Original language: English (US)
Pages (from-to): 1359-1368
Number of pages: 10
Journal: Academic Radiology
Issue number: 12
State: Published - Dec 2003


  • Brain segmentation
  • Computed tomography
  • Genitourinary system
  • Magnetic resonance
  • Prostate specific antigen (PSA)
  • Receiver operating characteristic (ROC) analysis

ASJC Scopus subject areas

  • Radiology, Nuclear Medicine and Imaging

