Stochastic cross validation

Lu Xu, Hai Yan Fu, Mohammad Goodarzi, Chen Bo Cai, Qiao Bo Yin, Ya Wu, Bang Cheng Tang, Yuan Bin She

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Cross validation (CV) is one of the most commonly used methods for estimating model complexity in partial least squares (PLS). In this study, stochastic cross validation (SCV) was proposed as a novel CV strategy in which the percentage of left-out objects (PLOO) is treated as a random number rather than a fixed value. Two SCV strategies were proposed: SCV with uniformly distributed PLOO (SCV-U) and SCV with normally distributed PLOO (SCV-N). SCV-U can be viewed as a hybrid of leave-one-out CV (LOOCV), k-fold CV and Monte Carlo CV (MCCV). The rationale behind SCV-N is that large perturbations of the original training set should occur with small probability. SCV is expected to provide more flexibility in data splitting, both for exploring and learning from the data set and for internally evaluating a fitted model. SCV-U and SCV-N were used for PLS calibration of three real data sets and one simulated data set, and were compared with LOOCV, k-fold CV and MCCV. Given a training set and an external validation set, the different CV techniques were applied repeatedly to estimate the optimal model complexity, and the resulting predictions were compared. The results indicate that SCV-U and SCV-N could provide useful alternatives to the traditional CV methods and that SCV is less sensitive to the value of PLOO.
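
The splitting idea described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the PLOO ranges or distribution parameters, so the bounds, the standard deviation, the half-normal form used for the SCV-N-style sampler, and all function names (`uniform_ploo`, `half_normal_ploo`, `scv_splits`) are assumptions made here for illustration.

```python
import random


def uniform_ploo(low=0.05, high=0.5):
    """SCV-U-style sampler: PLOO ~ Uniform(low, high). Bounds are assumed."""
    return lambda rng: rng.uniform(low, high)


def half_normal_ploo(sd=0.15, high=0.5):
    """SCV-N-style sampler (assumed form): |N(0, sd)| capped at `high`,
    so large perturbations of the training set are drawn only rarely."""
    return lambda rng: min(abs(rng.gauss(0.0, sd)), high)


def scv_splits(n_objects, n_repeats, ploo_sampler, seed=0):
    """Yield (train_idx, left_out_idx) pairs, drawing a fresh PLOO per repeat."""
    if n_objects < 2:
        raise ValueError("need at least two objects to split")
    rng = random.Random(seed)
    for _ in range(n_repeats):
        ploo = ploo_sampler(rng)  # fraction of objects to leave out
        # Leave out at least one object and keep at least one for training.
        n_out = min(max(1, round(ploo * n_objects)), n_objects - 1)
        left_out = sorted(rng.sample(range(n_objects), n_out))
        left_out_set = set(left_out)
        train = [i for i in range(n_objects) if i not in left_out_set]
        yield train, left_out


if __name__ == "__main__":
    for train, left_out in scv_splits(10, 3, uniform_ploo()):
        print(len(train), "training objects; left out:", left_out)
```

In a calibration setting, each split would be used to fit a PLS model on the training indices for every candidate number of components and to accumulate the prediction error on the left-out indices; the complexity with the lowest accumulated error would then be selected.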

Original language: English (US)
Pages (from-to): 74-81
Number of pages: 8
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 175
State: Published - Apr 15 2018

Keywords

  • Cross validation (CV)
  • Model complexity
  • Multivariate calibration
  • Partial least squares (PLS)
  • Stochastic cross validation (SCV)

ASJC Scopus subject areas

  • Software
  • Analytical Chemistry
  • Process Chemistry and Technology
  • Spectroscopy
  • Computer Science Applications
