Performance evaluation of preprocessing techniques utilizing expert information in multivariate calibration

Sandeep Sharma; Mohammad Goodarzi; Herman Ramon; Wouter Saeys

doi:10.1016/j.talanta.2013.12.053

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Partial Least Squares (PLS) regression is one of the most used methods for extracting chemical information from Near Infrared (NIR) spectroscopic measurements. The success of a PLS calibration relies largely on the representativeness of the calibration data set. This is not trivial, because not only the expected variation in the analyte of interest, but also the variation of other contributing factors (interferents) should be included in the calibration data. This also implies that changes in interferent concentrations not covered in the calibration step can deteriorate the prediction ability of the calibration model. Several researchers have suggested that PLS models can be robustified against changes in the interferent structure by incorporating expert knowledge in the preprocessing step with the aim to efficiently filter out the spectral influence of the spectral interferents. However, these methods have not yet been compared against each other. Therefore, in the present study, various preprocessing techniques exploiting expert knowledge were compared on two experimental data sets. In both data sets, the calibration and test set were designed to have a different interferent concentration range. The performance of these techniques was compared to that of preprocessing techniques which do not use any expert knowledge. Using expert knowledge was found to improve the prediction performance for both data sets. For data set-1, the prediction error improved nearly 32% when pure component spectra of the analyte and the interferents were used in the Extended Multiplicative Signal Correction framework. Similarly, for data set-2, nearly 63% improvement in the prediction error was observed when the interferent information was utilized in Spectral Interferent Subtraction preprocessing.

Original language	English (US)
Pages (from-to)	105-112
Number of pages	8
Journal	Talanta
Volume	121
DOIs	https://doi.org/10.1016/j.talanta.2013.12.053
State	Published - Apr 2014

Keywords

Extended Multiplicative Signal Correction
External Parameter Orthogonalization
Generalized Least Squares Weighting
Glucose
Pure component spectrum
Spectral Interference Subtraction

ASJC Scopus subject areas

Analytical Chemistry

Cite this

@article{3a571ff6dde94f869e6e9ae4d96190d6,

title = "Performance evaluation of preprocessing techniques utilizing expert information in multivariate calibration",

abstract = "Partial Least Squares (PLS) regression is one of the most used methods for extracting chemical information from Near Infrared (NIR) spectroscopic measurements. The success of a PLS calibration relies largely on the representativeness of the calibration data set. This is not trivial, because not only the expected variation in the analyte of interest, but also the variation of other contributing factors (interferents) should be included in the calibration data. This also implies that changes in interferent concentrations not covered in the calibration step can deteriorate the prediction ability of the calibration model. Several researchers have suggested that PLS models can be robustified against changes in the interferent structure by incorporating expert knowledge in the preprocessing step with the aim to efficiently filter out the spectral influence of the spectral interferents. However, these methods have not yet been compared against each other. Therefore, in the present study, various preprocessing techniques exploiting expert knowledge were compared on two experimental data sets. In both data sets, the calibration and test set were designed to have a different interferent concentration range. The performance of these techniques was compared to that of preprocessing techniques which do not use any expert knowledge. Using expert knowledge was found to improve the prediction performance for both data sets. For data set-1, the prediction error improved nearly 32% when pure component spectra of the analyte and the interferents were used in the Extended Multiplicative Signal Correction framework. Similarly, for data set-2, nearly 63% improvement in the prediction error was observed when the interferent information was utilized in Spectral Interferent Subtraction preprocessing.",

keywords = "Extended Multiplicative Signal Correction, External Parameter Orthogonalization, Generalized Least Squares Weighting, Glucose, Pure component spectrum, Spectral Interference Subtraction",

author = "Sandeep Sharma and Mohammad Goodarzi and Herman Ramon and Wouter Saeys",

note = "Funding Information: The authors gratefully acknowledge the Institute for the Promotion of Innovation through Science and Technology (IWT-Vlaanderen) for the financial support through the GlucoSens project ( SB-090053 ). The authors also thank Dr. Tormod Naes, Bjorg Narum and Dr. Tomas Isaksson, Nofima, {\AA}s, Norway for providing the powder mixture data set.",

year = "2014",

month = apr,

doi = "10.1016/j.talanta.2013.12.053",

language = "English (US)",

volume = "121",

pages = "105--112",

journal = "Talanta",

issn = "0039-9140",

publisher = "Elsevier",

}

TY - JOUR

T1 - Performance evaluation of preprocessing techniques utilizing expert information in multivariate calibration

AU - Sharma, Sandeep

AU - Goodarzi, Mohammad

AU - Ramon, Herman

AU - Saeys, Wouter

N1 - Funding Information: The authors gratefully acknowledge the Institute for the Promotion of Innovation through Science and Technology (IWT-Vlaanderen) for the financial support through the GlucoSens project ( SB-090053 ). The authors also thank Dr. Tormod Naes, Bjorg Narum and Dr. Tomas Isaksson, Nofima, Ås, Norway for providing the powder mixture data set.

PY - 2014/4

Y1 - 2014/4

N2 - Partial Least Squares (PLS) regression is one of the most used methods for extracting chemical information from Near Infrared (NIR) spectroscopic measurements. The success of a PLS calibration relies largely on the representativeness of the calibration data set. This is not trivial, because not only the expected variation in the analyte of interest, but also the variation of other contributing factors (interferents) should be included in the calibration data. This also implies that changes in interferent concentrations not covered in the calibration step can deteriorate the prediction ability of the calibration model. Several researchers have suggested that PLS models can be robustified against changes in the interferent structure by incorporating expert knowledge in the preprocessing step with the aim to efficiently filter out the spectral influence of the spectral interferents. However, these methods have not yet been compared against each other. Therefore, in the present study, various preprocessing techniques exploiting expert knowledge were compared on two experimental data sets. In both data sets, the calibration and test set were designed to have a different interferent concentration range. The performance of these techniques was compared to that of preprocessing techniques which do not use any expert knowledge. Using expert knowledge was found to improve the prediction performance for both data sets. For data set-1, the prediction error improved nearly 32% when pure component spectra of the analyte and the interferents were used in the Extended Multiplicative Signal Correction framework. Similarly, for data set-2, nearly 63% improvement in the prediction error was observed when the interferent information was utilized in Spectral Interferent Subtraction preprocessing.

KW - Extended Multiplicative Signal Correction

KW - External Parameter Orthogonalization

KW - Generalized Least Squares Weighting

KW - Glucose

KW - Pure component spectrum

KW - Spectral Interference Subtraction

UR - http://www.scopus.com/inward/record.url?scp=84892714584&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892714584&partnerID=8YFLogxK

U2 - 10.1016/j.talanta.2013.12.053

DO - 10.1016/j.talanta.2013.12.053

M3 - Article

C2 - 24607116

AN - SCOPUS:84892714584

SN - 0039-9140

VL - 121

SP - 105

EP - 112

JO - Talanta

JF - Talanta

ER -
