TY - JOUR
T1 - QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions
AU - Goodarzi, Mohammad
AU - Jensen, Richard
AU - Vander Heyden, Yvan
PY - 2012
Y1 - 2012
N2 - A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work generated QSRR models for these drugs using Classification And Regression Trees (CART). In this work, Ant Colony Optimization (ACO) is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods were applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate ACO as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental logarithms of the retention factors of the drugs (log kw), i.e. those extrapolated to a mobile phase consisting of pure water, and the predicted ones. The overall best model was the SVM built using descriptors selected by ACO.
AB - A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work generated QSRR models for these drugs using Classification And Regression Trees (CART). In this work, Ant Colony Optimization (ACO) is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods were applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate ACO as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental logarithms of the retention factors of the drugs (log kw), i.e. those extrapolated to a mobile phase consisting of pure water, and the predicted ones. The overall best model was the SVM built using descriptors selected by ACO.
KW - ACO
KW - Chromatographic retention
KW - MLR
KW - QSRR
KW - Relief method
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=84870238114&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870238114&partnerID=8YFLogxK
U2 - 10.1016/j.jchromb.2012.01.012
DO - 10.1016/j.jchromb.2012.01.012
M3 - Article
C2 - 22341354
AN - SCOPUS:84870238114
SN - 1570-0232
VL - 910
SP - 84
EP - 94
JO - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
JF - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
ER -