TY - JOUR
T1 - Identification of breast cancer prognosis markers via integrative analysis
AU - Ma, Shuangge
AU - Dai, Ying
AU - Huang, Jian
AU - Xie, Yang
N1 - Funding Information:
The authors would like to thank the associate editor and a referee for careful review and insightful comments. This study has been supported by awards CA120988, CA152301 and CA142774 from NIH and DMS-0904181 from NSF.
PY - 2012/9
Y1 - 2012/9
AB - In breast cancer research, it is of great interest to identify genomic markers associated with prognosis. Multiple gene profiling studies have been conducted for such a purpose. Genomic markers identified from the analysis of single datasets often do not have satisfactory reproducibility. Among the multiple possible reasons, the most important one is the small sample sizes of individual studies. A cost-effective solution is to pool data from multiple comparable studies and conduct integrative analysis. In this study, we collect four breast cancer prognosis studies with gene expression measurements. We describe the relationship between prognosis and gene expressions using the accelerated failure time (AFT) models. We adopt a 2-norm group bridge penalization approach for marker identification. This integrative analysis approach can effectively identify markers with consistent effects across multiple datasets and naturally accommodate the heterogeneity among studies. Statistical and simulation studies demonstrate satisfactory performance of this approach. Breast cancer prognosis markers identified using this approach have sound biological implications and satisfactory prediction performance.
KW - 2-norm group bridge
KW - Breast cancer prognosis
KW - Gene expression
KW - Integrative analysis
KW - Marker identification
UR - http://www.scopus.com/inward/record.url?scp=84862804849&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862804849&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2012.02.017
DO - 10.1016/j.csda.2012.02.017
M3 - Article
C2 - 22773869
AN - SCOPUS:84862804849
SN - 0167-9473
VL - 56
SP - 2718
EP - 2728
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
IS - 9
ER -