TY - JOUR
T1 - Vision Transformer-Based Multilabel Survival Prediction for Oropharynx Cancer After Radiation Therapy
AU - Chen, Meixu
AU - Wang, Kai
AU - Wang, Jing
N1 - Publisher Copyright: © 2023 Elsevier Inc.
PY - 2024/3/15
Y1 - 2024/3/15
AB - Purpose: A reliable and comprehensive cancer prognosis model for oropharyngeal cancer (OPC) could better assist in personalizing treatment. In this work, we developed a vision transformer-based (ViT-based) multilabel model with multimodal input to learn complementary information from available pretreatment data and predict multiple associated endpoints for radiation therapy for patients with OPC. Methods and Materials: A publicly available data set of 512 patients with OPC was used for both model training and evaluation. Planning computed tomography images, primary gross tumor volume masks, and 16 clinical variables representing patient demographics, diagnosis, and treatment were used as inputs. To extract deep image features with global attention, we used a ViT module. Clinical variables were concatenated with the learned image features and fed into fully connected layers to incorporate cross-modality features. To learn the mapping between the features and correlated survival outcomes, including overall survival, local failure-free survival, regional failure-free survival, and distant failure-free survival, we employed 4 multitask logistic regression layers. The proposed model was optimized by combining the multitask logistic regression negative-log likelihood losses of different prediction targets. Results: We employed the C-index and area under the curve metrics to assess the performance of our model for time-to-event prediction and time-specific binary prediction, respectively. Our proposed model outperformed corresponding single-modality and single-label models on all prediction labels, achieving C-indices of 0.773, 0.765, 0.776, and 0.773 for overall survival, local failure-free survival, regional failure-free survival, and distant failure-free survival, respectively. The area under the curve values ranged between 0.799 and 0.844 for different tasks at different time points. Using the medians of predicted risks as the thresholds to identify high-risk and low-risk patient groups, we performed the log-rank test, the results of which showed significantly larger separations in different event-free survivals. Conclusion: We developed the first model capable of predicting multiple labels for OPC simultaneously. Our model demonstrated better prognostic ability for all the prediction targets compared with corresponding single-modality models and single-label models.
UR - http://www.scopus.com/inward/record.url?scp=85180336481&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85180336481&partnerID=8YFLogxK
U2 - 10.1016/j.ijrobp.2023.10.022
DO - 10.1016/j.ijrobp.2023.10.022
M3 - Article
C2 - 37939732
AN - SCOPUS:85180336481
SN - 0360-3016
VL - 118
SP - 1123
EP - 1134
JO - International Journal of Radiation Oncology Biology Physics
JF - International Journal of Radiation Oncology Biology Physics
IS - 4
ER -