Automatic inverse treatment planning of Gamma Knife radiosurgery via deep reinforcement learning

Yingzi Liu; Chenyang Shen; Tonghe Wang; Jiahan Zhang; Xiaofeng Yang; Tian Liu; Shannon Kahn; Hui Kuo Shu; Zhen Tian

doi:10.1002/mp.15576

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Purpose: Several inverse planning algorithms have been developed for Gamma Knife (GK) radiosurgery to determine a large number of plan parameters by solving an optimization problem, which typically consists of multiple objectives. The priorities among these objectives need to be repeatedly adjusted to achieve a clinically good plan for each patient. This study aimed to achieve automatic and intelligent priority tuning by developing a deep reinforcement learning (DRL)-based method to model the tuning behaviors of human planners.

Methods: We built a priority-tuning policy network using deep convolutional neural networks. Its input was a vector composed of multiple plan metrics that were used in our institution for GK plan evaluation. The network can determine which tuning action to take based on the observed quality of the intermediate plan. We trained the network using an end-to-end DRL framework to approximate the optimal action-value function. A scoring function was designed to measure plan quality, from which the reward received for a tuning action was calculated.

Results: Vestibular schwannoma was chosen as the test bed in this study. The numbers of training, validation, and testing cases were 5, 5, and 16, respectively. For these three datasets, the average scores of the initial plans obtained with the same initial priority set were 3.63 ± 1.34, 3.83 ± 0.86, and 4.20 ± 0.78, respectively, while they were improved to 5.28 ± 0.23, 4.97 ± 0.44, and 5.22 ± 0.26 through manual priority tuning by human expert planners. Our network achieved competitive results of 5.42 ± 0.11, 5.10 ± 0.42, and 5.28 ± 0.20, respectively.

Conclusions: Our network can generate GK plans of quality comparable to or slightly higher than that of plans generated by human planners via manual priority tuning for vestibular schwannoma cases. The network can potentially be incorporated into the clinical workflow as planning assistance to improve GK planning efficiency and help to reduce plan quality variation caused by interplanner variability. We also hope that our method can reduce the workload of GK planners and allow them to spend more time on more challenging cases.
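
The Methods portion of the abstract can be illustrated with a minimal sketch of value-based priority tuning. This is not the authors' implementation: the action set (scale one priority up or down, or keep all unchanged), the scaling factor, the reward defined as the change in plan score, the discount factor, and every function name below are assumptions made for illustration only. The abstract states only that a scoring function is used to compute the reward and that the network approximates the optimal action-value function.

```python
import random

def make_actions(n_objectives, factor=1.5):
    """Hypothetical action set: keep priorities, or scale one priority up/down."""
    actions = [("keep", None, 1.0)]
    for i in range(n_objectives):
        actions.append(("up", i, factor))
        actions.append(("down", i, 1.0 / factor))
    return actions

def apply_action(priorities, action):
    """Return a new priority list after applying one tuning action."""
    _, idx, scale = action
    tuned = list(priorities)
    if idx is not None:
        tuned[idx] *= scale
    return tuned

def reward(score_before, score_after):
    """Assumed reward: change in plan score caused by the tuning action."""
    return score_after - score_before

def q_target(r, next_q_values, gamma=0.9, terminal=False):
    """Standard Q-learning target used to fit an action-value function."""
    return r if terminal else r + gamma * max(next_q_values)

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In a full training loop, the action values would come from the policy network evaluated on the plan-metric vector, and each chosen action would be followed by re-running the inverse optimization and re-scoring the resulting plan.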

Original language	English (US)
Pages (from-to)	2877-2889
Number of pages	13
Journal	Medical Physics
Volume	49
Issue number	5
DOIs	https://doi.org/10.1002/mp.15576
State	Published - May 2022

Keywords

Gamma Knife radiosurgery
automatic inverse treatment planning
automatic priority tuning
deep reinforcement learning

ASJC Scopus subject areas

Biophysics
Radiology, Nuclear Medicine and Imaging

Access to Document

10.1002/mp.15576

Cite this

@article{21a47d13d0704c77a0b3a3b54b074a74,

title = "Automatic inverse treatment planning of Gamma Knife radiosurgery via deep reinforcement learning",

abstract = "Purpose: Several inverse planning algorithms have been developed for Gamma Knife (GK) radiosurgery to determine a large number of plan parameters by solving an optimization problem, which typically consists of multiple objectives. The priorities among these objectives need to be repeatedly adjusted to achieve a clinically good plan for each patient. This study aimed to achieve automatic and intelligent priority tuning by developing a deep reinforcement learning (DRL)-based method to model the tuning behaviors of human planners. Methods: We built a priority-tuning policy network using deep convolutional neural networks. Its input was a vector composed of multiple plan metrics that were used in our institution for GK plan evaluation. The network can determine which tuning action to take based on the observed quality of the intermediate plan. We trained the network using an end-to-end DRL framework to approximate the optimal action-value function. A scoring function was designed to measure plan quality, from which the reward received for a tuning action was calculated. Results: Vestibular schwannoma was chosen as the test bed in this study. The numbers of training, validation, and testing cases were 5, 5, and 16, respectively. For these three datasets, the average scores of the initial plans obtained with the same initial priority set were 3.63 ± 1.34, 3.83 ± 0.86, and 4.20 ± 0.78, respectively, while they were improved to 5.28 ± 0.23, 4.97 ± 0.44, and 5.22 ± 0.26 through manual priority tuning by human expert planners. Our network achieved competitive results of 5.42 ± 0.11, 5.10 ± 0.42, and 5.28 ± 0.20, respectively. Conclusions: Our network can generate GK plans of quality comparable to or slightly higher than that of plans generated by human planners via manual priority tuning for vestibular schwannoma cases. The network can potentially be incorporated into the clinical workflow as planning assistance to improve GK planning efficiency and help to reduce plan quality variation caused by interplanner variability. We also hope that our method can reduce the workload of GK planners and allow them to spend more time on more challenging cases.",

keywords = "Gamma Knife radiosurgery, automatic inverse treatment planning, automatic priority tuning, deep reinforcement learning",

author = "Yingzi Liu and Chenyang Shen and Tonghe Wang and Jiahan Zhang and Xiaofeng Yang and Tian Liu and Shannon Kahn and Shu, {Hui Kuo} and Zhen Tian",

note = "Publisher Copyright: {\textcopyright} 2022 American Association of Physicists in Medicine.",

year = "2022",

month = may,

doi = "10.1002/mp.15576",

language = "English (US)",

volume = "49",

pages = "2877--2889",

journal = "Medical Physics",

issn = "0094-2405",

publisher = "AAPM - American Association of Physicists in Medicine",

number = "5",

}

TY - JOUR

T1 - Automatic inverse treatment planning of Gamma Knife radiosurgery via deep reinforcement learning

AU - Liu, Yingzi

AU - Shen, Chenyang

AU - Wang, Tonghe

AU - Zhang, Jiahan

AU - Yang, Xiaofeng

AU - Liu, Tian

AU - Kahn, Shannon

AU - Shu, Hui Kuo

AU - Tian, Zhen

PY - 2022/5

Y1 - 2022/5

AB - Purpose: Several inverse planning algorithms have been developed for Gamma Knife (GK) radiosurgery to determine a large number of plan parameters by solving an optimization problem, which typically consists of multiple objectives. The priorities among these objectives need to be repeatedly adjusted to achieve a clinically good plan for each patient. This study aimed to achieve automatic and intelligent priority tuning by developing a deep reinforcement learning (DRL)-based method to model the tuning behaviors of human planners. Methods: We built a priority-tuning policy network using deep convolutional neural networks. Its input was a vector composed of multiple plan metrics that were used in our institution for GK plan evaluation. The network can determine which tuning action to take based on the observed quality of the intermediate plan. We trained the network using an end-to-end DRL framework to approximate the optimal action-value function. A scoring function was designed to measure plan quality, from which the reward received for a tuning action was calculated. Results: Vestibular schwannoma was chosen as the test bed in this study. The numbers of training, validation, and testing cases were 5, 5, and 16, respectively. For these three datasets, the average scores of the initial plans obtained with the same initial priority set were 3.63 ± 1.34, 3.83 ± 0.86, and 4.20 ± 0.78, respectively, while they were improved to 5.28 ± 0.23, 4.97 ± 0.44, and 5.22 ± 0.26 through manual priority tuning by human expert planners. Our network achieved competitive results of 5.42 ± 0.11, 5.10 ± 0.42, and 5.28 ± 0.20, respectively. Conclusions: Our network can generate GK plans of quality comparable to or slightly higher than that of plans generated by human planners via manual priority tuning for vestibular schwannoma cases. The network can potentially be incorporated into the clinical workflow as planning assistance to improve GK planning efficiency and help to reduce plan quality variation caused by interplanner variability. We also hope that our method can reduce the workload of GK planners and allow them to spend more time on more challenging cases.

KW - Gamma Knife radiosurgery

KW - automatic inverse treatment planning

KW - automatic priority tuning

KW - deep reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85127266509&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85127266509&partnerID=8YFLogxK

U2 - 10.1002/mp.15576

DO - 10.1002/mp.15576

M3 - Article

C2 - 35213936

AN - SCOPUS:85127266509

SN - 0094-2405

VL - 49

SP - 2877

EP - 2889

JO - Medical Physics

JF - Medical Physics

IS - 5

ER -
