TY - GEN
T1 - Content-based retrieval of video segments from minimally invasive surgery videos using deep convolutional video descriptors and iterative query refinement
AU - Chittajallu, Deepak R.
AU - Basharat, Arslan
AU - Tunison, Paul
AU - Horvath, Samantha
AU - Wells, Katerina O.
AU - Leeds, Steven G.
AU - Fleshman, James W.
AU - Sankaranarayanan, Ganesh
AU - Enquobahrie, Andinet
N1 - Publisher Copyright: © 2019 SPIE.
PY - 2019
Y1 - 2019
N2 - Despite strong evidence of the clinical and economic benefits of minimally invasive surgery (MIS) for many common surgical procedures, there is a gross underutilization of MIS in many US hospitals, potentially due to its steep learning curve. Intraoperative videos captured using a camera inserted into the body during MIS procedures are emerging as an invaluable resource for MIS education, skill assessment, and quality assurance. However, these videos often have a duration of several hours, and there is a pressing need for automated tools to help surgeons quickly find key semantic segments of interest within MIS videos. In this paper, we present a novel integrated approach for facilitating content-based retrieval of video segments that are semantically similar to a query video within a large collection of MIS videos. We use state-of-the-art deep 3D convolutional neural network (CNN) models pre-trained on large public video classification datasets to extract spatiotemporal features from MIS video segments and employ an iterative query refinement (IQR) strategy wherein a support vector machine (SVM) classifier, trained online based on relevance feedback from the user, is used to refine the search results iteratively. We show that our method outperforms the state-of-the-art on the SurgicalActions160 dataset containing 160 video clips of typical surgical actions in gynecologic MIS procedures.
AB - Despite strong evidence of the clinical and economic benefits of minimally invasive surgery (MIS) for many common surgical procedures, there is a gross underutilization of MIS in many US hospitals, potentially due to its steep learning curve. Intraoperative videos captured using a camera inserted into the body during MIS procedures are emerging as an invaluable resource for MIS education, skill assessment, and quality assurance. However, these videos often have a duration of several hours, and there is a pressing need for automated tools to help surgeons quickly find key semantic segments of interest within MIS videos. In this paper, we present a novel integrated approach for facilitating content-based retrieval of video segments that are semantically similar to a query video within a large collection of MIS videos. We use state-of-the-art deep 3D convolutional neural network (CNN) models pre-trained on large public video classification datasets to extract spatiotemporal features from MIS video segments and employ an iterative query refinement (IQR) strategy wherein a support vector machine (SVM) classifier, trained online based on relevance feedback from the user, is used to refine the search results iteratively. We show that our method outperforms the state-of-the-art on the SurgicalActions160 dataset containing 160 video clips of typical surgical actions in gynecologic MIS procedures.
UR - http://www.scopus.com/inward/record.url?scp=85068928837&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068928837&partnerID=8YFLogxK
U2 - 10.1117/12.2509985
DO - 10.1117/12.2509985
M3 - Conference contribution
AN - SCOPUS:85068928837
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2019
A2 - Fei, Baowei
A2 - Linte, Cristian A.
PB - SPIE
T2 - Medical Imaging 2019: Image-Guided Procedures, Robotic Interventions, and Modeling
Y2 - 17 February 2019 through 19 February 2019
ER -