TY - GEN
T1 - Towards Better Debriefing through Context-aware Video Segmentation in Standardized Patient Encounter Ear Exams
AU - Vedovato, Sol
AU - Kang, Shinyoung
AU - Holcomb, Michael J.
AU - Campbell, Krystle K.
AU - Scott, Daniel J.
AU - Dalton, Thomas O.
AU - Danuser, Gaudenz
AU - Jamieson, Andrew R.
N1 - Publisher Copyright: ©2024 IEEE.
PY - 2024
Y1 - 2024
AB - Evaluating medical students’ physical skills is vital for assessing and validating the acquisition and utilization of their medical knowledge. However, the conventional method of human-based grading for standardized patient encounter videos is costly and error-prone, requires the participation of experts, and may suffer from inter-rater reliability issues and long processing times. Here we propose a deep learning pipeline to identify, extract and score ear exams in over a thousand Simulation Center COSCE medical encounter videos. Our three-stage approach consists of audio-based pre-segmentation with Whisper and Silero VAD, tool detection with CLIP and body node detection with Detectron2. The results of our pipeline are then compared to human graded output and used for automatic extraction of the most relevant video segments. This approach represents a first step toward our overarching goal of expediting and enhancing the quality of the debriefing process following standardized assessments.
UR - http://www.scopus.com/inward/record.url?scp=85192274302&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85192274302&partnerID=8YFLogxK
U2 - 10.1109/AIMHC59811.2024.00036
DO - 10.1109/AIMHC59811.2024.00036
M3 - Conference contribution
AN - SCOPUS:85192274302
T3 - Proceedings - 2024 IEEE 1st International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024
SP - 162
EP - 165
BT - Proceedings - 2024 IEEE 1st International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st IEEE International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024
Y2 - 5 February 2024 through 7 February 2024
ER -