Related papers: Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment

URL: http://arxiv.org/abs/2411.10919v1
Date: Sun, 17 Nov 2024 00:13:00 GMT
Title: Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment
Authors: Arushi Gupta, Rafal Kocielnik, Jiayun Wang, Firdavs Nasriddinov, Cherine Yang, Elyssa Wong, Anima Anandkumar, Andrew Hung
Abstract summary: We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness. Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes. Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
Score: 66.6041949490137
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: During surgical training, real-time feedback from trainers to trainees is important for preventing errors and enhancing long-term skill acquisition. Accurately predicting the effectiveness of this feedback, specifically whether it leads to a change in trainee behavior, is crucial for developing methods for improving surgical training and education. However, relying on human annotations to assess feedback effectiveness is laborious and prone to biases, underscoring the need for an automated, scalable, and objective method. Creating such an automated system poses challenges, as it requires an understanding of both the verbal feedback delivered by the trainer and the visual context of the real-time surgical scene. To address this, we propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness. Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes, and their combination achieves an AUROC of 0.70+/-0.02, improving prediction accuracy by up to 6.6%. Additionally, we introduce self-supervised fine-tuning as a strategy for enhancing surgical video representation learning, which is scalable and further enhances prediction performance. Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.

Related papers

Explainable AI for Automated User-specific Feedback in Surgical Skill Acquisition [38.38538970682482]
We examine the effectiveness of explainable AI (XAI)-generated feedback in surgical training through a human-AI study. We compare the impact of XAI-guided feedback against traditional video-based coaching on task outcomes, cognitive load, and trainees' perceptions of AI-assisted learning.
arXiv Detail & Related papers (2025-08-04T16:48:44Z)
An Automated Machine Learning Framework for Surgical Suturing Action Detection under Class Imbalance [1.2043621020930133]
Real-time detection of surgical actions with interpretable outputs is crucial for automated and real-time instructional feedback and skill development. This paper presents a rapid deployment approach utilizing automated machine learning methods, based on surgical action data collected from both experienced and trainee surgeons.
arXiv Detail & Related papers (2025-02-10T12:47:36Z)
Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment [65.70317151363204]
This work introduces the first framework for reconstructing surgical dialogue from unstructured real-world recordings. In surgical training, the formative verbal feedback that trainers provide to trainees during live surgeries is crucial for ensuring safety, correcting behavior immediately, and facilitating long-term skill acquisition. Our framework integrates voice activity detection, speaker diarization, and automated speech recognition, with a novel enhancement that removes hallucinations.
arXiv Detail & Related papers (2024-12-01T10:35:12Z)
Video-based Surgical Skill Assessment using Tree-based Gaussian Process Classifier [2.3964255330849356]
This paper presents a novel pipeline for automated surgical skill assessment using video data. The pipeline incorporates a representation flow convolutional neural network and a novel tree-based Gaussian process classifier. The proposed method has the potential to facilitate skill improvement among surgery fellows and enhance patient safety.
arXiv Detail & Related papers (2023-12-15T21:06:22Z)
Deep Multimodal Fusion for Surgical Feedback Classification [70.53297887843802]
We leverage a clinically-validated five-category classification of surgical feedback. We then develop a multi-label machine learning model to classify these five categories of surgical feedback from inputs of text, audio, and video modalities. The ultimate goal of our work is to help automate the annotation of real-time contextual surgical feedback at scale.
arXiv Detail & Related papers (2023-12-06T01:59:47Z)
Design, Development, and Evaluation of an Interactive Personalized Social Robot to Monitor and Coach Post-Stroke Rehabilitation Exercises [68.37238218842089]
We develop an interactive social robot exercise coaching system for personalized rehabilitation. This system integrates a neural network model with a rule-based model to automatically monitor and assess patients' rehabilitation exercises. Our system can adapt to new participants and achieved 0.81 average performance to assess their exercises, which is comparable to the experts' agreement level.
arXiv Detail & Related papers (2023-05-12T17:37:04Z)
Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing [53.096237570992294]
Strategy training is a rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke. Standardized fidelity assessment is used to measure adherence to treatment principles. We developed a rule-based NLP algorithm, a long short-term memory (LSTM) model, and a Bidirectional Encoder Representations from Transformers (BERT) model for this task.
arXiv Detail & Related papers (2022-09-14T15:33:30Z)
Video-based Surgical Skills Assessment using Long term Tool Tracking [0.3324986723090368]
We introduce a motion-based approach to automatically assess surgical skills from surgical case video feed. The proposed pipeline first tracks surgical tools reliably to create motion trajectories. We compare transformer-based skill assessment with traditional machine learning approaches using the proposed and state-of-the-art tracking.
arXiv Detail & Related papers (2022-07-05T18:15:28Z)
Video-based Formative and Summative Assessment of Surgical Tasks using Deep Learning [0.8612287536028312]
We propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution. Formative assessment is generated using heatmaps of visual features that correlate with surgical performance.
arXiv Detail & Related papers (2022-03-17T20:07:48Z)
Opportunities of a Machine Learning-based Decision Support System for Stroke Rehabilitation Assessment [64.52563354823711]
Rehabilitation assessment is critical to determine an adequate intervention for a patient. Current practices of assessment mainly rely on therapist's experience, and assessment is infrequently executed due to the limited availability of a therapist. We developed an intelligent decision support system that can identify salient features of assessment using reinforcement learning.
arXiv Detail & Related papers (2020-02-27T17:04:07Z)
Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework [51.237191651923666]
We investigate the potential of agent learning from trainers' facial expressions via interpreting them as evaluative feedback. With a designed CNN-RNN model, our analysis shows that telling trainers to use facial expressions and competition can improve the accuracies for estimating positive and negative feedback. Our results with a simulation experiment show that learning solely from predicted feedback based on facial expressions is possible.
arXiv Detail & Related papers (2020-01-23T17:50:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.