AI-Driven Evaluation of Surgical Skill via Action Recognition
- URL: http://arxiv.org/abs/2512.24411v1
- Date: Tue, 30 Dec 2025 18:45:34 GMT
- Title: AI-Driven Evaluation of Surgical Skill via Action Recognition
- Authors: Yan Meng, Daniel A. Donoho, Marcelle Altshuler, Omar Arnaout,
- Abstract summary: We propose an AI-driven framework for the automated assessment of microanastomosis performance. Performance is evaluated along five aspects of microanastomosis skill, including overall action execution, motion quality during procedure-critical actions, and general instrument handling.
- Score: 4.92174988745803
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The development of effective training and evaluation strategies is critical. Conventional methods for assessing surgical proficiency typically rely on expert supervision, either through onsite observation or retrospective analysis of recorded procedures. However, these approaches are inherently subjective, susceptible to inter-rater variability, and require substantial time and effort from expert surgeons. These demands are often impractical in low- and middle-income countries, thereby limiting the scalability and consistency of such methods across training programs. To address these limitations, we propose a novel AI-driven framework for the automated assessment of microanastomosis performance. The system integrates a video transformer architecture based on TimeSformer, improved with hierarchical temporal attention and weighted spatial attention mechanisms, to achieve accurate action recognition within surgical videos. Fine-grained motion features are then extracted using a YOLO-based object detection and tracking method, allowing for detailed analysis of instrument kinematics. Performance is evaluated along five aspects of microanastomosis skill, including overall action execution, motion quality during procedure-critical actions, and general instrument handling. Experimental validation using a dataset of 58 expert-annotated videos demonstrates the effectiveness of the system, achieving 87.7% frame-level accuracy in action segmentation that increased to 93.62% with post-processing, and an average classification accuracy of 76% in replicating expert assessments across all skill aspects. These findings highlight the system's potential to provide objective, consistent, and interpretable feedback, thereby enabling more standardized, data-driven training and evaluation in surgical education.
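The abstract reports frame-level accuracy before and after post-processing but does not say what the post-processing step is. As a rough illustration only, the sketch below computes frame-level accuracy and applies a sliding-window majority vote to per-frame action labels, which is one common way to suppress isolated misclassifications. The function names, the window size, and the choice of majority voting are assumptions made for this example, not the authors' method.

```python
from collections import Counter


def frame_accuracy(pred, truth):
    """Fraction of frames whose predicted label equals the annotated label."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    if not truth:
        raise ValueError("sequences must be non-empty")
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def majority_smooth(labels, window=5):
    """Replace each label by the most frequent label in a centered window.

    Hypothetical post-processing step (assumption): the paper does not
    specify its own smoothing procedure.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        best = max(counts.values())
        winners = [label for label, c in counts.items() if c == best]
        # On a tie, keep the original label if it is among the winners.
        smoothed.append(labels[i] if labels[i] in winners else winners[0])
    return smoothed
```

On a toy sequence with two isolated errors, smoothing with `window=3` removes both, so accuracy rises after post-processing. Real surgical videos would need a window tuned to the frame rate and to typical action durations.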
Related papers
- An AI Framework for Microanastomosis Motion Assessment [3.9524886416531753]
We propose a novel AI framework for the automated assessment of microanastomosis instrument handling skills. The system integrates four core components: (1) an instrument detection module based on the You Only Look Once (YOLO) architecture; (2) an instrument tracking module developed from Deep Simple Online and Realtime Tracking (DeepSORT); and (3) an instrument tip localization module employing shape descriptors. Experimental results demonstrate the effectiveness of the framework, achieving an instrument detection precision of 97%, with a mean Average Precision (mAP) of 96%, measured by Intersection over Union (IoU) thresholds ranging from 50% to 95% (m
arXiv Detail & Related papers (2026-01-28T23:23:37Z) - Kinematic-Based Assessment of Surgical Actions in Microanastomosis [4.92174988745803]
We introduce an AI-driven framework for automated action segmentation and performance assessment in microanastomosis procedures. A dataset of 58 expert-rated microanastomosis videos demonstrates the effectiveness of our approach.
arXiv Detail & Related papers (2025-12-30T02:18:49Z) - Quantitative Outcome-Oriented Assessment of Microsurgical Anastomosis [7.432334662327386]
We introduce a quantitative framework that uses image-processing techniques for objective assessment of microsurgical anastomoses. The approach uses geometric modeling of errors along with a detection and scoring mechanism. The results show that the geometric metrics effectively replicate expert raters' scoring for the errors considered in this work.
arXiv Detail & Related papers (2025-08-26T09:14:31Z) - An Automated Machine Learning Framework for Surgical Suturing Action Detection under Class Imbalance [1.2043621020930133]
Real-time detection of surgical actions with interpretable outputs is crucial for automated and real-time instructional feedback and skill development. This paper presents a rapid deployment approach utilizing automated machine learning methods, based on surgical action data collected from both experienced and trainee surgeons.
arXiv Detail & Related papers (2025-02-10T12:47:36Z) - Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment [65.70317151363204]
This work introduces the first framework for reconstructing surgical dialogue from unstructured real-world recordings. In surgical training, the formative verbal feedback that trainers provide to trainees during live surgeries is crucial for ensuring safety, correcting behavior immediately, and facilitating long-term skill acquisition. Our framework integrates voice activity detection, speaker diarization, and automated speech recognition, with a novel enhancement that removes hallucinations.
arXiv Detail & Related papers (2024-12-01T10:35:12Z) - Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z) - Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z) - Video-based Formative and Summative Assessment of Surgical Tasks using Deep Learning [0.8612287536028312]
We propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution.
Formative assessment is generated using heatmaps of visual features that correlate with surgical performance.
arXiv Detail & Related papers (2022-03-17T20:07:48Z) - One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z) - A Review of Computational Approaches for Evaluation of Rehabilitation Exercises [58.720142291102135]
This paper reviews computational approaches for evaluating patient performance in rehabilitation programs using motion capture systems.
The reviewed computational methods for exercise evaluation are grouped into three main categories: discrete movement score, rule-based, and template-based approaches.
arXiv Detail & Related papers (2020-02-29T22:18:56Z) - Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, to enable human experts to analyze the validity of policy evaluation estimates.
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.