An AI Framework for Microanastomosis Motion Assessment
- URL: http://arxiv.org/abs/2601.21120v1
- Date: Wed, 28 Jan 2026 23:23:37 GMT
- Title: An AI Framework for Microanastomosis Motion Assessment
- Authors: Yan Meng, Eduardo J. Torres-Rodríguez, Marcelle Altshuler, Nishanth Gowda, Arhum Naeem, Recai Yilmaz, Omar Arnaout, Daniel A. Donoho
- Abstract summary: We propose a novel AI framework for the automated assessment of microanastomosis instrument handling skills. The system integrates four core components: (1) an instrument detection module based on the You Only Look Once (YOLO) architecture; (2) an instrument tracking module developed from Deep Simple Online and Realtime Tracking (DeepSORT); (3) an instrument tip localization module employing shape descriptors; and (4) a supervised classification module trained on expert-labeled data to evaluate instrument handling proficiency. Experimental results demonstrate the effectiveness of the framework, achieving an instrument detection precision of 97%, with a mean Average Precision (mAP) of 96%, measured over Intersection over Union (IoU) thresholds ranging from 50% to 95% (mAP50-95).
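The tip localization step can be illustrated with a minimal, hypothetical shape-descriptor heuristic. The paper's actual descriptors are not reproduced here; this sketch simply takes the mask pixel farthest from the instrument's centroid as the tip, which is one plausible stand-in:

```python
import numpy as np

def localize_tip(mask):
    """Hypothetical tip localizer: given a binary instrument mask,
    return the (row, col) of the pixel farthest from the mask centroid.
    Illustrative only; the paper's shape descriptors are not specified here."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    dist_sq = (ys - cy) ** 2 + (xs - cx) ** 2
    i = int(np.argmax(dist_sq))
    return int(ys[i]), int(xs[i])
```

For an elongated instrument mask, the farthest-from-centroid point falls at one of the two ends; a real system would still need to disambiguate the tip end from the handle end.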
- Score: 3.9524886416531753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proficiency in microanastomosis is a fundamental competency across multiple microsurgical disciplines. These procedures demand exceptional precision and refined technical skills, making effective, standardized assessment methods essential. Traditionally, the evaluation of microsurgical techniques has relied heavily on the subjective judgment of expert raters. Such assessments are inherently constrained by limitations such as inter-rater variability, lack of standardized evaluation criteria, susceptibility to cognitive bias, and the time-intensive nature of manual review. These shortcomings underscore the urgent need for an objective, reliable, and automated system capable of assessing microsurgical performance with consistency and scalability. To bridge this gap, we propose a novel AI framework for the automated assessment of microanastomosis instrument handling skills. The system integrates four core components: (1) an instrument detection module based on the You Only Look Once (YOLO) architecture; (2) an instrument tracking module developed from Deep Simple Online and Realtime Tracking (DeepSORT); (3) an instrument tip localization module employing shape descriptors; and (4) a supervised classification module trained on expert-labeled data to evaluate instrument handling proficiency. Experimental results demonstrate the effectiveness of the framework, achieving an instrument detection precision of 97%, with a mean Average Precision (mAP) of 96%, measured over Intersection over Union (IoU) thresholds ranging from 50% to 95% (mAP50-95).
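The mAP50-95 figure quoted above follows a standard convention: a detection counts as a true positive at threshold t only if its IoU with a ground-truth box is at least t, and the final score averages AP over ten thresholds from 0.50 to 0.95 in steps of 0.05. A minimal sketch (helper names are illustrative, not from the paper):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union for two boxes in (x1, y1, x2, y2) format."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# The ten IoU thresholds used by the mAP50-95 convention.
THRESHOLDS = np.arange(0.50, 1.00, 0.05)

def map50_95(per_threshold_ap):
    """Average the AP values computed at each of the ten IoU thresholds."""
    assert len(per_threshold_ap) == len(THRESHOLDS)
    return float(np.mean(per_threshold_ap))
```

Computing the per-threshold AP values themselves requires ranking detections by confidence and integrating the precision-recall curve, which is omitted here for brevity.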
Related papers
- Automated Machine Learning in Radiomics: A Comparative Evaluation of Performance, Efficiency and Accessibility [1.0631763329108546]
This study evaluates the performance, efficiency, and accessibility of general-purpose and radiomics-specific AutoML frameworks. Ten public/private radiomics datasets with varied imaging modalities (CT/MRI), sizes, anatomies, and endpoints were used. Simplatab provides an effective balance of performance, efficiency, and accessibility for radiomics classification problems.
arXiv Detail & Related papers (2026-01-13T08:47:44Z) - AI-Driven Evaluation of Surgical Skill via Action Recognition [4.92174988745803]
We propose an AI-driven framework for the automated assessment of microanastomosis performance. Performance is evaluated along five aspects of microanastomosis skill, including overall action execution, motion quality during procedure-critical actions, and general instrument handling.
arXiv Detail & Related papers (2025-12-30T18:45:34Z) - Kinematic-Based Assessment of Surgical Actions in Microanastomosis [4.92174988745803]
We introduce an AI-driven framework for automated action segmentation and performance assessment in microanastomosis procedures. A dataset of 58 expert-rated microanastomosis videos demonstrates the effectiveness of our approach.
arXiv Detail & Related papers (2025-12-30T02:18:49Z) - Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposes a systematic error analysis method, and emphasizes the importance of cross-dataset evaluation. An open-source toolkit has been released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z) - Microsurgical Instrument Segmentation for Robot-Assisted Surgery [3.880707330499936]
We propose a segmentation framework that augments RGB input with luminance channels, integrates skip attention to preserve elongated features, and employs an Iterative Feedback Module (IFM) for continuity restoration. Experiments demonstrate that MISRA achieves competitive performance, improving the mean class IoU by 5.37% over competing methods. These results position MISRA as a promising step toward reliable scene parsing for computer-assisted and robotic microsurgery.
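The luminance augmentation described above can be sketched as a simple channel stack. The exact luminance formula used by MISRA is not given in this summary, so the Rec. 601 weights below are an assumption:

```python
import numpy as np

def add_luminance_channel(rgb):
    """Append a luminance channel to an H x W x 3 RGB image, producing an
    H x W x 4 network input. Uses Rec. 601 luma weights as an assumed choice."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    return np.concatenate([rgb, luma[..., None]], axis=-1)
```

A model consuming this input simply needs its first convolution widened from 3 to 4 input channels; the rest of the architecture is unchanged.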
arXiv Detail & Related papers (2025-09-15T09:29:27Z) - Quantitative Outcome-Oriented Assessment of Microsurgical Anastomosis [7.432334662327386]
We introduce a quantitative framework that uses image-processing techniques for objective assessment of microsurgical anastomoses. The approach uses geometric modeling of errors along with a detection and scoring mechanism. The results show that the geometric metrics effectively replicate expert raters' scoring for the errors considered in this work.
arXiv Detail & Related papers (2025-08-26T09:14:31Z) - Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases annotated with reasoning references. We propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking.
arXiv Detail & Related papers (2025-03-06T18:35:39Z) - Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment.
To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing [62.9062883851246]
Machine learning holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities.
One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data.
Using multi-task learning, we propose the first method to assess and mitigate shortcut learning as a part of the fairness assessment of clinical ML systems.
arXiv Detail & Related papers (2022-07-21T09:35:38Z) - Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA only requires a handful of pre-collected experience data, and therefore does not involve human interaction with the target policy during the evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
arXiv Detail & Related papers (2021-02-20T03:29:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.