AI-driven Automation of End-to-end Assessment of Suturing Expertise
- URL: http://arxiv.org/abs/2503.17391v1
- Date: Mon, 17 Mar 2025 21:28:02 GMT
- Title: AI-driven Automation of End-to-end Assessment of Suturing Expertise
- Authors: Atharva Deo, Nicholas Matsumoto, Sun Kim, Peter Wager, Randy G. Tsai, Aaron Denmark, Cherine Yang, Xi Li, Jay Moran, Miguel Hernandez, Andrew J. Hung
- Abstract summary: We present an AI-based approach to automate the End-to-end Assessment of Suturing Expertise (EASE). EASE provides granular skills assessment related to suturing to provide trainees with an objective evaluation of their aptitude along with actionable insights. The AI-based approach solves this by enabling real-time score prediction with minimal resources during model inference.
- Score: 6.4885743283287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an AI-based approach to automate the End-to-end Assessment of Suturing Expertise (EASE), a suturing skills assessment tool that comprehensively defines criteria around relevant sub-skills. While EASE provides granular skills assessment related to suturing to provide trainees with an objective evaluation of their aptitude along with actionable insights, the scoring process is currently performed by human evaluators, which is time- and resource-consuming. The AI-based approach solves this by enabling real-time score prediction with minimal resources during model inference. This opens the possibility of real-time feedback to surgeons and trainees, potentially accelerating the learning process for the suturing task, mitigating critical errors during surgery, and improving patient outcomes. In this study, we focus on the following 7 EASE domains that fall under 3 suturing phases: 1) Needle Handling: Number of Repositions, Needle Hold Depth, Needle Hold Ratio, and Needle Hold Angle; 2) Needle Driving: Driving Smoothness and Wrist Rotation; 3) Needle Withdrawal: Wrist Rotation.
Related papers
- Explainable AI for Collaborative Assessment of 2D/3D Registration Quality [50.65650507103078]
We propose the first artificial intelligence framework trained specifically for 2D/3D registration quality verification. Our explainable AI (XAI) approach aims to enhance informed decision-making for human operators.
arXiv Detail & Related papers (2025-07-23T15:28:57Z) - AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation [55.2739790399209]
We present AutoMedEval, an open-sourced automatic evaluation model with 13B parameters specifically engineered to measure the question-answering proficiency of medical LLMs. The overarching objective of AutoMedEval is to assess the quality of responses produced by diverse models, aspiring to significantly reduce the dependence on human evaluation.
arXiv Detail & Related papers (2025-05-17T07:44:54Z) - End-to-End Deep Learning for Real-Time Neuroimaging-Based Assessment of Bimanual Motor Skills [1.710146779965826]
This study presents a novel end-to-end deep learning framework that processes raw fNIRS signals directly.
It achieved a mean classification accuracy of 93.9% (SD 4.4) and a generalization accuracy of 92.6% (SD 1.9) on unseen skill retention datasets.
arXiv Detail & Related papers (2025-03-21T22:56:54Z) - Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases, annotated with reasoning references. We propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking.
arXiv Detail & Related papers (2025-03-06T18:35:39Z) - PanguIR Technical Report for NTCIR-18 AEOLLM Task [12.061652026366591]
Large language models (LLMs) are increasingly critical and challenging to evaluate. Manual evaluation, while comprehensive, is often costly and resource-intensive. Automatic evaluation offers greater scalability but is constrained by the limitations of its evaluation criteria.
arXiv Detail & Related papers (2025-03-04T07:40:02Z) - Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z) - Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
arXiv Detail & Related papers (2023-06-13T18:02:57Z) - A study on the impact of Self-Supervised Learning on automatic dysarthric speech assessment [6.284142286798582]
We show that HuBERT is the most versatile feature extractor across dysarthria classification, word recognition, and intelligibility classification, achieving +24.7%, +61%, and +7.2% accuracy, respectively, compared to classical acoustic features.
arXiv Detail & Related papers (2023-06-07T11:04:02Z) - Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches [5.132618393976799]
We aim to design an upper-limb assessment pipeline for stroke patients using smartwatches.
Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale.
arXiv Detail & Related papers (2022-12-09T14:00:49Z) - Video-based Formative and Summative Assessment of Surgical Tasks using Deep Learning [0.8612287536028312]
We propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution.
Formative assessment is generated using heatmaps of visual features that correlate with surgical performance.
arXiv Detail & Related papers (2022-03-17T20:07:48Z) - A Deep Learning Approach to Predicting Collateral Flow in Stroke Patients Using Radiomic Features from Perfusion Images [58.17507437526425]
Collateral circulation results from specialized anastomotic channels which provide oxygenated blood to regions with compromised blood flow.
The actual grading is mostly done through manual inspection of the acquired images.
We present a deep learning approach to predicting collateral flow grading in stroke patients based on radiomic features extracted from MR perfusion data.
arXiv Detail & Related papers (2021-10-24T18:58:40Z) - Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - Opportunities of a Machine Learning-based Decision Support System for Stroke Rehabilitation Assessment [64.52563354823711]
Rehabilitation assessment is critical to determine an adequate intervention for a patient.
Current assessment practices rely mainly on the therapist's experience, and assessment is performed infrequently because of limited therapist availability.
We developed an intelligent decision support system that can identify salient features of assessment using reinforcement learning.
arXiv Detail & Related papers (2020-02-27T17:04:07Z)