Explainable AI for Automated User-specific Feedback in Surgical Skill Acquisition
- URL: http://arxiv.org/abs/2508.02593v1
- Date: Mon, 04 Aug 2025 16:48:44 GMT
- Title: Explainable AI for Automated User-specific Feedback in Surgical Skill Acquisition
- Authors: Catalina Gomez, Lalithkumar Seenivasan, Xinrui Zou, Jeewoo Yoon, Sirui Chu, Ariel Leong, Patrick Kramer, Yu-Chun Ku, Jose L. Porras, Alejandro Martin-Gomez, Masaru Ishii, Mathias Unberath,
- Abstract summary: We examine the effectiveness of explainable AI (XAI)-generated feedback in surgical training through a human-AI study.<n>We compare the impact of XAI-guided feedback against traditional video-based coaching on task outcomes, cognitive load, and trainees' perceptions of AI-assisted learning.
- Score: 38.38538970682482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional surgical skill acquisition relies heavily on expert feedback, yet direct access is limited by faculty availability and variability in subjective assessments. While trainees can practice independently, the lack of personalized, objective, and quantitative feedback reduces the effectiveness of self-directed learning. Recent advances in computer vision and machine learning have enabled automated surgical skill assessment, demonstrating the feasibility of automatic competency evaluation. However, it is unclear whether such Artificial Intelligence (AI)-driven feedback can contribute to skill acquisition. Here, we examine the effectiveness of explainable AI (XAI)-generated feedback in surgical training through a human-AI study. We create a simulation-based training framework that utilizes XAI to analyze videos and extract surgical skill proxies related to primitive actions. Our intervention provides automated, user-specific feedback by comparing trainee performance to expert benchmarks and highlighting deviations from optimal execution through understandable proxies for actionable guidance. In a prospective user study with medical students, we compare the impact of XAI-guided feedback against traditional video-based coaching on task outcomes, cognitive load, and trainees' perceptions of AI-assisted learning. Results showed improved cognitive load and confidence post-intervention. While no differences emerged between the two feedback types in reducing performance gaps or practice adjustments, trends in the XAI group revealed desirable effects where participants more closely mimicked expert practice. This work encourages the study of explainable AI in surgical education and the development of data-driven, adaptive feedback mechanisms that could transform learning experiences and competency assessment.
Related papers
- Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z) - How Performance Pressure Influences AI-Assisted Decision Making [57.53469908423318]
We show how pressure and explainable AI (XAI) techniques interact with AI advice-taking behavior.<n>Our results show complex interaction effects, with different combinations of pressure and XAI techniques either improving or worsening AI advice taking behavior.
arXiv Detail & Related papers (2024-10-21T22:39:52Z) - Study on the Helpfulness of Explainable Artificial Intelligence [0.0]
Legal, business, and ethical requirements motivate using effective XAI.
We propose to evaluate XAI methods via the user's ability to successfully perform a proxy task.
In other words, we address the helpfulness of XAI for human decision-making.
arXiv Detail & Related papers (2024-10-14T14:03:52Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Knowing About Knowing: An Illusion of Human Competence Can Hinder
Appropriate Reliance on AI Systems [13.484359389266864]
This paper addresses whether the Dunning-Kruger Effect (DKE) can hinder appropriate reliance on AI systems.
DKE is a metacognitive bias due to which less-competent individuals overestimate their own skill and performance.
We found that participants who overestimate their performance tend to exhibit under-reliance on AI systems.
arXiv Detail & Related papers (2023-01-25T14:26:10Z) - Towards Human Cognition Level-based Experiment Design for Counterfactual
Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have turned to a more pragmatic explanation approach for better understanding.
An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback.
We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z) - Advancing Human-AI Complementarity: The Impact of User Expertise and
Algorithmic Tuning on Joint Decision Making [10.890854857970488]
Many factors can impact success of Human-AI teams, including a user's domain expertise, mental models of an AI system, trust in recommendations, and more.
Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled.
Our results show that while recommendations from an AI-Assistant can aid user decision making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly impact overall team performance.
arXiv Detail & Related papers (2022-08-16T21:39:58Z) - Enabling AI and Robotic Coaches for Physical Rehabilitation Therapy:
Iterative Design and Evaluation with Therapists and Post-Stroke Survivors [66.07833535962762]
Artificial intelligence (AI) and robotic coaches promise the improved engagement of patients on rehabilitation exercises through social interaction.
Previous work explored the potential of automatically monitoring exercises for AI and robotic coaches, but deployment remains a challenge.
We present our efforts on eliciting the detailed design specifications on how AI and robotic coaches could interact with and guide patient's exercises.
arXiv Detail & Related papers (2021-06-15T22:06:39Z) - Proxy Tasks and Subjective Measures Can Be Misleading in Evaluating
Explainable AI Systems [14.940404609343432]
We evaluate two currently common techniques for evaluating XAI systems.
We show that evaluations with proxy tasks did not predict the results of the evaluations with the actual decision-making tasks.
Our results suggest that by employing misleading evaluation methods, our field may be inadvertently slowing its progress toward developing human+AI teams that can reliably perform better than humans or AIs alone.
arXiv Detail & Related papers (2020-01-22T22:14:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.