Related papers: A Unified XAI-LLM Approach for Endotracheal Suctioning Activity Recognition

URL: http://arxiv.org/abs/2601.21802v1
Date: Thu, 29 Jan 2026 14:46:48 GMT
Title: A Unified XAI-LLM Approach for Endotracheal Suctioning Activity Recognition
Authors: Hoang Khang Phan, Quang Vinh Dang, Noriyo Colley, Christina Garcia, Nhat Tan Le,
Abstract summary: This study proposes a unified framework for video-based activity recognition benchmarked against conventional machine learning and deep learning approaches. Within this framework, the Large Language Model (LLM) serves as the central reasoning module, performing both spatiotemporal activity recognition and explainable decision analysis from video data. Experimental results demonstrate that the proposed LLM-based approach outperforms baseline models, achieving an improvement of approximately 15-20% in both accuracy and F1 score.
Score: 0.1794226570005898
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Endotracheal suctioning (ES) is an invasive yet essential clinical procedure that requires a high degree of skill to minimize patient risk - particularly in home care and educational settings, where consistent supervision may be limited. Despite its critical importance, automated recognition and feedback systems for ES training remain underexplored. To address this gap, this study proposes a unified, LLM-centered framework for video-based activity recognition benchmarked against conventional machine learning and deep learning approaches, and a pilot study on feedback generation. Within this framework, the Large Language Model (LLM) serves as the central reasoning module, performing both spatiotemporal activity recognition and explainable decision analysis from video data. Furthermore, the LLM is capable of verbalizing feedback in natural language, thereby translating complex technical insights into accessible, human-understandable guidance for trainees. Experimental results demonstrate that the proposed LLM-based approach outperforms baseline models, achieving an improvement of approximately 15-20% in both accuracy and F1 score. Beyond recognition, the framework incorporates a pilot student-support module built upon anomaly detection and explainable AI (XAI) principles, which provides automated, interpretable feedback highlighting correct actions and suggesting targeted improvements. Collectively, these contributions establish a scalable, interpretable, and data-driven foundation for advancing nursing education, enhancing training efficiency, and ultimately improving patient safety.

Related papers

A Closed-Loop Personalized Learning Agent Integrating Neural Cognitive Diagnosis, Bounded-Ability Adaptive Testing, and LLM-Driven Feedback [5.190121417265426]
This paper presents an end-to-end personalized learning agent, which integrates a Neural Cognitive Diagnosis model (NCD), a Bounded-Ability Computerized Adaptive Testing strategy (BECAT), and large language models (LLMs). Experiments on the ASSISTments dataset show that the NCD module achieves strong performance on response prediction while yielding interpretable mastery assessments. Overall, the results indicate that the proposed design is effective and practically deployable.
arXiv Detail & Related papers (2025-10-26T07:32:31Z)
Are Large Language Models Dynamic Treatment Planners? An In Silico Study from a Prior Knowledge Injection Angle [3.0391297540732545]
We evaluate large language models (LLMs) as dynamic insulin dosing agents in an in silico Type 1 diabetes simulator. Our results indicate that carefully designed zero-shot prompts enable smaller LLMs to achieve comparable or superior clinical performance. LLMs exhibit notable limitations, such as overly aggressive insulin dosing when prompted with chain-of-thought.
arXiv Detail & Related papers (2025-08-06T13:46:02Z)
Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
GEMeX-RMCoT: An Enhanced Med-VQA Dataset for Region-Aware Multimodal Chain-of-Thought Reasoning [60.03671205298294]
Medical visual question answering aims to support clinical decision-making by enabling models to answer natural language questions based on medical images. Current methods still suffer from limited answer reliability and poor interpretability. This work first proposes a Region-Aware Multimodal Chain-of-Thought dataset, in which the process of producing an answer is preceded by a sequence of intermediate reasoning steps.
arXiv Detail & Related papers (2025-06-22T08:09:58Z)
Adversarial Prompt Distillation for Vision-Language Models [61.39214202062028]
Adversarial Prompt Tuning (APT) applies adversarial training during the process of prompt tuning. APD is a bimodal knowledge distillation framework that enhances APT by integrating it with multi-modal knowledge transfer. Extensive experiments on multiple benchmark datasets demonstrate the superiority of our APD method over the current state-of-the-art APT methods.
arXiv Detail & Related papers (2024-11-22T03:02:13Z)
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset [92.99416966226724]
We introduce Facial Identity Unlearning Benchmark (FIUBench), a novel VLM unlearning benchmark designed to robustly evaluate the effectiveness of unlearning algorithms. We apply a two-stage evaluation pipeline that is designed to precisely control the sources of information and their exposure levels. Through the evaluation of four baseline VLM unlearning algorithms within FIUBench, we find that all methods remain limited in their unlearning performance.
arXiv Detail & Related papers (2024-11-05T23:26:10Z)
PALLM: Evaluating and Enhancing PALLiative Care Conversations with Large Language Models [10.258261180305439]
Large language models (LLMs) offer a new approach to assessing complex communication metrics. LLMs offer the potential to advance the field through integration into passive sensing and just-in-time intervention systems. This study explores LLMs as evaluators of palliative care communication quality, leveraging their linguistic, in-context learning, and reasoning capabilities.
arXiv Detail & Related papers (2024-09-23T16:39:12Z)
Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator [21.60103376506254]
Large Language Models (LLMs) have demonstrated remarkable proficiency in human interactions. This paper introduces the Automated Interactive Evaluation (AIE) framework and the State-Aware Patient Simulator (SAPS). AIE and SAPS provide a dynamic, realistic platform for assessing LLMs through multi-turn doctor-patient simulations.
arXiv Detail & Related papers (2024-03-13T13:04:58Z)
Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models [29.05425041393475]
Generative Large Language Models (LLMs) hold significant promise in healthcare. This study assessed the potential of LLMs to function as autonomous agents in a simulated tertiary care medical center.
arXiv Detail & Related papers (2024-01-05T15:09:57Z)
Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning. They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health. Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing [53.096237570992294]
Strategy training is a rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke. Standardized fidelity assessment is used to measure adherence to treatment principles. We developed a rule-based NLP algorithm, a long short-term memory (LSTM) model, and a bidirectional encoder representations from transformers (BERT) model for this task.
arXiv Detail & Related papers (2022-09-14T15:33:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.