SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot
- URL: http://arxiv.org/abs/2412.05187v1
- Date: Fri, 06 Dec 2024 17:07:27 GMT
- Title: SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot
- Authors: Jinlin Wu, Xusheng Liang, Xuexue Bai, Zhen Chen,
- Abstract summary: SurgBox is an agent-driven sandbox framework that enhances the cognitive capabilities of surgeons in immersive surgical simulations.
In particular, we devise Surgery Copilot, an AI-driven assistant to actively coordinate the surgical information stream and support clinical decision-making.
- Score: 3.487327636814225
- Abstract: Surgical interventions, particularly in neurology, represent complex and high-stakes scenarios that impose substantial cognitive burdens on surgical teams. Although deliberate education and practice can enhance cognitive capabilities, surgical training opportunities remain limited due to patient safety concerns. To address these cognitive challenges in surgical training and operation, we propose SurgBox, an agent-driven sandbox framework to systematically enhance the cognitive capabilities of surgeons in immersive surgical simulations. Specifically, our SurgBox leverages large language models (LLMs) with tailored Retrieval-Augmented Generation (RAG) to authentically replicate various surgical roles, enabling realistic training environments for deliberate practice. In particular, we devise Surgery Copilot, an AI-driven assistant to actively coordinate the surgical information stream and support clinical decision-making, thereby diminishing the cognitive workload of surgical teams during surgery. By incorporating a novel Long-Short Memory mechanism, our Surgery Copilot can effectively balance immediate procedural assistance with comprehensive surgical knowledge. Extensive experiments using real neurosurgical procedure records validate our SurgBox framework in both enhancing surgical cognitive capabilities and supporting clinical decision-making. By providing an integrated solution for training and operational support to address cognitive challenges, our SurgBox framework advances surgical education and practice, potentially transforming surgical outcomes and healthcare quality. The code is available at https://github.com/franciszchen/SurgBox.
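The abstract describes the Long-Short Memory mechanism only at a high level. As a loose illustration (not the authors' implementation), the sketch below pairs a bounded buffer of recent procedural events (short-term) with retrieval over a surgical knowledge base (long-term); naive lexical overlap stands in for a real RAG retriever, and all names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class LongShortMemory:
    """Illustrative long-short memory for a surgery copilot: a bounded
    buffer of recent procedure events plus retrieval over a knowledge base."""
    knowledge_base: list          # long-term: curated surgical notes (strings)
    short_term: deque = field(default_factory=lambda: deque(maxlen=8))

    def observe(self, event: str) -> None:
        self.short_term.append(event)  # old events fall out automatically

    def retrieve(self, query: str, k: int = 3) -> list:
        # Naive lexical overlap standing in for a real RAG retriever.
        q = set(query.lower().split())
        ranked = sorted(self.knowledge_base,
                        key=lambda doc: len(q & set(doc.lower().split())),
                        reverse=True)
        return ranked[:k]

    def build_prompt(self, query: str) -> str:
        # Immediate procedural context + retrieved long-term knowledge.
        recent = "\n".join(self.short_term)
        facts = "\n".join(self.retrieve(query))
        return (f"Recent steps:\n{recent}\n\n"
                f"Retrieved knowledge:\n{facts}\n\nQuestion: {query}")
```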
Related papers
- EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery [52.992415247012296]
We introduce EndoChat to address various dialogue paradigms and subtasks in surgical scene understanding.
Our model achieves state-of-the-art performance across five dialogue paradigms and eight surgical scene understanding tasks.
arXiv Detail & Related papers (2025-01-20T09:12:06Z)
- SurgIRL: Towards Life-Long Learning for Surgical Automation by Incremental Reinforcement Learning [21.35087120482373]
We train surgical automation policies through Surgical Incremental Reinforcement Learning (SurgIRL).
SurgIRL aims to (1) acquire new skills by referring to external policies (knowledge) and (2) accumulate and reuse these skills to solve multiple unseen tasks incrementally (incremental learning).
arXiv Detail & Related papers (2024-09-24T01:27:46Z)
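As a loose illustration of the two ingredients named above (consulting external policies and accumulating skills for reuse), one might keep a library of learned policies keyed by task tags and warm-start each new task from the closest stored skill. This is a sketch of the general idea, not the SurgIRL algorithm; all names are hypothetical.

```python
class SkillLibrary:
    """Illustrative skill library for incremental learning: policies are
    accumulated as they are learned and reused on later, unseen tasks."""

    def __init__(self):
        self.skills = {}  # task signature (frozenset of tags) -> policy fn

    def add(self, tags, policy):
        # Accumulate new skills without discarding earlier ones.
        self.skills[frozenset(tags)] = policy

    def propose(self, tags):
        # Reuse the stored policy whose task tags overlap the new task most.
        if not self.skills:
            return None
        best = max(self.skills, key=lambda s: len(s & set(tags)))
        return self.skills[best]

# Sketch of the incremental loop: acquire a skill (e.g. from an external
# expert policy), store it, then reuse it to warm-start a related task.
library = SkillLibrary()
library.add({"grasp", "needle"}, lambda obs: "close_gripper")
warm_start = library.propose({"grasp", "gauze"})  # closest stored skill
```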
- VS-Assistant: Versatile Surgery Assistant on the Demand of Surgeons [29.783300422432763]
We propose a Versatile Surgery Assistant (VS-Assistant) that can accurately understand the surgeon's intention.
We devise a Surgical-Calling Tuning strategy to enable the VS-Assistant to understand surgical intentions.
arXiv Detail & Related papers (2024-05-14T02:05:36Z)
- OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery [13.843251369739908]
We introduce an innovative Open-Set Surgical Activity Recognition (OSSAR) framework.
Our solution leverages the hyperspherical reciprocal point strategy to enhance the distinction between known and unknown classes in the feature space.
To support our assertions, we establish an open-set surgical activity benchmark utilizing the public JIGSAWS dataset.
arXiv Detail & Related papers (2024-02-10T16:23:12Z)
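The entry above names a hyperspherical reciprocal point strategy without further detail, so the following is a simplified sketch of the general reciprocal-point idea under assumptions, not the OSSAR formulation: each known class learns a reciprocal point on the unit hypersphere, class logits grow with distance from that point, and samples that stay close to every reciprocal point are rejected as unknown. The threshold and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def open_set_predict(features, reciprocal_points, threshold=1.2):
    """Simplified open-set scoring with hyperspherical reciprocal points.

    features:          (B, D) embeddings from some backbone (assumed given)
    reciprocal_points: (K, D) one learned point per known class
    """
    f = F.normalize(features, dim=-1)           # project onto the unit hypersphere
    p = F.normalize(reciprocal_points, dim=-1)
    dist = torch.cdist(f, p)                    # (B, K) distances on the sphere
    # A sample is assigned the class whose reciprocal point it is FARTHEST from.
    pred = dist.argmax(dim=-1)
    # Unknowns tend to sit close to all reciprocal points, so even their
    # largest distance stays small; below the threshold we reject.
    is_unknown = dist.max(dim=-1).values < threshold
    return pred, is_unknown
```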
- Deep Multimodal Fusion for Surgical Feedback Classification [70.53297887843802]
We leverage a clinically-validated five-category classification of surgical feedback.
We then develop a multi-label machine learning model to classify these five categories of surgical feedback from inputs of text, audio, and video modalities.
The ultimate goal of our work is to help automate the annotation of real-time contextual surgical feedback at scale.
arXiv Detail & Related papers (2023-12-06T01:59:47Z)
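The entry specifies a multi-label model over text, audio, and video but does not describe the fusion architecture here, so below is a minimal late-fusion sketch under assumed embedding dimensions, with one independent output per feedback category (multi-label, hence sigmoid/BCE rather than softmax). All dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class LateFusionFeedback(nn.Module):
    """Illustrative late-fusion multi-label classifier: one encoder per
    modality, concatenated embeddings, five independent logits (feedback
    categories are not mutually exclusive). Dimensions are assumptions."""
    def __init__(self, text_dim=768, audio_dim=128, video_dim=512, n_labels=5):
        super().__init__()
        self.text = nn.Linear(text_dim, 256)
        self.audio = nn.Linear(audio_dim, 256)
        self.video = nn.Linear(video_dim, 256)
        self.head = nn.Linear(3 * 256, n_labels)

    def forward(self, t, a, v):
        z = torch.cat([self.text(t).relu(),
                       self.audio(a).relu(),
                       self.video(v).relu()], dim=-1)
        return self.head(z)   # train with nn.BCEWithLogitsLoss (multi-label)
```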
- Toward a Surgeon-in-the-Loop Ophthalmic Robotic Apprentice using Reinforcement and Imitation Learning [18.72371138886818]
We propose an image-guided approach for surgeon-centered autonomous agents during ophthalmic cataract surgery.
By integrating the surgeon's actions and preferences into the training process, our approach enables the robot to implicitly learn and adapt to the individual surgeon's unique techniques.
arXiv Detail & Related papers (2023-11-29T15:00:06Z)
- GLSFormer: Gated-Long, Short Sequence Transformer for Step Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z)
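The entry says temporal features are learned directly from sequence-level patches. As a rough sketch (omitting the paper's gating mechanism and positional embeddings), per-frame patch tokens from a short clip can be concatenated along the sequence axis so a standard transformer encoder attends jointly over space and time. Hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class ClipPatchTransformer(nn.Module):
    """Illustrative sketch: per-frame patch tokens from a short clip are
    concatenated along the sequence axis so a transformer encoder attends
    jointly over space and time (gating/positional embeddings omitted)."""
    def __init__(self, patch=16, dim=256, n_classes=10):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, clip):                   # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        x = self.embed(clip.flatten(0, 1))     # (B*T, dim, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)       # (B*T, N, dim) patch tokens
        x = x.reshape(b, t * x.shape[1], -1)   # (B, T*N, dim): space-time tokens
        return self.head(self.encoder(x).mean(dim=1))

# e.g. logits = ClipPatchTransformer()(torch.randn(2, 8, 3, 64, 64))
```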
- Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot [54.80144694888735]
We introduce Demonstration-guided EXploration (DEX), an efficient reinforcement learning algorithm.
Our method assigns higher value estimates to expert-like behaviors to facilitate productive interactions.
Experiments on 10 surgical manipulation tasks from SurRoL, a comprehensive surgical simulation platform, demonstrate significant improvements.
arXiv Detail & Related papers (2023-02-20T05:38:54Z)
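The entry says DEX steers exploration by valuing expert-like behaviors more highly. The sketch below is a crude stand-in for that idea, not the published algorithm: shape candidate action values with a bonus that decays with distance to the nearest demonstrated action. The function name and exponential kernel are illustrative.

```python
import numpy as np

def demo_guided_values(q_values, candidate_actions, demo_actions, beta=0.5):
    """Crude stand-in for demonstration-guided value shaping: add a bonus
    that decays with the distance to the nearest demonstrated action, so
    expert-like candidates receive higher values during exploration."""
    demo = np.asarray(demo_actions)                      # (N, action_dim)
    cands = np.asarray(candidate_actions)                # (M, action_dim)
    bonus = np.array([np.exp(-np.linalg.norm(demo - a, axis=1).min())
                      for a in cands])
    return np.asarray(q_values) + beta * bonus

# e.g. demo_guided_values([0.1, 0.2], [[0., 1.], [1., 0.]], [[0., 0.9]])
```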
- Quantification of Robotic Surgeries with Vision-Based Deep Learning [45.165919577877695]
We propose a unified deep learning framework, entitled Roboformer, which operates exclusively on videos recorded during surgery.
We validated our framework on four video-based datasets of two commonly-encountered types of steps within minimally-invasive robotic surgeries.
arXiv Detail & Related papers (2022-05-06T06:08:35Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.