Open-Source Multi-Viewpoint Surgical Telerobotics
- URL: http://arxiv.org/abs/2505.11142v1
- Date: Fri, 16 May 2025 11:41:27 GMT
- Title: Open-Source Multi-Viewpoint Surgical Telerobotics
- Authors: Guido Caccianiga, Yarden Sharon, Bernard Javot, Senya Polikovsky, Gökce Ergün, Ivan Capobianco, André L. Mihaljevic, Anton Deguet, Katherine J. Kuchenbecker
- Abstract summary: We conjecture that introducing one or more adjustable viewpoints in the abdominal cavity would unlock novel visualization and collaboration strategies for surgeons. We are building a synchronized multi-viewpoint, multi-sensor robotic surgery system by integrating high-performance vision components and upgrading the da Vinci Research Kit control logic.
- Score: 6.763309989418208
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As robots for minimally invasive surgery (MIS) gradually become more accessible and modular, we believe there is a great opportunity to rethink and expand the visualization and control paradigms that have characterized surgical teleoperation since its inception. We conjecture that introducing one or more additional adjustable viewpoints in the abdominal cavity would not only unlock novel visualization and collaboration strategies for surgeons but also substantially boost the robustness of machine perception toward shared autonomy. Immediate advantages include controlling a second viewpoint and teleoperating surgical tools from a different perspective, which would allow collaborating surgeons to adjust their views independently and still maneuver their robotic instruments intuitively. Furthermore, we believe that capturing synchronized multi-view 3D measurements of the patient's anatomy would unlock advanced scene representations. Accurate real-time intraoperative 3D perception will allow algorithmic assistants to directly control one or more robotic instruments and/or robotic cameras. Toward these goals, we are building a synchronized multi-viewpoint, multi-sensor robotic surgery system by integrating high-performance vision components and upgrading the da Vinci Research Kit control logic. This short paper reports a functional summary of our setup and elaborates on its potential impacts in research and future clinical practice. By fully open-sourcing our system, we will enable the research community to reproduce our setup, improve it, and develop powerful algorithms, effectively boosting clinical translation of cutting-edge research.
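To make the synchronization requirement concrete, here is a minimal Python sketch of software-level frame pairing across two free-running cameras. The camera indices, the 10 ms tolerance, and the use of OpenCV are illustrative assumptions; the actual system pairs hardware-synchronized, high-performance vision components with upgraded dVRK control logic, which this sketch does not model.

```python
# Illustrative sketch only: pair frames from two free-running cameras by
# nearest host timestamp. Device indices and the tolerance are assumptions.
import time
import cv2

TOLERANCE_S = 0.010  # accept pairs captured within 10 ms of each other (assumed)

caps = [cv2.VideoCapture(i) for i in (0, 1)]  # two abdominal viewpoints

def grab_stamped(cap):
    """Grab one frame and stamp it with the host monotonic clock."""
    ok, frame = cap.read()
    return (time.monotonic(), frame) if ok else (None, None)

try:
    while True:
        (t0, f0), (t1, f1) = grab_stamped(caps[0]), grab_stamped(caps[1])
        if f0 is None or f1 is None:
            break  # a camera stopped delivering frames
        if abs(t0 - t1) <= TOLERANCE_S:
            synchronized_pair = (f0, f1)  # hand off to multi-view 3D perception
finally:
    for cap in caps:
        cap.release()
```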
Related papers
- Multi-view Video-Pose Pretraining for Operating Room Surgical Activity Recognition [5.787586057526269]
Surgical activity recognition is a key computer vision task that detects activities or phases from multi-view camera recordings.
Existing SAR models often fail to account for fine-grained clinician movements and multi-view knowledge.
We propose a novel calibration-free multi-view multi-modal pretraining framework called PreViPS (Multiview Pretraining for Video-Pose Surgical Activity Recognition).
arXiv Detail & Related papers (2025-02-19T17:08:04Z)
- EasyVis2: A Real Time Multi-view 3D Visualization System for Laparoscopic Surgery Training Enhanced by a Deep Neural Network YOLOv8-Pose [3.8000041849498127]
EasyVis2 is a system designed to provide hands-free, real-time 3D visualization for laparoscopic surgery.
It incorporates a surgical trocar equipped with an array of micro-cameras, which can be inserted into the body cavity to offer a 3D perspective.
A specialized deep neural network algorithm, YOLOv8-Pose, is utilized to estimate the position and orientation of surgical instruments in each individual camera view.
arXiv Detail & Related papers (2024-12-21T19:26:19Z)
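A minimal sketch of the per-view keypoint step described in the EasyVis2 entry above, using the Ultralytics YOLOv8 pose API; the checkpoint name is hypothetical and would correspond to weights trained on surgical-instrument keypoints.

```python
# Sketch of per-view instrument keypoint estimation in the spirit of EasyVis2.
# "tool_pose.pt" is a hypothetical checkpoint trained on instrument keypoints.
from ultralytics import YOLO

model = YOLO("tool_pose.pt")  # assumed custom-trained YOLOv8-Pose weights

def keypoints_per_view(frames):
    """Run pose inference independently on each micro-camera view."""
    per_view = []
    for frame in frames:  # one BGR image per camera in the trocar array
        result = model(frame, verbose=False)[0]
        # result.keypoints.xy has shape (num_detections, num_keypoints, 2)
        per_view.append(result.keypoints.xy.cpu().numpy())
    return per_view
```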
- Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
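As a rough illustration of the multi-camera fusion step in the digital-twin entry above, the following Open3D sketch merges per-camera point clouds into a common world frame; the file paths, extrinsics, and voxel size are assumptions, not details from the paper.

```python
# Sketch: merge clouds from several calibrated RGB-D cameras into one frame.
# Extrinsics are assumed to come from offline calibration; paths and the
# 2 mm voxel size are illustrative.
import open3d as o3d

def fuse_views(pcd_paths, extrinsics):
    """Transform each camera's cloud by its world-from-camera 4x4 pose, merge."""
    fused = o3d.geometry.PointCloud()
    for path, T in zip(pcd_paths, extrinsics):
        cloud = o3d.io.read_point_cloud(path)
        fused += cloud.transform(T)  # transform() applies T in place
    return fused.voxel_down_sample(voxel_size=0.002)
```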
- Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [64.59698930334012]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
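For context on the evaluation in the entry above, 6DoF instrument-pose benchmarks commonly report the ADD score (average distance of transformed model points); the sketch below shows that metric, though whether this paper uses exactly this protocol is an assumption.

```python
# ADD metric: mean distance between model points under estimated and
# ground-truth poses. A standard 6DoF pose score; its use here is assumed.
import numpy as np

def add_metric(R_est, t_est, R_gt, t_gt, model_points):
    """model_points: (N, 3) instrument CAD points; R: 3x3, t: (3,)."""
    p_est = model_points @ R_est.T + t_est
    p_gt = model_points @ R_gt.T + t_gt
    return np.linalg.norm(p_est - p_gt, axis=1).mean()
```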
- Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method consists of instrument pose estimation, an online registration between the robotic and iOCT systems, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
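The online registration step in the entry above can be pictured as a least-squares rigid alignment between paired 3D points in the robot and iOCT frames; the Kabsch/SVD sketch below is a generic version of that computation, not necessarily the paper's exact pipeline.

```python
# Least-squares rigid registration (Kabsch via SVD) between paired points
# expressed in the robot and iOCT frames. Generic sketch; the paper's online
# registration may differ in detail.
import numpy as np

def rigid_register(P_robot, P_ioct):
    """Return R, t with R @ p_robot + t ~= p_ioct for (N, 3) paired points."""
    mu_r, mu_i = P_robot.mean(axis=0), P_ioct.mean(axis=0)
    H = (P_robot - mu_r).T @ (P_ioct - mu_i)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    return R, mu_i - R @ mu_r
```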
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation [49.925499720323806]
We study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks.
We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor.
arXiv Detail & Related papers (2022-12-07T18:55:53Z)
- Using Hand Pose Estimation To Automate Open Surgery Training Feedback [0.0]
This research aims to facilitate the use of state-of-the-art computer vision algorithms for the automated training of surgeons.
By estimating 2D hand poses, we model the movement of the practitioner's hands and their interaction with surgical instruments.
arXiv Detail & Related papers (2022-11-13T21:47:31Z)
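One plausible way to obtain the 2D hand poses the entry above builds on is MediaPipe Hands, sketched below; the confidence threshold and two-hand limit are illustrative choices, not the paper's stated configuration.

```python
# Sketch of 2D hand-landmark extraction with MediaPipe Hands; parameters are
# illustrative and the paper's actual pose estimator may differ.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.5)

def hand_landmarks(frame_bgr):
    """Return normalized (x, y) landmarks for each detected hand."""
    result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return []
    return [[(lm.x, lm.y) for lm in hand.landmark]
            for hand in result.multi_hand_landmarks]
```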
- Robotic Surgery Remote Mentoring via AR with 3D Scene Streaming and Hand Interaction [14.64569748299962]
We propose a novel AR-based robotic surgery remote mentoring system with efficient 3D scene visualization and natural 3D hand interaction.
Using a head-mounted display (i.e., HoloLens), the mentor can remotely monitor the procedure streamed from the trainee's operation side.
We validate the system on both real surgery stereo videos and ex-vivo scenarios of common robotic training tasks.
arXiv Detail & Related papers (2022-04-09T03:17:15Z)
- Integrating Artificial Intelligence and Augmented Reality in Robotic Surgery: An Initial dVRK Study Using a Surgical Education Scenario [15.863254207155835]
We develop a novel robotic surgery education system by integrating an artificial intelligence surgical module with augmented reality visualization.
The proposed system is evaluated through a preliminary experiment on the peg-transfer surgical education task.
arXiv Detail & Related papers (2022-01-02T17:34:10Z)
- Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Scene Graph (MSSG), which aims at providing a unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z)
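To show what a symbolic scene-graph representation like the one in the entry above looks like in code, here is a toy sketch with networkx; the entities and relations are invented for illustration and are not the MSSG schema.

```python
# Toy symbolic surgical scene graph in the spirit of MSSG; nodes, edges, and
# attribute names are invented for illustration, not the paper's schema.
import networkx as nx

G = nx.DiGraph()
G.add_node("surgeon_right_hand", modality="video")
G.add_node("needle_driver", modality="video+kinematics")
G.add_node("tissue_flap", modality="depth")
G.add_edge("surgeon_right_hand", "needle_driver", relation="holds")
G.add_edge("needle_driver", "tissue_flap", relation="approaches")

# Query everything the right hand currently acts on:
acts_on = [(v, d["relation"])
           for _, v, d in G.out_edges("surgeon_right_hand", data=True)]
```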
- Mapping Surgeon's Hand/Finger Motion During Conventional Microsurgery to Enhance Intuitive Surgical Robot Teleoperation [0.5635300481123077]
Current human-robot interfaces lack intuitive teleoperation and cannot mimic a surgeon's hand/finger sensing and fine motion.
We report a pilot study showing an intuitive way of recording and mapping a surgeon's gross hand motion and the fine synergic motion during cardiac micro-surgery.
arXiv Detail & Related papers (2021-02-21T11:21:30Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal graph network approach (MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
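As a generic picture of the visual-kinematics fusion in the entry above, the PyTorch sketch below concatenates the two embeddings before a small classification head; the dimensions and layers are assumptions, and this is not the MRG-Net architecture itself.

```python
# Minimal late-fusion head over visual and kinematics embeddings; a generic
# sketch with assumed dimensions, not the MRG-Net architecture itself.
import torch
import torch.nn as nn

class VisKinFusion(nn.Module):
    def __init__(self, d_vis=512, d_kin=64, n_gestures=10):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(d_vis + d_kin, 256), nn.ReLU(),
            nn.Linear(256, n_gestures),
        )

    def forward(self, vis_emb, kin_emb):
        # vis_emb: (B, d_vis) frame features; kin_emb: (B, d_kin) joint states
        return self.head(torch.cat([vis_emb, kin_emb], dim=-1))

logits = VisKinFusion()(torch.randn(2, 512), torch.randn(2, 64))
```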