Localising under the drape: proprioception in the era of distributed surgical robotic system
- URL: http://arxiv.org/abs/2510.23512v1
- Date: Mon, 27 Oct 2025 16:50:12 GMT
- Title: Localising under the drape: proprioception in the era of distributed surgical robotic system
- Authors: Martin Huber, Nicola A. Cavalcanti, Ayoob Davoodi, Ruixuan Li, Christopher E. Mower, Fabio Carrillo, Christoph J. Laux, Francois Teyssere, Thibault Chandanson, Antoine Harlé, Elie Saghbiny, Mazda Farshad, Guillaume Morel, Emmanuel Vander Poorten, Philipp Fürnstahl, Sébastien Ourselin, Christos Bergeles, Tom Vercauteren,
- Abstract summary: We present a marker-free proprioception method that enables precise localisation of surgical robots under their sterile draping. Our method relies on lightweight stereo-RGB cameras and novel transformer-based deep learning models. It builds on the largest multi-centre spatial robotic surgery dataset to date.
- Score: 12.001086860486906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their mechanical sophistication, surgical robots remain blind to their surroundings. This lack of spatial awareness causes collisions, system recoveries, and workflow disruptions, issues that will intensify with the introduction of distributed robots with independent interacting arms. Existing tracking systems rely on bulky infrared cameras and reflective markers, providing only limited views of the surgical scene and adding hardware burden in crowded operating rooms. We present a marker-free proprioception method that enables precise localisation of surgical robots under their sterile draping despite the associated obstruction of visual cues. Our method relies solely on lightweight stereo-RGB cameras and novel transformer-based deep learning models. It builds on the largest multi-centre spatial robotic surgery dataset to date (1.4M self-annotated images from human cadaveric and preclinical in vivo studies). By tracking the entire robot and surgical scene, rather than individual markers, our approach provides a holistic view robust to occlusions, supporting surgical scene understanding and context-aware control. We demonstrate an example of potential clinical benefits during in vivo breathing compensation with access to tissue dynamics that are unobservable under state-of-the-art tracking, and accurately localise robots in multi-robot systems for future intelligent interaction. In addition, compared with existing systems, our method eliminates markers and improves tracking visibility by 25%. To our knowledge, this is the first demonstration of marker-free proprioception for fully draped surgical robots, reducing setup complexity, enhancing safety, and paving the way toward modular and autonomous robotic surgery.
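The paper itself does not include code, but the pipeline it describes (a stereo-RGB pair fed to a transformer that regresses the draped robot's pose) can be sketched compactly. The following PyTorch sketch is purely illustrative: module names, patch size, depth, and the 9-dimensional pose output (translation plus a 6D rotation representation) are assumptions, not the authors' architecture.

```python
# Minimal sketch (not the authors' code): a transformer that regresses a
# 6-DoF robot pose from a stereo-RGB pair. Architecture details
# (patch size, depth, pose parametrisation) are illustrative assumptions.
import torch
import torch.nn as nn


class StereoPoseTransformer(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Shared patch embedding applied to the left and right images.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        # Learned embeddings distinguish the two camera views.
        self.view_embed = nn.Parameter(torch.zeros(2, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches, dim))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # Pose head: 3 translation components + a 6D rotation representation.
        self.pose_head = nn.Linear(dim, 9)

    def embed(self, img, view):
        x = self.patch_embed(img).flatten(2).transpose(1, 2)  # (B, N, dim)
        return x + self.pos_embed + self.view_embed[view]

    def forward(self, left, right):
        tokens = torch.cat([self.embed(left, 0), self.embed(right, 1)], dim=1)
        cls = self.cls_token.expand(left.shape[0], -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.pose_head(out[:, 0])  # one pose estimate per stereo pair


if __name__ == "__main__":
    model = StereoPoseTransformer()
    left = torch.randn(2, 3, 224, 224)
    right = torch.randn(2, 3, 224, 224)
    print(model(left, right).shape)  # torch.Size([2, 9])
```

In the paper's setting, a model along these lines would be trained on the 1.4M self-annotated images mentioned in the abstract; the sketch above only fixes the input/output shapes of such a regressor.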
Related papers
- Give me scissors: Collision-Free Dual-Arm Surgical Assistive Robot for Instrument Delivery [4.187693745535412]
During surgery, scrub nurses are required to frequently deliver surgical instruments to surgeons. Existing research on robotic scrub nurses relies on predefined paths for instrument delivery. We present a collision-free dual-arm surgical assistive robot capable of performing instrument delivery.
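As a toy illustration of the kind of self-collision test a collision-free dual-arm planner must run (not this paper's method), the sketch below approximates each arm by spheres placed along its links and flags any overlap; all geometry and radii are made up for the example.

```python
# Illustrative sketch only (not the paper's planner): a coarse self-collision
# check between two arms, each approximated by spheres placed along its links.
# A motion planner would reject any candidate configuration failing this test.
import numpy as np


def link_spheres(joint_positions, radius, samples_per_link=5):
    """Place spheres of a given radius along each link of an arm.

    joint_positions: (N, 3) array of consecutive joint positions expressed
    in a common world frame (an assumption made for this sketch).
    """
    centres = []
    for a, b in zip(joint_positions[:-1], joint_positions[1:]):
        for t in np.linspace(0.0, 1.0, samples_per_link):
            centres.append((1 - t) * a + t * b)
    return np.asarray(centres), radius


def arms_collide(arm_a, arm_b, clearance=0.01):
    """Return True if any sphere of arm A intersects any sphere of arm B."""
    (ca, ra), (cb, rb) = arm_a, arm_b
    dists = np.linalg.norm(ca[:, None, :] - cb[None, :, :], axis=-1)
    return bool(np.any(dists < ra + rb + clearance))


if __name__ == "__main__":
    # Two toy 3-joint arms; units are metres (hypothetical values).
    arm_a = link_spheres(np.array([[0.0, 0, 0], [0.3, 0, 0.2], [0.5, 0.1, 0.3]]), 0.04)
    arm_b = link_spheres(np.array([[0.6, 0, 0], [0.45, 0.05, 0.25], [0.3, 0.1, 0.35]]), 0.04)
    print("collision:", arms_collide(arm_a, arm_b))
```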
arXiv Detail & Related papers (2026-03-03T03:14:59Z)
- RGA-Net: A Vision Enhancement Framework for Robotic Surgical Systems Using Reciprocal Attention Mechanisms [25.435178288442597]
RGA-Net is a novel deep learning framework specifically designed for smoke removal in robotic surgery. Our approach addresses the challenges of surgical smoke, including dense, non-homogeneous distribution and complex light scattering, through a hierarchical encoder-decoder architecture.
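As a rough, hedged sketch of the general setup (image-to-image desmoking with a hierarchical encoder-decoder and global attention at the bottleneck), the toy model below is not RGA-Net and does not implement its reciprocal attention; a standard self-attention block stands in purely for illustration.

```python
# Hedged sketch, not RGA-Net: a toy hierarchical encoder-decoder for
# image-to-image smoke removal, with self-attention at the bottleneck
# standing in for the paper's reciprocal attention mechanism.
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))


class ToyDesmokeNet(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.enc1 = conv_block(3, dim)
        self.enc2 = conv_block(dim, dim * 2)
        self.down = nn.MaxPool2d(2)
        self.attn = nn.MultiheadAttention(dim * 2, num_heads=4, batch_first=True)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = conv_block(dim * 2 + dim, dim)
        self.out = nn.Conv2d(dim, 3, 1)

    def forward(self, x):
        s1 = self.enc1(x)                      # full-resolution features
        b = self.enc2(self.down(s1))           # bottleneck features
        bsz, c, h, w = b.shape
        seq = b.flatten(2).transpose(1, 2)     # (B, H*W, C) tokens
        seq, _ = self.attn(seq, seq, seq)      # global context at the bottleneck
        b = seq.transpose(1, 2).reshape(bsz, c, h, w)
        d = self.dec1(torch.cat([self.up(b), s1], dim=1))
        return torch.sigmoid(self.out(d))      # predicted smoke-free image


if __name__ == "__main__":
    print(ToyDesmokeNet()(torch.randn(1, 3, 128, 128)).shape)  # (1, 3, 128, 128)
```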
arXiv Detail & Related papers (2026-02-14T11:31:54Z)
- RobotSeg: A Model and Dataset for Segmenting Robots in Image and Video [56.9581053843815]
We introduce RobotSeg, a foundation model for robot segmentation in image and video. It addresses the lack of adaptation to articulated robots, reliance on manual prompts, and the need for per-frame training mask annotations. It achieves state-of-the-art performance on both images and videos.
arXiv Detail & Related papers (2025-11-28T07:51:02Z)
- Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers [2.736848514829367]
Human vision is a highly active process driven by gaze, which directs attention to task-relevant regions through foveation. In this work, we explore how incorporating human-like active gaze into robotic policies can enhance efficiency and robustness. We develop GIAVA, a robot vision system that emulates human head and neck movement, and gaze adjustment for foveated processing.
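Foveated preprocessing itself is easy to illustrate. The sketch below (not GIAVA's implementation; crop and output sizes are arbitrary assumptions) splits a frame into a full-resolution crop around the gaze point and a low-resolution periphery, the two streams a foveated vision transformer would then tokenise.

```python
# Illustration only: two-stream foveated preprocessing. A gaze point selects a
# full-resolution crop (the "fovea"), while the whole frame is heavily
# downsampled (the "periphery"), so a downstream vision transformer spends
# most of its tokens on the task-relevant region.
import torch
import torch.nn.functional as F


def foveate(frame, gaze_xy, fovea=96, periphery=64):
    """frame: (3, H, W) tensor, gaze_xy: (x, y) pixel coordinates."""
    _, h, w = frame.shape
    x, y = gaze_xy
    # Clamp the crop so it stays inside the image.
    x0 = int(min(max(x - fovea // 2, 0), w - fovea))
    y0 = int(min(max(y - fovea // 2, 0), h - fovea))
    fovea_crop = frame[:, y0:y0 + fovea, x0:x0 + fovea]
    periphery_view = F.interpolate(
        frame.unsqueeze(0), size=(periphery, periphery),
        mode="bilinear", align_corners=False
    ).squeeze(0)
    return fovea_crop, periphery_view


if __name__ == "__main__":
    frame = torch.rand(3, 480, 640)
    fov, per = foveate(frame, gaze_xy=(320, 200))
    print(fov.shape, per.shape)  # (3, 96, 96) (3, 64, 64)
```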
arXiv Detail & Related papers (2025-07-21T17:44:10Z)
- Open-Source Multi-Viewpoint Surgical Telerobotics [6.763309989418208]
We conjecture that introducing one or more adjustable viewpoints in the abdominal cavity would unlock novel visualization and collaboration strategies for surgeons. We are building a synchronized multi-viewpoint, multi-sensor robotic surgery system by integrating high-performance vision components and upgrading the da Vinci Research Kit control logic.
arXiv Detail & Related papers (2025-05-16T11:41:27Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
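A minimal sketch of masked sensorimotor pre-training in this spirit is given below; the tokenisation, dimensions, masking ratio, and reconstruction loss are assumptions for illustration rather than RPT's actual recipe.

```python
# Rough sketch of masked sensorimotor pre-training (not the RPT authors' code):
# interleaved vision/proprioception/action tokens are randomly masked and a
# transformer is trained to reconstruct the masked positions.
import torch
import torch.nn as nn


class MaskedSensorimotorModel(nn.Module):
    def __init__(self, token_dim=64, dim=128, depth=4, heads=4):
        super().__init__()
        self.proj_in = nn.Linear(token_dim, dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.proj_out = nn.Linear(dim, token_dim)

    def forward(self, tokens, mask_ratio=0.5):
        # tokens: (B, T, token_dim) pre-extracted sensorimotor features.
        x = self.proj_in(tokens)
        mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio  # (B, T)
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        recon = self.proj_out(self.encoder(x))
        # Reconstruction loss only on the masked positions.
        loss = ((recon - tokens) ** 2)[mask].mean()
        return loss


if __name__ == "__main__":
    model = MaskedSensorimotorModel()
    print(model(torch.randn(8, 32, 64)).item())
```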
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method consists of an instrument pose estimation module, an online registration between the robotic and the iOCT systems, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
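One concrete step such a pipeline needs is the rigid registration between the robot and iOCT coordinate frames. The sketch below shows the standard SVD-based (Kabsch/Umeyama) point-set alignment as an illustration; it is not the paper's registration method, and the test data are synthetic.

```python
# Illustrative sketch of a rigid-registration step (not the paper's code):
# estimate the rotation R and translation t aligning points measured in the
# robot frame with the same points observed in the iOCT frame, using the
# SVD-based Kabsch/Umeyama method.
import numpy as np


def rigid_registration(p_robot, p_oct):
    """p_robot, p_oct: (N, 3) corresponding points. Returns R (3x3), t (3,)."""
    mu_r, mu_o = p_robot.mean(axis=0), p_oct.mean(axis=0)
    H = (p_robot - mu_r).T @ (p_oct - mu_o)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard keeps R a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_o - R @ mu_r
    return R, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(10, 3))
    # Known ground-truth transform for a quick self-check.
    angle = 0.3
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                       [np.sin(angle),  np.cos(angle), 0],
                       [0, 0, 1]])
    t_true = np.array([0.01, -0.02, 0.05])
    R, t = rigid_registration(pts, pts @ R_true.T + t_true)
    print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```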
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation [49.925499720323806]
We study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks.
We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor.
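A common baseline for this kind of multimodal setup is late fusion: encode each modality separately and concatenate the embeddings before a small policy head. The sketch below illustrates only that pattern; the encoders, feature sizes, and 7-dimensional action are assumptions, not the paper's model.

```python
# Minimal late-fusion sketch (not the paper's model): per-modality encoders
# produce fixed-size embeddings for vision, audio, and touch, which are
# concatenated and mapped to an action by a small MLP.
import torch
import torch.nn as nn


class LateFusionPolicy(nn.Module):
    def __init__(self, dim=128, action_dim=7):
        super().__init__()
        self.vision = nn.Sequential(nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                    nn.Linear(16, dim))
        self.audio = nn.Sequential(nn.Linear(256, dim), nn.ReLU())   # spectrogram features
        self.touch = nn.Sequential(nn.Linear(64, dim), nn.ReLU())    # tactile features
        self.head = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(),
                                  nn.Linear(dim, action_dim))

    def forward(self, image, audio_feat, tactile_feat):
        z = torch.cat([self.vision(image), self.audio(audio_feat),
                       self.touch(tactile_feat)], dim=-1)
        return self.head(z)


if __name__ == "__main__":
    policy = LateFusionPolicy()
    action = policy(torch.randn(2, 3, 128, 128), torch.randn(2, 256), torch.randn(2, 64))
    print(action.shape)  # torch.Size([2, 7])
```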
arXiv Detail & Related papers (2022-12-07T18:55:53Z)
- Cognitive architecture aided by working-memory for self-supervised multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and have been demonstrated to be suitable tools to address such a task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
arXiv Detail & Related papers (2021-03-16T13:50:24Z)
- Using Conditional Generative Adversarial Networks to Reduce the Effects of Latency in Robotic Telesurgery [0.0]
In surgery, any micro-delay can injure a patient severely and, in some cases, result in fatality.
Current surgical robots use calibrated sensors to measure the position of the arms and tools.
In this work we present a purely optical approach that provides a measurement of the tool position in relation to the patient's tissues.
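The geometry underlying such a purely optical measurement is stereo triangulation. The sketch below recovers a tool-tip's 3D position from its pixel coordinates in two calibrated views via linear DLT; the camera intrinsics and baseline are hypothetical, and this is not the paper's (CGAN-based) method.

```python
# Sketch of the basic geometry behind a purely optical position measurement
# (illustration, not the paper's method): triangulate a tool-tip's 3D position
# from its pixel coordinates in two calibrated cameras using linear DLT.
import numpy as np


def triangulate(P1, P2, uv1, uv2):
    """P1, P2: 3x4 camera projection matrices; uv1, uv2: (u, v) pixel coords."""
    u1, v1 = uv1
    u2, v2 = uv2
    A = np.stack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean


if __name__ == "__main__":
    # Two hypothetical cameras: identity pose and a 10 cm baseline along x.
    K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0], [0]])])
    X_true = np.array([0.02, -0.01, 0.5])
    project = lambda P, X: (P @ np.append(X, 1.0))[:2] / (P @ np.append(X, 1.0))[2]
    print(triangulate(P1, P2, project(P1, X_true), project(P2, X_true)))  # ~X_true
```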
arXiv Detail & Related papers (2020-10-07T13:40:44Z)
- Morphology-Agnostic Visual Robotic Control [76.44045983428701]
MAVRIC is an approach that works with minimal prior knowledge of the robot's morphology.
We demonstrate our method on visually-guided 3D point reaching, trajectory following, and robot-to-robot imitation.
arXiv Detail & Related papers (2019-12-31T15:45:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.