Robotic Surgery Remote Mentoring via AR with 3D Scene Streaming and Hand Interaction
- URL: http://arxiv.org/abs/2204.04377v2
- Date: Thu, 29 Feb 2024 08:42:47 GMT
- Title: Robotic Surgery Remote Mentoring via AR with 3D Scene Streaming and Hand Interaction
- Authors: Yonghao Long, Chengkun Li, and Qi Dou
- Abstract summary: We propose a novel AR-based robotic surgery remote mentoring system with efficient 3D scene visualization and natural 3D hand interaction.
Using a head-mounted display (i.e., HoloLens), the mentor can remotely monitor the procedure streamed from the trainee's operation side.
We validate the system on both real surgery stereo videos and ex-vivo scenarios of common robotic training tasks.
- Score: 14.64569748299962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growing popularity of robotic surgery, surgical education is becoming increasingly important and urgently needed for the sake of patient safety. However, experienced surgeons have limited availability, owing to busy clinical schedules or working in distant cities, and thus can hardly provide sufficient educational resources for novices. Remote mentoring can effectively address this problem, but traditional methods are limited to plain text, audio, or 2D video, which are neither intuitive nor vivid. Augmented reality (AR), a thriving technique already widely used in education, promises new possibilities for visual experience and interactive teaching. In this paper, we propose a novel AR-based robotic surgery remote mentoring system with efficient 3D scene visualization and natural 3D hand interaction. Using a head-mounted display (i.e., HoloLens), the mentor can remotely monitor the procedure streamed from the trainee's operation side. The mentor can also provide feedback directly with hand gestures, which are in turn transmitted to the trainee and displayed in the surgical console as guidance. We comprehensively validate the system on both real surgery stereo videos and ex-vivo scenarios of common robotic training tasks (i.e., peg transfer and suturing). Promising results are demonstrated regarding the fidelity of streamed scene visualization, the accuracy of hand-interaction feedback, and the low latency of each component in the entire remote mentoring system. This work showcases the feasibility of leveraging AR technology for a reliable, flexible, and low-cost solution to robotic surgical education, and holds great potential for clinical application.
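
The abstract outlines a bidirectional loop: stereo frames stream from the trainee's operation side to the mentor's HoloLens, while the mentor's hand-gesture annotations travel back and are rendered in the surgical console. Below is a minimal sketch of that loop from the trainee side, assuming a plain TCP transport with length-prefixed framing and a JSON gesture schema; the host name, port, message format, and helper names are illustrative assumptions, not the authors' actual protocol.

```python
# Minimal trainee-side sketch of the streaming/feedback loop described in
# the abstract. Transport, endpoint, and message schema are assumptions for
# illustration; the paper does not specify these implementation details.
import json
import socket
import struct

import cv2  # OpenCV, used here only to JPEG-compress frames

MENTOR_HOST, MENTOR_PORT = "mentor.example.org", 9000  # hypothetical endpoint


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Length-prefix each message so it survives TCP stream boundaries."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("mentor side closed the connection")
        buf += chunk
    return buf


def send_stereo_pair(sock: socket.socket, left, right) -> None:
    """Compress and ship one stereo pair (left/right BGR images)."""
    for eye, frame in (("L", left), ("R", right)):
        ok, jpg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 80])
        if ok:
            send_msg(sock, eye.encode("ascii") + jpg.tobytes())


def recv_gesture(sock: socket.socket) -> dict:
    """Receive one mentor gesture annotation to overlay in the surgical
    console, e.g. 3D hand-joint positions: {"joints": [[x, y, z], ...]}."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))
```

In a real deployment the two directions would run on separate channels or threads so gesture feedback never blocks frame streaming; the per-component latency results reported in the paper suggest such decoupling, which this blocking sketch does not capture.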
Related papers
- Open-TeleVision: Teleoperation with Immersive Active Visual Feedback [17.505318269362512]
Open-TeleVision allows operators to actively perceive the robot's surroundings in a stereoscopic manner.
The system mirrors the operator's arm and hand movements on the robot, creating an immersive experience.
We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks.
arXiv Detail & Related papers (2024-07-01T17:55:35Z)
- Mixed Reality Communication for Medical Procedures: Teaching the Placement of a Central Venous Catheter [5.0939439129897535]
We present a mixed reality real-time communication system to increase access to procedural skill training and to improve remote emergency assistance.
RGBD cameras capture a volumetric view of the local scene including the patient, the operator, and the medical equipment.
The volumetric capture is augmented onto the remote expert's view to allow the expert to spatially guide the local operator using visual and verbal instructions.
arXiv Detail & Related papers (2023-12-14T03:11:20Z)
- Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with their environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z)
- Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures [51.78027546947034]
Recent advancements in surgical computer vision have been driven by vision-only models, which lack language semantics.
We propose leveraging surgical video lectures from e-learning platforms to provide effective vision and language supervisory signals.
We address surgery-specific linguistic challenges using multiple automatic speech recognition systems for text transcriptions.
arXiv Detail & Related papers (2023-07-27T22:38:12Z)
- Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- Video-based Surgical Skills Assessment using Long term Tool Tracking [0.3324986723090368]
We introduce a motion-based approach to automatically assess surgical skills from a surgical case video feed.
The proposed pipeline first tracks surgical tools reliably to create motion trajectories.
We compare transformer-based skill assessment with traditional machine learning approaches using the proposed and state-of-the-art tracking.
arXiv Detail & Related papers (2022-07-05T18:15:28Z)
- DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video [86.49357517864937]
We propose DexVIP, an approach to learn dexterous robotic grasping from human-object interaction videos.
We do this by curating grasp images from human-object interaction videos and imposing a prior over the agent's hand pose.
We demonstrate that DexVIP compares favorably to existing approaches that lack a hand pose prior or rely on specialized tele-operation equipment.
arXiv Detail & Related papers (2022-02-01T00:45:57Z)
- Integrating Artificial Intelligence and Augmented Reality in Robotic Surgery: An Initial dVRK Study Using a Surgical Education Scenario [15.863254207155835]
We develop a novel robotic surgery education system by integrating an artificial intelligence surgical module with augmented reality visualization.
The proposed system is evaluated through a preliminary experiment on the surgical education task of peg transfer.
arXiv Detail & Related papers (2022-01-02T17:34:10Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.