EasyVis2: A Real Time Multi-view 3D Visualization for Laparoscopic Surgery Training Enhanced by a Deep Neural Network YOLOv8-Pose
- URL: http://arxiv.org/abs/2412.16742v1
- Date: Sat, 21 Dec 2024 19:26:19 GMT
- Title: EasyVis2: A Real Time Multi-view 3D Visualization for Laparoscopic Surgery Training Enhanced by a Deep Neural Network YOLOv8-Pose
- Authors: Yung-Hong Sun, Gefei Shen, Jiangang Chen, Jayer Fernandes, Hongrui Jiang, Yu Hen Hu
- Abstract summary: EasyVis2 is a system designed for hands-free, real-time 3D visualization during laparoscopic surgery.
It incorporates a surgical trocar equipped with a set of micro-cameras, which are inserted into the body cavity to provide a 3D perspective of the surgical procedure.
A sophisticated deep neural network algorithm, YOLOv8-Pose, is tailored to estimate the position and orientation of surgical instruments in each individual camera view.
- Abstract: EasyVis2 is a system designed for hands-free, real-time 3D visualization during laparoscopic surgery. It incorporates a surgical trocar equipped with a set of micro-cameras, which are inserted into the body cavity to provide an expanded field of view and a 3D perspective of the surgical procedure. A sophisticated deep neural network algorithm, YOLOv8-Pose, is tailored to estimate the position and orientation of surgical instruments in each individual camera view. Subsequently, 3D surgical tool pose estimation is performed using associated 2D key points across multiple views. This enables the rendering of a 3D surface model of the surgical tools overlaid on the observed background scene for real-time visualization. In this study, we explain the process of developing a training dataset for new surgical tools to customize YOLOv8-Pose while minimizing labeling effort. Extensive experiments were conducted to compare EasyVis2 with the original EasyVis, revealing that, with the same number of cameras, the new system improves 3D reconstruction accuracy and reduces computation time. Additionally, experiments with 3D rendering on real animal tissue visually demonstrated the distance between surgical tools and tissues by displaying virtual side views, indicating potential applications in real surgeries in the future.
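To make the described pipeline concrete, below is a minimal sketch, not the authors' code, of the multi-view lifting step: a tool keypoint detected by YOLOv8-Pose in several calibrated views is triangulated to 3D by the standard direct linear transform (DLT). All function and variable names are illustrative assumptions.

```python
# Minimal sketch (not the EasyVis2 codebase) of the multi-view step the
# abstract describes: given 2D detections of the same tool keypoint from
# several calibrated micro-cameras, recover its 3D position via DLT.
import numpy as np

def triangulate_point(proj_mats, points_2d):
    """proj_mats: list of 3x4 camera projection matrices (K [R|t]).
    points_2d: list of matching (u, v) detections, one per camera.
    Returns the least-squares 3D point in world coordinates."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view adds two linear constraints on the homogeneous point X:
        # u * (P[2] @ X) = P[0] @ X  and  v * (P[2] @ X) = P[1] @ X
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector of A with the smallest
    # singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Hypothetical usage: one tool-tip keypoint seen by three trocar cameras.
# tip_3d = triangulate_point([P0, P1, P2], [(u0, v0), (u1, v1), (u2, v2)])
```

Each keypoint would be triangulated independently; the tool's 3D surface model can then presumably be registered to the recovered keypoints for the overlay rendering the abstract describes.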
Related papers
- MT3DNet: Multi-Task learning Network for 3D Surgical Scene Reconstruction
In image-assisted minimally invasive surgeries (MIS), understanding surgical scenes is vital for real-time feedback to surgeons.
The challenge lies in accurately detecting, segmenting, and estimating the depth of surgical scenes depicted in high-resolution images.
A novel Multi-Task Learning (MTL) network is proposed for performing these tasks concurrently.
arXiv Detail & Related papers (2024-12-05T07:07:35Z)
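As a rough illustration of the shared-backbone, multi-head pattern implied by MT3DNet's concurrent detection, segmentation, and depth tasks, consider the sketch below; the layer sizes and head designs are placeholders of ours, not the paper's actual architecture.

```python
# Illustrative sketch of a generic multi-task network: one shared encoder
# with task-specific heads for detection, segmentation, and depth.
# All dimensions are placeholders, not MT3DNet's design.
import torch.nn as nn

class MultiTaskSurgicalNet(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(  # shared feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.det_head = nn.Conv2d(64, num_classes + 4, 1)  # class scores + box offsets
        self.seg_head = nn.Conv2d(64, num_classes, 1)      # per-pixel class logits
        self.depth_head = nn.Conv2d(64, 1, 1)              # per-pixel depth regression

    def forward(self, x):
        feats = self.backbone(x)  # one forward pass feeds all three heads
        return {
            "detection": self.det_head(feats),
            "segmentation": self.seg_head(feats),
            "depth": self.depth_head(feats),
        }

# Training would typically minimize a weighted sum of the per-task losses:
# loss = w_det * det_loss + w_seg * seg_loss + w_depth * depth_loss
```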
- MedTet: An Online Motion Model for 4D Heart Reconstruction
We present a novel approach to reconstructing 3D cardiac motion from sparse intraoperative data.
Existing methods can accurately reconstruct 3D organ geometries only from full 3D volumetric imaging, which is typically unavailable during an operation.
We propose a versatile framework for reconstructing 3D motion from such partial data.
arXiv Detail & Related papers (2024-12-03T17:18:33Z)
- SLAM assisted 3D tracking system for laparoscopic surgery
This work proposes a real-time monocular 3D tracking algorithm for post-registration tasks.
Experiments from in-vivo and ex-vivo tests demonstrate that the proposed 3D tracking system provides robust 3D tracking.
arXiv Detail & Related papers (2024-09-18T04:00:54Z)
- A Review of 3D Reconstruction Techniques for Deformable Tissues in Robotic Surgery
NeRF-based techniques have recently garnered attention for their ability to reconstruct scenes implicitly.
On the other hand, 3D-GS represents scenes explicitly using 3D Gaussians and projects them onto a 2D plane as a replacement for the complex volume rendering in NeRF.
This work explores and reviews state-of-the-art (SOTA) approaches, discussing their innovations and implementation principles.
arXiv Detail & Related papers (2024-08-08T12:51:23Z)
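For context on the projection step mentioned in the review above, here is a small sketch of how a single 3D Gaussian is splatted onto the image plane using the standard linearized (EWA) approximation that 3D-GS builds on; the code and its variable names are our illustration, not any paper's implementation.

```python
# Hedged sketch of the core 3D-GS projection step: a 3D Gaussian
# (mean mu, covariance Sigma) maps to a 2D image-plane Gaussian via the
# camera transform and the Jacobian of perspective projection.
import numpy as np

def project_gaussian(mu_world, sigma_world, R, t, fx, fy):
    """Returns the 2D mean and 2x2 covariance of the splatted Gaussian."""
    mu_cam = R @ mu_world + t                   # world -> camera coordinates
    x, y, z = mu_cam
    mu_2d = np.array([fx * x / z, fy * y / z])  # perspective projection of the mean
    # Jacobian of (x, y, z) -> (fx*x/z, fy*y/z), evaluated at the mean:
    J = np.array([[fx / z, 0.0, -fx * x / z**2],
                  [0.0, fy / z, -fy * y / z**2]])
    sigma_cam = R @ sigma_world @ R.T           # rotate covariance into camera frame
    sigma_2d = J @ sigma_cam @ J.T              # linearized image-plane covariance
    return mu_2d, sigma_2d
```

Rasterizing these 2D Gaussians with alpha blending is what replaces NeRF's per-ray volume rendering.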
- Creating a Digital Twin of Spinal Surgery: A Proof of Concept
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
- Volumetric Environment Representation for Vision-Language Navigation
Vision-language navigation (VLN) requires an agent to navigate through a 3D environment based on visual observations and natural language instructions.
We introduce a Volumetric Environment Representation (VER), which voxelizes the physical world into structured 3D cells.
VER predicts 3D occupancy, 3D room layout, and 3D bounding boxes jointly.
arXiv Detail & Related papers (2024-03-21T06:14:46Z)
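As a toy illustration of the voxelization idea behind VER (the grid extents, resolution, and occupancy-only output here are arbitrary choices of this sketch, not the paper's design):

```python
# Toy sketch of voxelizing the physical world into structured 3D cells:
# scatter 3D points (e.g. from back-projected depth) into a fixed grid
# and mark which cells are occupied.
import numpy as np

def voxelize(points, grid_min, grid_max, resolution=32):
    """points: (N, 3) array of 3D positions in meters.
    Returns a (resolution, resolution, resolution) boolean occupancy grid."""
    grid = np.zeros((resolution,) * 3, dtype=bool)
    # Normalize points into [0, 1) over the grid extent, then bin into cells.
    scaled = (points - grid_min) / (grid_max - grid_min)
    idx = np.floor(scaled * resolution).astype(int)
    # Keep only points that fall inside the grid volume.
    inside = np.all((idx >= 0) & (idx < resolution), axis=1)
    ii = idx[inside]
    grid[ii[:, 0], ii[:, 1], ii[:, 2]] = True
    return grid

# occupancy = voxelize(point_cloud,
#                      grid_min=np.array([-5.0, -5.0, 0.0]),
#                      grid_max=np.array([5.0, 5.0, 3.0]))
```

VER would go further, predicting occupancy, room layout, and bounding boxes jointly from such a grid rather than filling it from raw points.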
- Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
- Self-Supervised Surgical Instrument 3D Reconstruction from a Single Camera Image
An accurate 3D surgical instrument model is a prerequisite for precise predictions of the pose and depth of the instrument.
Recent single-view 3D reconstruction methods have so far been applied only to natural object reconstruction.
We propose an end-to-end surgical instrument reconstruction system -- Self-supervised Surgical Instrument Reconstruction.
arXiv Detail & Related papers (2022-11-26T03:21:31Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Neural Rendering for Stereo 3D Reconstruction of Deformable Tissues in Robotic Surgery
Reconstruction of the soft tissues in robotic surgery from endoscopic stereo videos is important for many applications.
Previous works on this task mainly rely on SLAM-based approaches, which struggle to handle complex surgical scenes.
Inspired by recent progress in neural rendering, we present a novel framework for deformable tissue reconstruction.
arXiv Detail & Related papers (2022-06-30T13:06:27Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
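The digest does not reproduce the camera-disentangled paper's architecture; the following is a speculative sketch of the fuse-then-condition pattern its summary describes, with every layer choice a placeholder of ours.

```python
# Speculative sketch (not the paper's architecture): per-view features are
# pooled into one view-agnostic latent pose code, which is decoded into
# per-view 2D keypoint heatmaps conditioned on each camera's flattened
# 3x4 projection matrix.
import torch
import torch.nn as nn

class DisentangledPoseNet(nn.Module):
    def __init__(self, feat_dim=128, latent_dim=256, num_joints=17, heatmap_hw=64):
        super().__init__()
        self.encoder = nn.Sequential(  # per-view image encoder (placeholder)
            nn.Conv2d(3, feat_dim, 7, stride=4, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_latent = nn.Linear(feat_dim, latent_dim)
        # Decoder sees the shared latent plus a flattened projection matrix.
        self.decoder = nn.Linear(latent_dim + 12, num_joints * heatmap_hw * heatmap_hw)
        self.num_joints, self.hw = num_joints, heatmap_hw

    def forward(self, images, proj_mats):
        # images: (V, 3, H, W); proj_mats: (V, 3, 4) for V calibrated views.
        feats = self.to_latent(self.encoder(images))      # (V, latent_dim)
        latent = feats.mean(dim=0)                        # fuse views into one pose code
        cams = proj_mats.reshape(proj_mats.shape[0], 12)  # camera projection operators
        per_view = torch.cat([latent.expand(cams.shape[0], -1), cams], dim=1)
        heatmaps = self.decoder(per_view)                 # per-view 2D detections
        return heatmaps.view(-1, self.num_joints, self.hw, self.hw)
```

Averaging over views is the crudest possible fusion; the point of the sketch is only that the latent carries no view index, while the decoder re-introduces viewpoint through the projection operator.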
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.