E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with
Transformer-based Stereoscopic Depth Perception
- URL: http://arxiv.org/abs/2107.00229v1
- Date: Thu, 1 Jul 2021 05:57:41 GMT
- Title: E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with
Transformer-based Stereoscopic Depth Perception
- Authors: Yonghao Long, Zhaoshuo Li, Chi Hang Yee, Chi Fai Ng, Russell H.
Taylor, Mathias Unberath, Qi Dou
- Abstract summary: We present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps.
Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation.
We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video dataset and our in-house DaVinci robotic surgery dataset.
- Score: 15.927060244702686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing the scene of robotic surgery from stereo endoscopic video
is an important and promising topic in surgical data science, potentially
supporting applications such as surgical visual perception, robotic surgery
education, and intra-operative context awareness. However, current methods are
mostly restricted to reconstructing static anatomy, assuming no tissue
deformation, no tool occlusion or de-occlusion, and no camera movement; these
assumptions are not always satisfied in minimally invasive robotic surgery. In
this work, we present an efficient reconstruction pipeline for
highly dynamic surgical scenes that runs at 28 fps. Specifically, we design a
transformer-based stereoscopic depth perception for efficient depth estimation
and a light-weight tool segmentor to handle tool occlusion. A dynamic
reconstruction algorithm then estimates tissue deformation and camera movement
and aggregates this information over time to reconstruct the surgical scene. We
evaluate the proposed pipeline on two
datasets, the public Hamlyn Centre Endoscopic Video Dataset and our in-house
DaVinci robotic surgery dataset. The results demonstrate that our method can
recover the scene regions obstructed by the surgical tool and handle camera
movement effectively in realistic surgical scenarios at real-time speed.
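For intuition, the per-frame flow described above (transformer-based stereo depth, light-weight tool segmentation, and temporal aggregation of the tool-free tissue points) can be sketched roughly as follows. This is a minimal, illustrative PyTorch sketch under stated assumptions: the module architectures, pinhole intrinsics, and stereo baseline are placeholders, and the pose/deformation estimation of the actual pipeline is only indicated by a comment; none of this is the authors' implementation.

```python
# Minimal sketch of an E-DSSR-style per-frame loop (illustrative assumptions,
# not the authors' code): a transformer-refined stereo depth stub, a
# light-weight tool segmentor stub, and masked back-projection of tissue
# points that are naively accumulated over frames.
import torch
import torch.nn as nn


class StereoDepthStub(nn.Module):
    """Stand-in for transformer-based stereo depth: per-view conv features,
    joint self-attention over left/right tokens, then a disparity head."""

    def __init__(self, dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Conv2d(dim, 1, 1)

    def forward(self, left, right):
        fl, fr = self.backbone(left), self.backbone(right)
        b, c, h, w = fl.shape
        tl = fl.flatten(2).transpose(1, 2)                 # (B, HW, C) left tokens
        tr = fr.flatten(2).transpose(1, 2)                 # (B, HW, C) right tokens
        tokens = self.encoder(torch.cat([tl, tr], dim=1))  # cross-view attention
        fl_ref = tokens[:, : h * w].transpose(1, 2).reshape(b, c, h, w)
        return self.head(fl_ref).relu() + 1e-3             # positive disparity map


class ToolSegmentorStub(nn.Module):
    """Light-weight binary tissue/tool segmentor (placeholder layers)."""

    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, 1, 1),
        )

    def forward(self, img):
        return torch.sigmoid(self.net(img))                # ~1 = tissue, ~0 = tool


def backproject(depth, fx=500.0, fy=500.0):
    """Lift a depth map to camera-space 3D points with pinhole intrinsics."""
    b, _, h, w = depth.shape
    v, u = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    z = depth[:, 0]
    x = (u - w / 2) * z / fx
    y = (v - h / 2) * z / fy
    return torch.stack([x, y, z], dim=-1).reshape(b, -1, 3)  # (B, HW, 3)


depth_net, seg_net = StereoDepthStub(), ToolSegmentorStub()
canonical_points = []                                      # naive temporal aggregation
for _ in range(3):                                         # stand-in for streaming frames
    left, right = torch.rand(1, 3, 64, 80), torch.rand(1, 3, 64, 80)
    disparity = depth_net(left, right)
    depth = (500.0 * 0.005) / disparity                    # depth = focal * baseline / disparity
    tissue_mask = (seg_net(left) > 0.5).reshape(1, -1)     # drop tool pixels (occlusion handling)
    pts = backproject(depth)[tissue_mask]                  # (N, 3) visible tissue points
    # A full pipeline would estimate camera motion and tissue deformation here
    # and warp `pts` into a canonical frame before fusing; we simply accumulate.
    canonical_points.append(pts)
print(torch.cat(canonical_points).shape)
```

The key design point the sketch tries to convey is that depth, segmentation, and fusion are decoupled per-frame stages, which is what makes a real-time (28 fps) budget plausible.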
Related papers
- SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction [17.126895638077574]
Dynamic reconstruction of deformable tissues in endoscopic video is a key technology for robot-assisted surgery.
NeRFs struggle to capture intricate details of objects in the scene.
Our network outperforms existing methods in many respects, including rendering quality, rendering speed, and GPU usage.
arXiv Detail & Related papers (2024-07-06T09:31:30Z)
- SimEndoGS: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians [19.590481146949685]
We introduce 3D Gaussians as a learnable representation of the surgical scene, learned from stereo endoscopic video.
We apply the Material Point Method, which is integrated with physical properties, to the 3D Gaussians to achieve realistic scene deformations.
Results show that it can reconstruct and simulate surgical scenes from endoscopic videos efficiently, taking only a few minutes to reconstruct the surgical scene.
arXiv Detail & Related papers (2024-05-02T02:34:19Z)
- Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
- FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training.
We address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue.
We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch.
This improves ease of use and allows reconstruction to scale in time to surgical videos of 5,000 frames and more, an improvement of more than ten times over the state of the art, while remaining agnostic to external tracking information.
arXiv Detail & Related papers (2024-03-18T19:13:02Z)
- Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane [58.871015937204255]
We introduce Fast Orthogonal Plane (Forplane) for the reconstruction of deformable tissues.
We conceptualize surgical procedures as 4D volumes, and break them down into static and dynamic fields comprised of neural planes.
This factorization discretizes four-dimensional space, leading to decreased memory usage and faster optimization.
arXiv Detail & Related papers (2023-12-23T13:27:50Z)
- BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction using Neural Radiance Fields [5.773068487121897]
Reconstruction of deformable scenes from endoscopic videos is important for many applications.
Our work adopts the Neural Radiance Fields (NeRF) approach to learning 3D implicit representations of scenes.
We demonstrate this approach on endoscopic surgical scenes from robotic surgery.
arXiv Detail & Related papers (2023-09-27T00:20:36Z)
- SAMSNeRF: Segment Anything Model (SAM) Guides Dynamic Surgical Scene Reconstruction by Neural Radiance Field (NeRF) [4.740415113160021]
We propose a novel approach called SAMSNeRF that combines Segment Anything Model (SAM) and Neural Radiance Field (NeRF) techniques.
Our experimental results on public endoscopy surgical videos demonstrate that our approach successfully reconstructs high-fidelity dynamic surgical scenes.
arXiv Detail & Related papers (2023-08-22T20:31:00Z)
- Neural LerPlane Representations for Fast 4D Reconstruction of Deformable Tissues [52.886545681833596]
LerPlane is a novel method for fast and accurate reconstruction of surgical scenes under a single-viewpoint setting.
LerPlane treats surgical procedures as 4D volumes and factorizes them into explicit 2D planes of static and dynamic fields.
LerPlane shares static fields, significantly reducing the workload of dynamic tissue modeling.
arXiv Detail & Related papers (2023-05-31T14:38:35Z)
- Live image-based neurosurgical guidance and roadmap generation using unsupervised embedding [53.992124594124896]
We present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos.
A generated roadmap encodes the common anatomical paths taken in surgeries in the training set.
We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.
arXiv Detail & Related papers (2023-03-31T12:52:24Z)
- Neural Rendering for Stereo 3D Reconstruction of Deformable Tissues in Robotic Surgery [18.150476919815382]
Reconstruction of the soft tissues in robotic surgery from endoscopic stereo videos is important for many applications.
Previous works on this task mainly rely on SLAM-based approaches, which struggle to handle complex surgical scenes.
Inspired by recent progress in neural rendering, we present a novel framework for deformable tissue reconstruction.
arXiv Detail & Related papers (2022-06-30T13:06:27Z)
- Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images.
We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z)
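As a rough illustration of the kind of light-weight residual design that real-time instrument segmentors (like the one in the last entry, and the tool segmentor in E-DSSR) tend to rely on, the sketch below shows a depthwise-separable residual block. The specific layer choices are my own assumption for illustration, not the searched architecture from the paper.

```python
# Illustrative depthwise-separable residual block for light-weight, real-time
# segmentation backbones (an assumed example, not the paper's searched design).
import torch
import torch.nn as nn


class LightResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            # depthwise 3x3: spatial filtering with one filter per channel
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            # pointwise 1x1: cheap cross-channel mixing
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # residual connection keeps optimization stable at low parameter counts
        return torch.relu(x + self.body(x))


block = LightResidualBlock(32)
print(block(torch.rand(1, 32, 256, 320)).shape)  # torch.Size([1, 32, 256, 320])
```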
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.