SAMSNeRF: Segment Anything Model (SAM) Guides Dynamic Surgical Scene Reconstruction by Neural Radiance Field (NeRF)
- URL: http://arxiv.org/abs/2308.11774v2
- Date: Tue, 6 Feb 2024 04:29:34 GMT
- Title: SAMSNeRF: Segment Anything Model (SAM) Guides Dynamic Surgical Scene Reconstruction by Neural Radiance Field (NeRF)
- Authors: Ange Lou, Yamin Li, Xing Yao, Yike Zhang and Jack Noble
- Abstract summary: We propose a novel approach called SAMSNeRF that combines Segment Anything Model (SAM) and Neural Radiance Field (NeRF) techniques.
Our experimental results on public endoscopy surgical videos demonstrate that our approach successfully reconstructs high-fidelity dynamic surgical scenes.
- Score: 4.740415113160021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The accurate reconstruction of surgical scenes from surgical videos is
critical for various applications, including intraoperative navigation and
image-guided robotic surgery automation. However, previous approaches, mainly
relying on depth estimation, have limited effectiveness in reconstructing
surgical scenes with moving surgical tools. To address this limitation and
provide accurate 3D position prediction for surgical tools in all frames, we
propose a novel approach called SAMSNeRF that combines Segment Anything Model
(SAM) and Neural Radiance Field (NeRF) techniques. Our approach generates
accurate segmentation masks of surgical tools using SAM, which guides the
refinement of the dynamic surgical scene reconstruction by NeRF. Our
experimental results on public endoscopy surgical videos demonstrate that our
approach successfully reconstructs high-fidelity dynamic surgical scenes and
accurately reflects the spatial information of surgical tools. Our proposed
approach can significantly enhance surgical navigation and automation by
providing surgeons with accurate 3D position information of surgical tools
during surgery. The source code will be released soon.
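Since the source code has not been released yet, the sketch below is only a guess at the kind of mask-guided supervision the abstract describes: SAM-derived tool masks re-weighting a NeRF photometric loss so that the dynamic reconstruction is pushed to fit the moving instruments. The function names, tensor shapes, and the weighting scheme are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: SAM-derived tool masks re-weighting a NeRF photometric
# loss. Names, shapes, and the weighting scheme are assumptions, not SAMSNeRF's
# released code.
import torch

def masked_nerf_loss(rendered_rgb: torch.Tensor,  # (N, 3) colors from volume rendering
                     target_rgb: torch.Tensor,    # (N, 3) ground-truth pixel colors
                     tool_mask: torch.Tensor,     # (N,) 1.0 where SAM marks a surgical tool
                     tool_weight: float = 2.0) -> torch.Tensor:
    """Photometric MSE with extra weight on SAM-segmented tool pixels."""
    per_ray = ((rendered_rgb - target_rgb) ** 2).mean(dim=-1)  # (N,) per-ray error
    weights = 1.0 + (tool_weight - 1.0) * tool_mask            # tool rays count more
    return (weights * per_ray).mean()

# Toy usage: 1024 random rays, roughly a quarter of them on a tool.
pred = torch.rand(1024, 3, requires_grad=True)
gt = torch.rand(1024, 3)
mask = (torch.rand(1024) < 0.25).float()
loss = masked_nerf_loss(pred, gt, mask)
loss.backward()  # gradients would flow back into the radiance field parameters
```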
Related papers
- SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction [17.126895638077574]
Dynamic reconstruction of deformable tissues in endoscopic video is a key technology for robot-assisted surgery.
NeRFs struggle to capture intricate details of objects in the scene.
Our network outperforms existing methods in many aspects, including rendering quality, rendering speed, and GPU usage.
arXiv Detail & Related papers (2024-07-06T09:31:30Z)
- Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
- BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction using Neural Radiance Fields [5.773068487121897]
Reconstruction of deformable scenes from endoscopic videos is important for many applications.
Our work adopts the Neural Radiance Fields (NeRF) approach to learning 3D implicit representations of scenes.
We demonstrate this approach on endoscopic surgical scenes from robotic surgery.
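For reference, NeRF approaches of this kind render each pixel with the standard volume-rendering integral from the original NeRF formulation, evaluated along a camera ray r(t) = o + t d:

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad
T(t) = \exp\!\Big(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\Big)
```

Here σ is the learned density, c the view-dependent color, and T(t) the accumulated transmittance; dynamic variants additionally condition σ and c on time.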
arXiv Detail & Related papers (2023-09-27T00:20:36Z)
- Neural LerPlane Representations for Fast 4D Reconstruction of Deformable Tissues [52.886545681833596]
LerPlane is a novel method for fast and accurate reconstruction of surgical scenes under a single-viewpoint setting.
LerPlane treats surgical procedures as 4D volumes and factorizes them into explicit 2D planes of static and dynamic fields.
LerPlane shares static fields, significantly reducing the workload of dynamic tissue modeling.
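As a loose illustration of that plane factorization, here is a sketch assuming six axis-aligned feature planes fused by elementwise product, in the spirit of plane-factorized 4D representations; the resolutions, channel count, and fusion rule are assumptions about this family of methods, not LerPlane's exact design.

```python
# Sketch of plane factorization for a 4D (x, y, z, t) scene. Static planes
# cover space; dynamic planes couple each spatial axis with time. All
# hyperparameters below are illustrative.
import torch
import torch.nn.functional as F

C, R = 16, 64  # feature channels, plane resolution (illustrative values)
static_planes = {k: torch.randn(1, C, R, R) for k in ("xy", "xz", "yz")}
dynamic_planes = {k: torch.randn(1, C, R, R) for k in ("xt", "yt", "zt")}

def sample_plane(plane, u, v):
    """Bilinearly sample a (1, C, R, R) feature plane at normalized coords in [-1, 1]."""
    grid = torch.stack([u, v], dim=-1).view(1, -1, 1, 2)    # (1, N, 1, 2)
    feats = F.grid_sample(plane, grid, align_corners=True)  # (1, C, N, 1)
    return feats.squeeze(-1).squeeze(0).t()                 # (N, C)

def point_features(x, y, z, t):
    """Fuse static spatial planes with dynamic space-time planes for 4D points."""
    f = sample_plane(static_planes["xy"], x, y)
    f = f * sample_plane(static_planes["xz"], x, z)
    f = f * sample_plane(static_planes["yz"], y, z)
    f = f * sample_plane(dynamic_planes["xt"], x, t)
    f = f * sample_plane(dynamic_planes["yt"], y, t)
    f = f * sample_plane(dynamic_planes["zt"], z, t)
    return f  # (N, C) features, typically decoded by a small MLP

n = 8
x, y, z, t = (torch.rand(n) * 2 - 1 for _ in range(4))
print(point_features(x, y, z, t).shape)  # torch.Size([8, 16])
```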
arXiv Detail & Related papers (2023-05-31T14:38:35Z)
- Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [66.74633676595889]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
- Live image-based neurosurgical guidance and roadmap generation using unsupervised embedding [53.992124594124896]
We present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos.
A generated roadmap encodes the common anatomical paths taken in surgeries in the training set.
We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.
arXiv Detail & Related papers (2023-03-31T12:52:24Z)
- Self-Supervised Surgical Instrument 3D Reconstruction from a Single Camera Image [0.0]
An accurate 3D surgical instrument model is a prerequisite for precise predictions of the pose and depth of the instrument.
Recent single-view 3D reconstruction methods have so far been applied only to natural-object reconstruction.
We propose an end-to-end surgical instrument reconstruction system -- Self-supervised Surgical Instrument Reconstruction.
arXiv Detail & Related papers (2022-11-26T03:21:31Z)
- Neural Rendering for Stereo 3D Reconstruction of Deformable Tissues in Robotic Surgery [18.150476919815382]
Reconstruction of the soft tissues in robotic surgery from endoscopic stereo videos is important for many applications.
Previous works on this task mainly rely on SLAM-based approaches, which struggle to handle complex surgical scenes.
Inspired by recent progress in neural rendering, we present a novel framework for deformable tissue reconstruction.
arXiv Detail & Related papers (2022-06-30T13:06:27Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with Transformer-based Stereoscopic Depth Perception [15.927060244702686]
We present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps.
Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation.
We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video dataset and our in-house DaVinci robotic surgery dataset.
arXiv Detail & Related papers (2021-07-01T05:57:41Z)
- Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Graph Scene (MSSG) which aims at providing unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.