Semantic-SuPer: A Semantic-aware Surgical Perception Framework for
Endoscopic Tissue Classification, Reconstruction, and Tracking
- URL: http://arxiv.org/abs/2210.16674v1
- Date: Sat, 29 Oct 2022 19:33:21 GMT
- Title: Semantic-SuPer: A Semantic-aware Surgical Perception Framework for
Endoscopic Tissue Classification, Reconstruction, and Tracking
- Authors: Shan Lin, Albert J. Miao, Jingpei Lu, Shunkai Yu, Zih-Yun Chiu,
Florian Richter, Michael C. Yip
- Abstract summary: We present a novel surgical perception framework, Semantic-SuPer.
It integrates geometric and semantic information to facilitate data association, 3D reconstruction, and tracking of endoscopic scenes.
- Score: 21.133420628173067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and robust tracking and reconstruction of the surgical scene is a
critical enabling technology toward autonomous robotic surgery. Existing
algorithms for 3D perception in surgery mainly rely on geometric information,
while we propose to also leverage semantic information inferred from the
endoscopic video using image segmentation algorithms. In this paper, we present
a novel, comprehensive surgical perception framework, Semantic-SuPer, that
integrates geometric and semantic information to facilitate data association,
3D reconstruction, and tracking of endoscopic scenes, benefiting downstream
tasks like surgical navigation. The proposed framework is demonstrated on
challenging endoscopic data with deforming tissue, showing its advantages over
our baseline and several other state-of-the-art approaches. Our code and
dataset will be available at https://github.com/ucsdarclab/Python-SuPer.
Related papers
- Online 3D reconstruction and dense tracking in endoscopic videos [5.667206318889122]
3D scene reconstruction from stereo endoscopic video data is crucial for advancing surgical interventions.
We present an online framework for dense 3D scene reconstruction and tracking, aimed at enhancing surgical scene understanding and assisting interventions.
arXiv Detail & Related papers (2024-09-09T19:58:42Z) - Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z) - EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting [53.38166294158047]
EndoGSLAM is an efficient approach for endoscopic surgeries, which integrates streamlined representation and differentiable Gaussianization.
Experiments show that EndoGSLAM achieves a better trade-off between intraoperative availability and reconstruction quality than traditional or neural SLAM approaches.
arXiv Detail & Related papers (2024-03-22T11:27:43Z) - Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation [1.3444601218847545]
The nnU-Net framework excelled in semantic segmentation analyzing single frames without temporal information.
Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame, containing temporal information.
This work seeks to employ OF maps as an additional input to the nnU-Net architecture to improve its performance in the surgical instrument segmentation task.
arXiv Detail & Related papers (2024-03-15T11:36:26Z) - Dynamic Scene Graph Representation for Surgical Video [37.22552586793163]
We exploit scene graphs as a more holistic, semantically meaningful and human-readable way to represent surgical videos.
We create a scene graph dataset from semantic segmentations from the CaDIS and CATARACTS datasets.
We demonstrate the benefits of surgical scene graphs regarding the explainability and robustness of model decisions.
arXiv Detail & Related papers (2023-09-25T21:28:14Z) - Live image-based neurosurgical guidance and roadmap generation using
unsupervised embedding [53.992124594124896]
We present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos.
A generated roadmap encodes the common anatomical paths taken in surgeries in the training set.
We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.
arXiv Detail & Related papers (2023-03-31T12:52:24Z) - A unified 3D framework for Organs at Risk Localization and Segmentation
for Radiation Therapy Planning [56.52933974838905]
Current medical workflow requires manual delineation of organs-at-risk (OAR).
In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation.
Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z) - Stereo Dense Scene Reconstruction and Accurate Laparoscope Localization
for Learning-Based Navigation in Robot-Assisted Surgery [37.14020061063255]
The computation of anatomical information and laparoscope position is a fundamental block of robot-assisted surgical navigation in Minimally Invasive Surgery (MIS).
We propose a learning-driven framework that achieves image-guided laparoscopic localization with 3D reconstructions of complex anatomical structures.
arXiv Detail & Related papers (2021-10-08T06:12:18Z) - E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with
Transformer-based Stereoscopic Depth Perception [15.927060244702686]
We present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps.
Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation.
We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video dataset and our in-house DaVinci robotic surgery dataset.
arXiv Detail & Related papers (2021-07-01T05:57:41Z) - Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical
Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Scene Graph (MSSG), which aims at providing a unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z) - Towards Unsupervised Learning for Instrument Segmentation in Robotic
Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows training image segmentation models without the need to acquire expensive annotations.
We test our proposed method on the EndoVis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)