Related papers: From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction

From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction

URL: http://arxiv.org/abs/2503.16263v1
Date: Thu, 20 Mar 2025 15:57:18 GMT
Title: From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction
Authors: Ayberk Acar, Mariana Smith, Lidia Al-Zogbi, Tanner Watts, Fangjie Li, Hao Li, Nural Yilmaz, Paul Maria Scheikl, Jesse F. d'Almeida, Susheela Sharma, Lauren Branscombe, Tayfun Efe Ertop, Robert J. Webster III, Ipek Oguz, Alan Kuntz, Axel Krieger, Jie Ying Wu,
Abstract summary: We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy.<n>In several metrics, including post-procedure tissue model evaluation, our pipeline performs comparably to RGB-D cameras.<n>This study is a step toward the complete autonomy of surgical robots.
Score: 8.485401463053098
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Surgical automation requires precise guidance and understanding of the scene. Current methods in the literature rely on bulky depth cameras to create maps of the anatomy, however this does not translate well to space-limited clinical applications. Monocular cameras are small and allow minimally invasive surgeries in tight spaces but additional processing is required to generate 3D scene understanding. We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy. To ensure the most precise reconstruction, we compare different structure from motion algorithms' performance on mapping the central airway obstructions, and test the pipeline on a downstream task of tumor resection. In several metrics, including post-procedure tissue model evaluation, our pipeline performs comparably to RGB-D cameras and, in some cases, even surpasses their performance. These promising results demonstrate that automation guidance can be achieved in minimally invasive procedures with monocular cameras. This study is a step toward the complete autonomy of surgical robots.

Related papers

Endo3R: Unified Online Reconstruction from Dynamic Monocular Endoscopic Video [35.241054116681426]
Endo3R is a unified 3D foundation model for online scale-consistent reconstruction from monocular surgical video. Our model unifies the tasks by predicting globally aligned pointmaps, scale-consistent video depths, and camera parameters without any offline optimization.
arXiv Detail & Related papers (2025-04-04T06:05:22Z)
SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model [3.2554912675000818]
We introduce SAM3D, a new approach to semi-automatic zero-shot segmentation of 3D images building on the existing Segment Anything Model. We achieve fast and accurate segmentations in 3D images with a four-step strategy involving: user prompting with 3D polylines, volume slicing along multiple axes, slice-wide inference with a pretrained model, and recomposition and refinement in 3D.
arXiv Detail & Related papers (2024-05-10T19:26:17Z)
High-fidelity Endoscopic Image Synthesis by Utilizing Depth-guided Neural Surfaces [18.948630080040576]
We introduce a novel method for colon section reconstruction by leveraging NeuS applied to endoscopic images, supplemented by a single frame of depth map. Our approach demonstrates exceptional accuracy in completely rendering colon sections, even capturing unseen portions of the surface. This breakthrough opens avenues for achieving stable and consistently scaled reconstructions, promising enhanced quality in cancer screening procedures and treatment interventions.
arXiv Detail & Related papers (2024-04-20T18:06:26Z)
Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery. We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery. We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery [3.8909273404657556]
We develop a method that permits direct 2D-to-3D registration of the view microscope video to the pre-operative Computed Tomography (CT) scan without the need for external tracking equipment. Our results demonstrate the accuracy with an average rotation error of less than 25 degrees and a translation error of less than 2 mm, 3 mm, and 0.55% for the x, y, and z axes, respectively.
arXiv Detail & Related papers (2024-03-12T00:26:08Z)
Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [66.74633676595889]
We present a multi-camera capture setup consisting of static and head-mounted cameras. Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre. Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
Live image-based neurosurgical guidance and roadmap generation using unsupervised embedding [53.992124594124896]
We present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos. A generated roadmap encodes the common anatomical paths taken in surgeries in the training set. We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.
arXiv Detail & Related papers (2023-03-31T12:52:24Z)
Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection. Our method consists of an instrument pose estimation method, an online registration between the robotic and the i OCT system, and trajectory planning tailored for navigation to an injection target. Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks. In the pre-training stage, the proposed auto-encoder pays more attention to reconstruct the informative patches according to the gradient metrics. Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z)
A unified 3D framework for Organs at Risk Localization and Segmentation for Radiation Therapy Planning [56.52933974838905]
Current medical workflow requires manual delineation of organs-at-risk (OAR) In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation. Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z)
E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with Transformer-based Stereoscopic Depth Perception [15.927060244702686]
We present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps. Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation. We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video dataset and our in-house DaVinci robotic surgery dataset.
arXiv Detail & Related papers (2021-07-01T05:57:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.