Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection
- URL: http://arxiv.org/abs/2406.17858v2
- Date: Thu, 27 Jun 2024 07:39:05 GMT
- Title: Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection
- Authors: Jialun Pei, Ruize Cui, Yaoqian Li, Weixin Si, Jing Qin, Pheng-Ann Heng
- Abstract summary: Liver anatomical landmarks serve as important markers for 2D-3D alignment.
To facilitate the detection of laparoscopic liver landmarks, we collect a novel dataset called L3D.
We propose a depth-driven geometric prompt learning network, namely D2GPLand.
- Score: 43.600236988802465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Laparoscopic liver surgery presents a complex, dynamic intraoperative environment for surgeons, in which it remains a significant challenge to distinguish critical or even hidden structures inside the liver. Liver anatomical landmarks, e.g., the ridge and ligament, serve as important markers for 2D-3D alignment, which can significantly enhance surgeons' spatial perception for precise surgery. To facilitate the detection of laparoscopic liver landmarks, we collect a novel dataset called L3D, which comprises 1,152 frames with elaborate landmark annotations from surgical videos of 39 patients across two medical sites. For benchmarking purposes, 12 mainstream detection methods are selected and comprehensively evaluated on L3D. Further, we propose a depth-driven geometric prompt learning network, namely D2GPLand. Specifically, we design a Depth-aware Prompt Embedding (DPE) module that is guided by self-supervised prompts and generates semantically relevant geometric information with the benefit of global depth cues extracted from SAM-based features. Additionally, a Semantic-specific Geometric Augmentation (SGA) scheme is introduced to efficiently merge RGB-D spatial and geometric information through reverse anatomic perception. The experimental results indicate that D2GPLand obtains state-of-the-art performance on L3D, with 63.52% DICE and 48.68% IoU scores. Together with 2D-3D fusion technology, our method can directly provide surgeons with intuitive guidance information in laparoscopic scenarios.
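To make the depth-driven fusion idea concrete, below is a minimal, hedged PyTorch sketch: features from a depth branch (e.g., a SAM-style encoder, as in the abstract) are projected and gated into the RGB feature stream, and predicted masks are scored with the DICE and IoU metrics reported above. The names DepthPromptFusion, dice_score, and iou_score are illustrative stand-ins, not the authors' released D2GPLand code, and the gating scheme is only a generic approximation of the DPE/SGA modules described in the paper.

```python
# Conceptual sketch (assumptions labeled): gate depth-branch features into RGB
# features for landmark segmentation, and compute Dice/IoU on binary masks.
import torch
import torch.nn as nn


class DepthPromptFusion(nn.Module):
    """Hypothetical stand-in for a depth-aware prompt fusion step:
    project depth features and blend them into RGB features via a learned gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.depth_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        depth_feat = self.depth_proj(depth_feat)
        g = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))
        return rgb_feat + g * depth_feat  # depth cues modulate the RGB stream


def dice_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Dice coefficient for binary masks (values in {0, 1})."""
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)


def iou_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Intersection-over-Union for binary masks."""
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return (inter + eps) / (union + eps)


if __name__ == "__main__":
    fuse = DepthPromptFusion(channels=64)
    rgb = torch.randn(1, 64, 32, 32)    # RGB backbone features
    depth = torch.randn(1, 64, 32, 32)  # depth-branch features (assumed same shape)
    fused = fuse(rgb, depth)
    pred = (torch.rand(1, 1, 256, 256) > 0.5).float()
    gt = (torch.rand(1, 1, 256, 256) > 0.5).float()
    print(fused.shape, dice_score(pred, gt).item(), iou_score(pred, gt).item())
```

The sketch only mirrors the general pattern of fusing depth-derived geometric cues with RGB features and evaluating with DICE/IoU; the paper's actual DPE and SGA modules should be consulted for the exact mechanism.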
Related papers
- SLAM assisted 3D tracking system for laparoscopic surgery [22.36252790404779]
This work proposes a real-time monocular 3D tracking algorithm for post-registration tasks.
Experiments from in-vivo and ex-vivo tests demonstrate that the proposed 3D tracking system provides robust 3D tracking.
arXiv Detail & Related papers (2024-09-18T04:00:54Z) - An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion [33.12510773034339]
Augmented reality for laparoscopic liver resection is a visualisation mode that allows a surgeon to localise tumours and vessels embedded within the liver by projecting them on top of a laparoscopic image.
Most of the algorithms make use of anatomical landmarks to guide registration.
These landmarks include the liver's inferior ridge, the falciform ligament, and the occluding contours.
We present the Preoperative-to-Intraoperative Laparoscopic Fusion Challenge (P2ILF), which investigates the possibilities of detecting these landmarks automatically and using them in registration.
arXiv Detail & Related papers (2024-01-28T20:30:14Z) - On the Localization of Ultrasound Image Slices within Point Distribution Models [84.27083443424408]
Thyroid disorders are most commonly diagnosed using high-resolution Ultrasound (US).
Longitudinal tracking is a pivotal diagnostic protocol for monitoring changes in pathological thyroid morphology.
We present a framework for automated US image slice localization within a 3D shape representation.
arXiv Detail & Related papers (2023-09-01T10:10:46Z) - Agent with Tangent-based Formulation and Anatomical Perception for Standard Plane Localization in 3D Ultrasound [56.7645826576439]
We introduce a novel reinforcement learning framework for automatic SP localization in 3D US.
First, we formulate SP localization in 3D US as a tangent-point-based problem in RL to restructure the action space.
Second, we design an auxiliary task learning strategy to enhance the model's ability to recognize subtle differences between Non-SPs and SPs during plane search.
arXiv Detail & Related papers (2022-07-01T14:53:27Z) - A unified 3D framework for Organs at Risk Localization and Segmentation for Radiation Therapy Planning [56.52933974838905]
The current medical workflow requires manual delineation of organs-at-risk (OAR).
In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation.
Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z) - Simulating Realistic MRI variations to Improve Deep Learning model and visual explanations using GradCAM [0.0]
We use a modified HighRes3DNet model for solving the brain MRI volumetric landmark detection problem.
Grad-CAM produces a coarse localization map highlighting the regions the model focuses on.
arXiv Detail & Related papers (2021-11-01T11:14:23Z) - Stereo Dense Scene Reconstruction and Accurate Laparoscope Localization for Learning-Based Navigation in Robot-Assisted Surgery [37.14020061063255]
The computation of anatomical information and laparoscope position is a fundamental building block of robot-assisted surgical navigation in Minimally Invasive Surgery (MIS).
We propose a learning-driven framework that achieves image-guided laparoscope localization together with 3D reconstruction of complex anatomical structures.
arXiv Detail & Related papers (2021-10-08T06:12:18Z) - 3-Dimensional Deep Learning with Spatial Erasing for Unsupervised Anomaly Segmentation in Brain MRI [55.97060983868787]
We investigate whether using increased spatial context by using MRI volumes combined with spatial erasing leads to improved unsupervised anomaly segmentation performance.
We compare 2D variational autoencoders (VAEs) to their 3D counterparts, propose 3D input erasing, and systematically study the impact of the dataset size on performance.
Our best performing 3D VAE with input erasing leads to an average DICE score of 31.40% compared to 25.76% for the 2D VAE.
arXiv Detail & Related papers (2021-09-14T09:17:27Z) - PAENet: A Progressive Attention-Enhanced Network for 3D to 2D Retinal Vessel Segmentation [0.0]
3D to 2D retinal vessel segmentation is a challenging problem in Optical Coherence Tomography Angiography (OCTA) images.
We propose a Progressive Attention-Enhanced Network (PAENet) based on attention mechanisms to extract rich feature representation.
Our proposed algorithm achieves state-of-the-art performance compared with previous methods.
arXiv Detail & Related papers (2021-08-26T10:27:25Z) - Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection-Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)