Efficient Domain Adaptation for Endoscopic Visual Odometry
- URL: http://arxiv.org/abs/2403.10860v1
- Date: Sat, 16 Mar 2024 08:57:00 GMT
- Title: Efficient Domain Adaptation for Endoscopic Visual Odometry
- Authors: Junyang Wu, Yun Gu, Guang-Zhong Yang,
- Abstract summary: Domain adaptation offers a promising approach to bridge the pre-operative planning domain with the intra-operative real domain for learning odometry information.
In this work, an efficient neural style transfer framework for endoscopic visual odometry is proposed, which compresses the time from pre-operative planning to testing phase to less than five minutes.
- Score: 28.802915155343964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual odometry plays a crucial role in endoscopic imaging, yet the scarcity of realistic images with ground truth poses poses a significant challenge. Therefore, domain adaptation offers a promising approach to bridge the pre-operative planning domain with the intra-operative real domain for learning odometry information. However, existing methodologies suffer from inefficiencies in the training time. In this work, an efficient neural style transfer framework for endoscopic visual odometry is proposed, which compresses the time from pre-operative planning to testing phase to less than five minutes. For efficient traing, this work focuses on training modules with only a limited number of real images and we exploit pre-operative prior information to dramatically reduce training duration. Moreover, during the testing phase, we propose a novel Test Time Adaptation (TTA) method to mitigate the gap in lighting conditions between training and testing datasets. Experimental evaluations conducted on two public endoscope datasets showcase that our method achieves state-of-the-art accuracy in visual odometry tasks while boasting the fastest training speeds. These results demonstrate significant promise for intra-operative surgery applications.
Related papers
- Data-Centric Learning Framework for Real-Time Detection of Aiming Beam in Fluorescence Lifetime Imaging Guided Surgery [3.8261910636994925]
This study introduces a novel data-centric approach to improve real-time surgical guidance using fiber-based fluorescence lifetime imaging (FLIm)
The primary challenge arises from the complex and variable conditions encountered in the surgical environment, particularly in Transoral Robotic Surgery (TORS)
An instance segmentation model was developed using a data-centric training strategy that improves accuracy by minimizing label noise and enhancing detection robustness.
arXiv Detail & Related papers (2024-11-11T22:04:32Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Transferring Relative Monocular Depth to Surgical Vision with Temporal Consistency [3.585363618435449]
Relative monocular depth, inferring depth up to shift and scale from a single image, is an active research topic.
Recent deep learning models, trained on large and varied meta-datasets, now provide excellent performance in the domain of natural images.
Few datasets exist which provide ground truth depth for endoscopic images, making training such models from scratch unfeasible.
arXiv Detail & Related papers (2024-03-11T12:57:51Z) - Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane [58.871015937204255]
We introduce Fast Orthogonal Plane (plane) for the reconstruction of deformable tissues.
We conceptualize surgical procedures as 4D volumes, and break them down into static and dynamic fields comprised of neural planes.
This factorization iscretizes four-dimensional space, leading to a decreased memory usage and faster optimization.
arXiv Detail & Related papers (2023-12-23T13:27:50Z) - Self-Supervised-RCNN for Medical Image Segmentation with Limited Data
Annotation [0.16490701092527607]
We propose an alternative deep learning training strategy based on self-supervised pretraining on unlabeled MRI scans.
Our pretraining approach first, randomly applies different distortions to random areas of unlabeled images and then predicts the type of distortions and loss of information.
The effectiveness of the proposed method for segmentation tasks in different pre-training and fine-tuning scenarios is evaluated.
arXiv Detail & Related papers (2022-07-17T13:28:52Z) - DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain
Medical Images [56.72015587067494]
We propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA.
Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods.
arXiv Detail & Related papers (2022-05-27T02:34:32Z) - A Temporal Learning Approach to Inpainting Endoscopic Specularities and
Its effect on Image Correspondence [13.25903945009516]
We propose using a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities.
This is achieved using in-vivo data of gastric endoscopy (Hyper-Kvasir) in a fully unsupervised manner.
We also assess the effect of our method in computer vision tasks that underpin 3D reconstruction and camera motion estimation.
arXiv Detail & Related papers (2022-03-31T13:14:00Z) - Real-time landmark detection for precise endoscopic submucosal
dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - Adversarial Domain Feature Adaptation for Bronchoscopic Depth Estimation [111.89519571205778]
In this work, we propose an alternative domain-adaptive approach to depth estimation.
Our novel two-step structure first trains a depth estimation network with labeled synthetic images in a supervised manner.
The results of our experiments show that the proposed method improves the network's performance on real images by a considerable margin.
arXiv Detail & Related papers (2021-09-24T08:11:34Z) - Self-Supervised Learning from Unlabeled Fundus Photographs Improves
Segmentation of the Retina [4.815051667870375]
Fundus photography is the primary method for retinal imaging and essential for diabetic retinopathy prevention.
Current segmentation methods are not robust towards the diversity in imaging conditions and pathologies typical for real-world clinical applications.
We utilize contrastive self-supervised learning to exploit the large variety of unlabeled fundus images in the publicly available EyePACS dataset.
arXiv Detail & Related papers (2021-08-05T18:02:56Z) - Searching for Efficient Architecture for Instrument Segmentation in
Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images.
We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.