Learnable Patchmatch and Self-Teaching for Multi-Frame Depth Estimation in Monocular Endoscopy
- URL: http://arxiv.org/abs/2205.15034v2
- Date: Sat, 15 Feb 2025 11:03:51 GMT
- Title: Learnable Patchmatch and Self-Teaching for Multi-Frame Depth Estimation in Monocular Endoscopy
- Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Xingming Wu, Zhong Liu,
- Abstract summary: We propose a novel unsupervised multi-frame monocular depth estimation model.
The proposed model integrates a learnable patchmatch module to adaptively increase the discriminative ability in regions with low and homogeneous textures.
As a byproduct of the self-teaching paradigm, the proposed model is able to improve the depth predictions when more frames are input at test time.
- Score: 16.233423010425355
- Abstract: This work delves into unsupervised monocular depth estimation in endoscopy, which leverages adjacent frames to establish a supervisory signal during the training phase. For many clinical applications, e.g., surgical navigation, temporally correlated frames are also available at test time. Due to the lack of depth clues, making full use of the temporal correlation among multiple video frames at both phases is crucial for accurate depth estimation. However, several challenges in endoscopic scenes, such as low and homogeneous textures and inter-frame brightness fluctuations, limit the performance gain from the temporal correlation. To fully exploit it, we propose a novel unsupervised multi-frame monocular depth estimation model. The proposed model integrates a learnable patchmatch module to adaptively increase the discriminative ability in regions with low and homogeneous textures, and enforces cross-teaching and self-teaching consistencies to provide effective regularization against brightness fluctuations. Furthermore, as a byproduct of the self-teaching paradigm, the proposed model is able to improve its depth predictions when more frames are input at test time. We conduct detailed experiments on multiple datasets, including SCARED, EndoSLAM, Hamlyn and SERV-CT. The experimental results indicate that our model outperforms state-of-the-art competitors. The source code and trained models will be made publicly available upon acceptance.
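For context, the "supervisory signal" established from adjacent frames in this line of work is the photometric reprojection loss standard in unsupervised depth estimation: the predicted target-view depth and relative camera pose are used to warp a neighboring frame into the target view, and the reconstruction error trains the networks. The sketch below is a minimal, generic PyTorch illustration of that mechanism, not the authors' released code; the function names and tensor conventions are illustrative assumptions, and it deliberately omits the paper's learnable patchmatch module and cross-/self-teaching consistency terms.

```python
import torch
import torch.nn.functional as F

def warp_to_target(src_img, tgt_depth, pose, K, K_inv):
    """Inverse-warp a neighboring source frame into the target view using
    the predicted target-view depth and the relative camera pose.

    src_img:   (B, 3, H, W) source frame
    tgt_depth: (B, 1, H, W) predicted depth of the target frame
    pose:      (B, 4, 4)    relative pose from target to source camera
    K, K_inv:  (B, 3, 3)    camera intrinsics and their inverse
    """
    B, _, H, W = src_img.shape
    # Pixel grid of the target view in homogeneous coordinates.
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)])            # (3, H, W)
    pix = pix.reshape(1, 3, -1).expand(B, -1, -1).to(src_img)   # (B, 3, H*W)

    cam = (K_inv @ pix) * tgt_depth.reshape(B, 1, -1)           # back-project
    cam = torch.cat([cam, torch.ones_like(cam[:, :1])], dim=1)  # homogeneous
    proj = K @ (pose @ cam)[:, :3]                              # re-project
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)              # (B, 2, H*W)

    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    return F.grid_sample(src_img, grid, padding_mode="border",
                         align_corners=True)

def photometric_loss(tgt_img, src_img, tgt_depth, pose, K):
    """Mean L1 photometric error between the target frame and the warped
    source frame; published methods typically blend in SSIM, a smoothness
    prior, and (in this paper) brightness-robust consistency terms."""
    warped = warp_to_target(src_img, tgt_depth, pose, K, torch.inverse(K))
    return (warped - tgt_img).abs().mean()
```

Inter-frame brightness fluctuations violate the brightness-constancy assumption behind this loss, which is precisely what the paper's cross-teaching and self-teaching consistencies are meant to regularize against.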
Related papers
- Federated Learning for Coronary Artery Plaque Detection in Atherosclerosis Using IVUS Imaging: A Multi-Hospital Collaboration [8.358846277772779]
Traditional interpretation of Intravascular Ultrasound (IVUS) images during Percutaneous Coronary Intervention (PCI) is time-intensive and inconsistent.
A parallel 2D U-Net model with a multi-stage segmentation architecture has been developed to enable secure data analysis across institutions.
With a Dice Similarity Coefficient (DSC) of 0.706, the model effectively identifies plaques and detects circular boundaries in real time.
arXiv Detail & Related papers (2024-12-19T13:06:28Z)
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation [55.676358801492114]
We propose OCAI, a method that supports robust frame interpolation by generating intermediate video frames alongside the optical flows between them.
Our evaluations demonstrate superior quality and enhanced optical flow accuracy on established benchmarks such as Sintel and KITTI.
arXiv Detail & Related papers (2024-03-26T20:23:48Z)
- Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for ground-truth training data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
- EndoDepthL: Lightweight Endoscopic Monocular Depth Estimation with CNN-Transformer [0.0]
We propose a novel lightweight solution named EndoDepthL that integrates CNN and Transformers to predict multi-scale depth maps.
Our approach includes optimizing the network architecture, incorporating multi-scale dilated convolution, and a multi-channel attention mechanism.
To better evaluate the performance of monocular depth estimation in endoscopic imaging, we propose a novel complexity evaluation metric.
arXiv Detail & Related papers (2023-08-04T21:38:29Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the scene-irrelevant generality of the encoder towards diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Self-Supervised Monocular Depth and Ego-Motion Estimation in Endoscopy: Appearance Flow to the Rescue [38.168759071532676]
Self-supervised learning has been applied to estimate depth and ego-motion from monocular videos.
In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem.
We build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes.
arXiv Detail & Related papers (2021-12-15T13:51:10Z)
- Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning [7.535751594024775]
Retinopathy represents a group of retinal diseases that, if not treated in time, can cause severe visual impairment or even blindness.
This paper presents a novel incremental cross-domain adaptation instrument that allows any deep classification model to progressively learn abnormal retinal pathologies.
The proposed framework, evaluated on six public datasets, outperforms the state-of-the-art competitors by achieving an overall accuracy and F1 score of 0.9826 and 0.9846, respectively.
arXiv Detail & Related papers (2021-10-18T13:45:21Z)
- Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training.
Thanks to its scale-consistent predictions, our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
arXiv Detail & Related papers (2021-05-25T02:17:56Z)
- Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models [0.0]
We propose an innovative multi-disease detection pipeline for retinal imaging.
Our pipeline includes state-of-the-art strategies like transfer learning, class weighting, real-time image augmentation and Focal loss utilization.
arXiv Detail & Related papers (2021-03-26T18:02:17Z)