SMUDLP: Self-Teaching Multi-Frame Unsupervised Endoscopic Depth
Estimation with Learnable Patchmatch
- URL: http://arxiv.org/abs/2205.15034v1
- Date: Mon, 30 May 2022 12:11:03 GMT
- Title: SMUDLP: Self-Teaching Multi-Frame Unsupervised Endoscopic Depth
Estimation with Learnable Patchmatch
- Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Xingming Wu, Zhong Liu,
Zhengguo Li
- Abstract summary: Unsupervised monocular trained depth estimation models make use of adjacent frames as a supervisory signal during the training phase.
However, temporally correlated frames are also available at inference time for many clinical applications, e.g., surgical navigation.
We present SMUDLP, a novel and unsupervised paradigm for multi-frame monocular endoscopic depth estimation.
- Score: 25.35009126980672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised monocular trained depth estimation models make use of adjacent
frames as a supervisory signal during the training phase. However, temporally
correlated frames are also available at inference time for many clinical
applications, e.g., surgical navigation. The vast majority of monocular systems
do not exploit this valuable signal, which could be deployed to enhance the
depth estimates. Those that do achieve only limited gains, owing to the unique
challenges of endoscopic scenes, such as low and homogeneous textures and
inter-frame brightness fluctuations. In this work, we present SMUDLP, a novel
unsupervised paradigm for multi-frame monocular endoscopic depth estimation.
SMUDLP integrates a learnable patchmatch module to adaptively increase the
discriminative ability in low-texture and homogeneous-texture regions, and
enforces cross-teaching and self-teaching consistencies to provide effective
regularization against brightness fluctuations. Detailed experiments on both
the SCARED and Hamlyn datasets indicate that SMUDLP exceeds state-of-the-art
competitors by a large margin, including those that use single or multiple
frames at inference time. The source code and trained models will be made
publicly available upon acceptance.
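The abstract does not spell out how the learnable patchmatch module or the two consistency terms are formulated, so the following is only a minimal sketch of what cross-teaching and self-teaching consistencies could look like. It assumes a hypothetical single-frame branch and multi-frame branch that each predict a positive depth map, and uses detached L1 targets; these are illustrative assumptions, not the paper's actual losses.

```python
# Illustrative sketch only: the loss formulation is not given in the abstract.
# depth_single, depth_multi, depth_prev are hypothetical positive depth maps
# of shape [B, 1, H, W].
import torch
import torch.nn.functional as F


def cross_teaching_loss(depth_single: torch.Tensor,
                        depth_multi: torch.Tensor) -> torch.Tensor:
    """Consistency between single-frame and multi-frame predictions.

    A single-frame branch is insensitive to inter-frame brightness
    fluctuations, so treating its (detached) prediction as a target can
    regularize the multi-frame, patchmatch-based branch.
    """
    # Log-depth L1 is a common scale-robust penalty (an assumption here).
    return F.l1_loss(torch.log(depth_multi), torch.log(depth_single).detach())


def self_teaching_loss(depth_multi: torch.Tensor,
                       depth_prev: torch.Tensor) -> torch.Tensor:
    """Consistency of the multi-frame branch with its own earlier,
    detached prediction, e.g. from a differently perturbed input."""
    return F.l1_loss(depth_multi, depth_prev.detach())
```

In practice such terms would be weighted and added to the usual photometric reprojection loss of unsupervised depth training.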
Related papers
- Federated Learning for Coronary Artery Plaque Detection in Atherosclerosis Using IVUS Imaging: A Multi-Hospital Collaboration [8.358846277772779]
Traditional interpretation of Intravascular Ultrasound (IVUS) images during Percutaneous Coronary Intervention (PCI) is time-intensive and inconsistent.
A parallel 2D U-Net model with a multi-stage segmentation architecture has been developed to enable secure data analysis across institutions.
With a Dice Similarity Coefficient (DSC) of 0.706, the model effectively identifies plaques and detects circular boundaries in real time.
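For reference, the Dice Similarity Coefficient quoted above is a standard overlap metric between a predicted and a ground-truth segmentation mask; a minimal reference implementation (not the paper's evaluation code) is:

```python
# Dice Similarity Coefficient for binary masks: DSC = 2|A ∩ B| / (|A| + |B|).
import numpy as np


def dice_coefficient(pred: np.ndarray, target: np.ndarray,
                     eps: float = 1e-8) -> float:
    """pred and target are 0/1 (or boolean) arrays of the same shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))
```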
arXiv Detail & Related papers (2024-12-19T13:06:28Z)
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces ADer, a comprehensive visual anomaly detection benchmark built as a modular framework that is readily extensible to new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation [55.676358801492114]
We propose OCAI, a method that supports robust frame interpolation by generating intermediate video frames alongside the optical flows between them.
Our evaluations demonstrate superior quality and enhanced optical flow accuracy on established benchmarks such as Sintel and KITTI.
arXiv Detail & Related papers (2024-03-26T20:23:48Z)
- Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for large amounts of training data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
- EndoDepthL: Lightweight Endoscopic Monocular Depth Estimation with CNN-Transformer [0.0]
We propose a novel lightweight solution named EndoDepthL that integrates CNN and Transformers to predict multi-scale depth maps.
Our approach optimizes the network architecture and incorporates multi-scale dilated convolutions and a multi-channel attention mechanism (a generic sketch of such a block follows this summary).
To better evaluate the performance of monocular depth estimation in endoscopic imaging, we propose a novel complexity evaluation metric.
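EndoDepthL's exact architecture is not described here; the sketch below shows only a generic multi-scale dilated convolution block of the kind the summary mentions. The dilation rates and fusion scheme are illustrative assumptions.

```python
# Generic multi-scale dilated convolution block (illustrative, not
# EndoDepthL's actual design). Parallel 3x3 branches with different
# dilation rates capture context at several scales; a 1x1 convolution
# fuses the concatenated features.
import torch
import torch.nn as nn


class MultiScaleDilatedConv(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # padding == dilation keeps the spatial size for 3x3 kernels.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```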
arXiv Detail & Related papers (2023-08-04T21:38:29Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images captured in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce a bilevel paradigm to model this latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Self-Supervised Monocular Depth and Ego-Motion Estimation in Endoscopy: Appearance Flow to the Rescue [38.168759071532676]
Self-supervised learning techniques have been applied to estimate depth and ego-motion from monocular videos.
In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem.
We build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes.
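The summary does not define appearance flow precisely; one minimal way to picture it, sketched below under the assumption that it acts as a per-pixel additive brightness field applied before the photometric loss, is:

```python
# Minimal sketch: compensate inter-frame brightness change before comparing
# a warped source frame against the target frame. Modelling appearance flow
# as an additive per-pixel field is an assumption, not the paper's scheme.
import torch
import torch.nn.functional as F


def calibrated_photometric_loss(target: torch.Tensor,
                                warped_source: torch.Tensor,
                                appearance_flow: torch.Tensor) -> torch.Tensor:
    calibrated = warped_source + appearance_flow  # brightness calibration
    return F.l1_loss(target, calibrated)
```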
arXiv Detail & Related papers (2021-12-15T13:51:10Z)
- Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning [7.535751594024775]
Retinopathy represents a group of retinal diseases that, if not treated promptly, can cause severe visual impairment or even blindness.
This paper presents a novel incremental cross-domain adaptation instrument that allows any deep classification model to progressively learn abnormal retinal pathologies.
The proposed framework, evaluated on six public datasets, outperforms the state-of-the-art competitors by achieving an overall accuracy and F1 score of 0.9826 and 0.9846, respectively.
arXiv Detail & Related papers (2021-10-18T13:45:21Z)
- Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose SC-Depth, a monocular depth estimator that requires only unlabelled videos for training.
Thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
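The scale-consistent prediction mentioned above is typically enforced with a geometry consistency term: the depth map of one frame, projected into an adjacent frame using the estimated pose and camera intrinsics, should agree with the depth predicted there. A minimal sketch (the projection step is omitted for brevity):

```python
# Geometry (scale) consistency in the spirit of SC-Depth: normalized
# difference between the projected depth of frame a and the predicted
# depth of frame b. Computing d_a_in_b requires pose and intrinsics,
# which are omitted here.
import torch


def geometry_consistency_loss(d_a_in_b: torch.Tensor,
                              d_b: torch.Tensor) -> torch.Tensor:
    return ((d_a_in_b - d_b).abs() / (d_a_in_b + d_b)).mean()
```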
arXiv Detail & Related papers (2021-05-25T02:17:56Z)
- Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models [0.0]
We propose an innovative multi-disease detection pipeline for retinal imaging.
Our pipeline includes state-of-the-art strategies such as transfer learning, class weighting, real-time image augmentation, and focal loss.
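Of the listed strategies, focal loss has a compact closed form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); a standard reference implementation (not the paper's exact code) is:

```python
# Binary focal loss: down-weights easy examples so training focuses on
# hard ones. Standard reference implementation, not the paper's code.
import torch


def binary_focal_loss(logits: torch.Tensor, targets: torch.Tensor,
                      alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """logits: raw scores; targets: 0/1 labels of the same shape."""
    p = torch.sigmoid(logits)
    p_t = torch.where(targets > 0.5, p, 1 - p)
    alpha_t = torch.where(targets > 0.5,
                          torch.full_like(p, alpha),
                          torch.full_like(p, 1 - alpha))
    return (-alpha_t * (1 - p_t).pow(gamma)
            * torch.log(p_t.clamp(min=1e-8))).mean()
```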
arXiv Detail & Related papers (2021-03-26T18:02:17Z)