Self-Supervised Monocular Depth Estimation with Self-Reference
Distillation and Disparity Offset Refinement
- URL: http://arxiv.org/abs/2302.09789v2
- Date: Thu, 15 Jun 2023 08:35:44 GMT
- Title: Self-Supervised Monocular Depth Estimation with Self-Reference
Distillation and Disparity Offset Refinement
- Authors: Zhong Liu, Ran Li, Shuwei Shao, Xingming Wu and Weihai Chen
- Abstract summary: We propose two novel ideas to improve self-supervised monocular depth estimation.
We use a parameter-optimized model as the teacher, updated over the training epochs, to provide additional supervision.
We leverage the contextual consistency between high-scale and low-scale features to obtain multiscale disparity offsets.
- Score: 15.012694052674899
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Monocular depth estimation plays a fundamental role in computer vision. Due
to the costly acquisition of depth ground truth, self-supervised methods that
leverage adjacent frames to establish a supervisory signal have emerged as the
most promising paradigms. In this work, we propose two novel ideas to improve
self-supervised monocular depth estimation: 1) self-reference distillation and
2) disparity offset refinement. Specifically, we use a parameter-optimized
model as the teacher, updated over the training epochs, to provide additional
supervision during the training process. The teacher model has the same
structure as the student model, with weights inherited from the historical
student model. In addition, a multiview check is introduced to filter out the
outliers produced by the teacher model. Furthermore, we leverage the contextual
consistency between high-scale and low-scale features to obtain multiscale
disparity offsets, which are used to refine the disparity output incrementally
by aligning disparity information at different scales. The experimental results
on the KITTI and Make3D datasets show that our method outperforms previous
state-of-the-art competitors.
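To make the two ideas concrete, below is a minimal PyTorch-style sketch of how a self-reference teacher and a disparity offset refinement step could be wired together. This is an illustration under stated assumptions, not the authors' released implementation: the helper names (snapshot_teacher, multiview_check, refine_disparity) and the loss weighting are hypothetical placeholders around a generic disparity network.

```python
import copy
import torch
import torch.nn.functional as F

def snapshot_teacher(student: torch.nn.Module) -> torch.nn.Module:
    """Self-reference distillation: the teacher shares the student's
    architecture and inherits the weights of a historical student,
    refreshed as training progresses instead of being trained separately."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)  # the teacher only provides supervision
    return teacher.eval()

def multiview_check(student_reproj_err: torch.Tensor,
                    teacher_reproj_err: torch.Tensor) -> torch.Tensor:
    """Multiview check: keep teacher pseudo-labels only where the teacher's
    disparity reprojects adjacent frames at least as well as the student's,
    filtering out teacher outliers."""
    return (teacher_reproj_err < student_reproj_err).float()

def refine_disparity(low_scale_disp: torch.Tensor,
                     high_scale_feat: torch.Tensor,
                     offset_head: torch.nn.Module) -> torch.Tensor:
    """Disparity offset refinement: predict a residual offset from the
    higher-resolution features and add it to the upsampled low-scale
    disparity, aligning disparity information across scales."""
    up = F.interpolate(low_scale_disp, scale_factor=2,
                       mode="bilinear", align_corners=False)
    return up + offset_head(high_scale_feat)  # per-pixel offset map

# Schematic use inside a training loop:
#   teacher = snapshot_teacher(student)                    # refreshed per epoch
#   mask = multiview_check(err_student, err_teacher)       # per batch
#   distill_loss = (mask * (disp_student - disp_teacher).abs()).mean()
```

In practice, such a distillation term would be added to the photometric reprojection loss that self-supervised methods build from adjacent frames; the snapshot schedule, the offset head, and the loss weights are deliberately left abstract here.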
Related papers
- Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation [33.140210057065644]
This paper introduces a novel approach named Stealing Stable Diffusion (SSD) prior for robust monocular depth estimation.
The approach utilizes stable diffusion to generate synthetic images that mimic challenging conditions.
The effectiveness of the approach is evaluated on nuScenes and Oxford RobotCar, two challenging public datasets.
arXiv Detail & Related papers (2024-03-08T05:06:31Z)
- DMT: Comprehensive Distillation with Multiple Self-supervised Teachers [27.037140667247208]
We introduce Comprehensive Distillation with Multiple Self-supervised Teachers (DMT) for pretrained model compression.
Our experimental results on prominent benchmark datasets exhibit that the proposed method significantly surpasses state-of-the-art competitors.
arXiv Detail & Related papers (2023-12-19T08:31:30Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- Sparse Depth-Guided Attention for Accurate Depth Completion: A Stereo-Assisted Monitored Distillation Approach [7.902840502973506]
We introduce a stereo-based model as a teacher model to improve the accuracy of the student model for depth completion.
To provide self-supervised information, we also employ multi-view depth consistency and multi-scale minimum reprojection.
arXiv Detail & Related papers (2023-03-28T09:23:19Z)
- Ensemble knowledge distillation of self-supervised speech models [84.69577440755457]
Distilled self-supervised models have shown competitive performance and efficiency in recent years.
We performed Ensemble Knowledge Distillation (EKD) on various self-supervised speech models such as HuBERT, RobustHuBERT, and WavLM.
Our method improves the performance of the distilled models on four downstream speech processing tasks.
arXiv Detail & Related papers (2023-02-24T17:15:39Z)
- EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z)
- Self-Supervised Models are Continual Learners [79.70541692930108]
We show that self-supervised loss functions can be seamlessly converted into distillation mechanisms for Continual Learning.
We devise a framework for Continual self-supervised visual representation Learning that significantly improves the quality of the learned representations.
arXiv Detail & Related papers (2021-12-08T10:39:13Z)
- Learning by Distillation: A Self-Supervised Learning Framework for Optical Flow Estimation [71.76008290101214]
DistillFlow is a knowledge distillation approach to learning optical flow.
It achieves state-of-the-art unsupervised learning performance on both KITTI and Sintel datasets.
Our models ranked 1st among all monocular methods on the KITTI 2015 benchmark, and outperform all published methods on the Sintel Final benchmark.
arXiv Detail & Related papers (2021-06-08T09:13:34Z)
- Distill on the Go: Online knowledge distillation in self-supervised learning [1.1470070927586016]
Recent works have shown that wider and deeper models benefit more from self-supervised learning than smaller models.
We propose Distill-on-the-Go (DoGo), a self-supervised learning paradigm using single-stage online knowledge distillation.
Our results show significant performance gain in the presence of noisy and limited labels.
arXiv Detail & Related papers (2021-04-20T09:59:23Z)
- Knowledge Distillation Meets Self-Supervision [109.6400639148393]
Knowledge distillation involves extracting "dark knowledge" from a teacher network to guide the learning of a student network.
We show that the seemingly different self-supervision task can serve as a simple yet powerful solution.
By exploiting the similarity between those self-supervision signals as an auxiliary task, one can effectively transfer the hidden information from the teacher to the student.
arXiv Detail & Related papers (2020-06-12T12:18:52Z)
- Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume [19.785343302320918]
We propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction.
We show that extending the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas yields a model that produces the best results in the field on KITTI 2015 and Make3D.
arXiv Detail & Related papers (2020-03-31T04:48:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.