Excavating the Potential Capacity of Self-Supervised Monocular Depth
Estimation
- URL: http://arxiv.org/abs/2109.12484v1
- Date: Sun, 26 Sep 2021 03:40:56 GMT
- Title: Excavating the Potential Capacity of Self-Supervised Monocular Depth
Estimation
- Authors: Rui Peng, Ronggang Wang, Yawen Lai, Luyang Tang, Yangang Cai
- Abstract summary: We show that the potential capacity of self-supervised monocular depth estimation can be excavated without the extra cost incurred by additional constraints.
Our contributions can bring significant performance improvement to the baseline with even less computational overhead.
Our model, named EPCDepth, surpasses the previous state-of-the-art methods, even those supervised by additional constraints.
- Score: 10.620856690388376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised methods play an increasingly important role in monocular
depth estimation due to their great potential and low annotation cost. To close
the gap with supervised methods, recent works take advantage of extra
constraints, e.g., semantic segmentation. However, these extra constraints
inevitably increase the burden on the model. In this paper, we show theoretical
and empirical evidence that the potential capacity of self-supervised monocular
depth estimation can be excavated without increasing this cost. In particular,
we propose (1) a novel data augmentation approach called data grafting, which
forces the model to explore more cues to infer depth besides the vertical image
position, (2) an exploratory self-distillation loss, which is supervised by the
self-distillation label generated by our new post-processing method - selective
post-processing, and (3) the full-scale network, designed to endow the encoder
with the specialization of the depth estimation task and enhance the
representational power of the model. Extensive experiments show that our
contributions can bring significant performance improvement to the baseline
with even less computational overhead, and our model, named EPCDepth, surpasses
the previous state-of-the-art methods, even those supervised by additional
constraints.
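The abstract only names its three contributions, so the following is a minimal sketch of how the first two might look in code: a grafting step that exchanges horizontal bands between two training samples (so that absolute vertical position alone no longer predicts depth) and an L1 self-distillation term against a label produced by post-processing. The grafting ratio, function names, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: one plausible reading of "data grafting" and the
# self-distillation term described in the abstract. The grafting ratio,
# function names, and loss weighting are assumptions for illustration only.
import torch

def data_graft(img_a, img_b, ratio=0.5):
    """Swap the bottom bands of two image batches (B, C, H, W) so that
    absolute vertical position is decoupled from scene content."""
    h = img_a.shape[-2]
    cut = int(h * ratio)  # row at which the graft happens
    grafted_a = torch.cat([img_a[..., :cut, :], img_b[..., cut:, :]], dim=-2)
    grafted_b = torch.cat([img_b[..., :cut, :], img_a[..., cut:, :]], dim=-2)
    return grafted_a, grafted_b

def self_distillation_loss(pred_disp, distill_label, weight=1.0):
    """L1 consistency with a self-distillation label, e.g. one produced by a
    post-processing pass such as the selective post-processing named above."""
    return weight * torch.abs(pred_disp - distill_label.detach()).mean()
```

Whatever the exact grafting rule is, the same band exchange would presumably have to be applied to the corresponding source or stereo views so that the photometric reprojection loss stays consistent.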
Related papers
- GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth [2.805351469151152]
We propose a novel constraint on ground areas designed specifically for the self-supervised paradigm.
This mechanism not only recovers the scale accurately but also ensures coherence between the depth prediction and the ground prior.
arXiv Detail & Related papers (2024-09-23T09:30:27Z)
- Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think [53.2706196341054]
We show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed.
We perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models.
arXiv Detail & Related papers (2024-09-17T16:58:52Z)
- Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion [21.939618694037108]
Unsupervised monocular depth estimation has received widespread attention because of its capability to train without ground truth.
We employ a diffusion model, a class of generative network with stable convergence, for unsupervised monocular depth estimation.
This significantly enriches the model's capacity for learning and interpreting the depth distribution.
arXiv Detail & Related papers (2024-06-14T07:31:20Z)
- Self-Supervised Monocular Depth Estimation with Self-Reference Distillation and Disparity Offset Refinement [15.012694052674899]
We propose two novel ideas to improve self-supervised monocular depth estimation.
We use a parameter-optimized model, updated over the training epochs, as the teacher to provide additional supervision.
We leverage the contextual consistency between high-scale and low-scale features to obtain multiscale disparity offsets.
arXiv Detail & Related papers (2023-02-20T06:28:52Z)
- FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation [17.572459787107427]
We propose a flow distillation loss to replace the typical photometric loss and a prior-flow-based mask to remove invalid pixels.
Our approach achieves state-of-the-art results on both the KITTI and NYU-Depth-v2 datasets.
arXiv Detail & Related papers (2023-01-20T04:02:13Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption for training, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model to generate a single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network [90.20878165546361]
We propose a new unsupervised depth estimation method based on a pseudo-supervision mechanism.
It strategically integrates the advantages of supervised and unsupervised monocular depth estimation.
Our experimental results demonstrate that the proposed method outperforms the state-of-the-art on the KITTI benchmark.
arXiv Detail & Related papers (2021-10-22T01:08:36Z)
- High-Dimensional Bayesian Optimisation with Variational Autoencoders and Deep Metric Learning [119.91679702854499]
We introduce a method based on deep metric learning to perform Bayesian optimisation over high-dimensional, structured input spaces.
We achieve such an inductive bias using just 1% of the available labelled data.
As an empirical contribution, we present state-of-the-art results on real-world high-dimensional black-box optimisation problems.
arXiv Detail & Related papers (2021-06-07T13:35:47Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes from a sequence of images, using only self-supervision as the training signal.
We show that, by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and without relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- On the uncertainty of self-supervised monocular depth estimation [52.13311094743952]
Self-supervised paradigms for monocular depth estimation are very appealing since they do not require ground truth annotations at all.
We explore for the first time how to estimate the uncertainty for this task and how this affects depth accuracy.
We propose a novel technique specifically designed for self-supervised approaches.
arXiv Detail & Related papers (2020-05-13T09:00:55Z)
- Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume [19.785343302320918]
We propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction.
We show that extending the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas yields a model that produces the best results in the field on KITTI 2015 and Make3D.
arXiv Detail & Related papers (2020-03-31T04:48:16Z)
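The last entry's discrete disparity prediction is commonly implemented as a disparity volume: the network outputs per-pixel logits over a fixed set of disparity bins, and the final disparity is the probability-weighted mean (soft-argmax). The sketch below assumes that standard formulation; the bin count and disparity range are placeholders, not values from the paper.

```python
# Hedged sketch of a discrete disparity volume head (soft-argmax over bins).
# Bin count and disparity range are placeholder assumptions.
import torch
import torch.nn.functional as F

def disparity_from_volume(logits, min_disp=0.01, max_disp=10.0):
    """logits: (B, K, H, W) raw scores over K disparity bins.
    Returns the expected disparity per pixel, shape (B, H, W)."""
    k = logits.shape[1]
    bins = torch.linspace(min_disp, max_disp, k, device=logits.device)
    probs = F.softmax(logits, dim=1)                   # per-pixel distribution
    return (probs * bins.view(1, k, 1, 1)).sum(dim=1)  # expectation over bins
```

Predicting a distribution over a discrete set of disparity hypotheses keeps the output differentiable and naturally bounded to the chosen range.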