Continual Learning of Unsupervised Monocular Depth from Videos
- URL: http://arxiv.org/abs/2311.02393v1
- Date: Sat, 4 Nov 2023 12:36:07 GMT
- Title: Continual Learning of Unsupervised Monocular Depth from Videos
- Authors: Hemang Chawla, Arnav Varma, Elahe Arani, and Bahram Zonooz
- Abstract summary: We introduce a framework that captures the challenges of continual unsupervised depth estimation (CUDE).
We propose a rehearsal-based dual-memory method, MonoDepthCL, which utilizes spatiotemporal consistency for continual learning in depth estimation.
- Score: 19.43053045216986
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Spatial scene understanding, including monocular depth estimation, is an
important problem in various applications, such as robotics and autonomous
driving. While improvements in unsupervised monocular depth estimation have
potentially allowed models to be trained on diverse crowdsourced videos, this
remains underexplored as most methods utilize the standard training protocol,
wherein the models are trained from scratch on all data after new data is
collected. Instead, continual training of models on sequentially collected data
would significantly reduce computational and memory costs. Nevertheless, naive
continual training leads to catastrophic forgetting, where the model
performance deteriorates on older domains as it learns on newer domains,
highlighting the trade-off between model stability and plasticity. While
several techniques have been proposed to address this issue in image
classification, the high-dimensional and spatiotemporally correlated outputs of
depth estimation make it a distinct challenge. To the best of our knowledge, no
framework or method currently exists focusing on the problem of continual
learning in depth estimation. Thus, we introduce a framework that captures the
challenges of continual unsupervised depth estimation (CUDE), and define the
necessary metrics to evaluate model performance. We propose a rehearsal-based
dual-memory method, MonoDepthCL, which utilizes spatiotemporal consistency for
continual learning in depth estimation, even when the camera intrinsics are
unknown.
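The abstract does not include code, but the rehearsal idea it describes is easy to picture. Below is a minimal, illustrative PyTorch-style sketch of rehearsal-based continual training with a reservoir replay buffer plus a slowly updated "stable" model copy, one common way to realize a dual memory. The class names, EMA decay, and loss interface are assumptions for illustration, not the authors' MonoDepthCL implementation.

```python
import random
import torch

class ReservoirBuffer:
    """Fixed-size rehearsal memory filled by reservoir sampling."""
    def __init__(self, capacity):
        self.capacity, self.data, self.seen = capacity, [], 0

    def add(self, sample):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            idx = random.randrange(self.seen)  # keep each sample w.p. capacity/seen
            if idx < self.capacity:
                self.data[idx] = sample

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

def continual_step(model, stable_model, opt, loss_fn, new_samples, buffer,
                   replay=4, ema_decay=0.999):
    """One update on incoming video snippets mixed with rehearsed ones."""
    batch = list(new_samples) + buffer.sample(replay)
    opt.zero_grad()
    loss = torch.stack([loss_fn(model, s) for s in batch]).mean()
    loss.backward()
    opt.step()
    with torch.no_grad():  # second, slowly updated "stable" memory
        for p_s, p in zip(stable_model.parameters(), model.parameters()):
            p_s.mul_(ema_decay).add_(p, alpha=1 - ema_decay)
    for s in new_samples:  # store current samples for future rehearsal
        buffer.add(s)
    return loss.item()
```

In the unsupervised depth setting, each stored sample would be a short video snippet rather than a labeled image, so the self-supervised photometric loss can be recomputed on rehearsed data.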
Related papers
- UnCLe: Unsupervised Continual Learning of Depth Completion [5.677777151863184]
UnCLe is a standardized benchmark for unsupervised continual learning of depth completion, a multimodal depth estimation task.
We benchmark depth completion models under the practical scenario of unsupervised learning over continuous streams of data.
arXiv Detail & Related papers (2024-10-23T17:56:33Z)
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial requirement for Machine Learning models in real-world applications is the ability to continually learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation [33.140210057065644]
This paper introduces a novel approach named Stealing Stable Diffusion (SSD) prior for robust monocular depth estimation.
The approach addresses the limited robustness of existing methods by utilizing stable diffusion to generate synthetic images that mimic challenging conditions.
The effectiveness of the approach is evaluated on nuScenes and Oxford RobotCar, two challenging public datasets.
arXiv Detail & Related papers (2024-03-08T05:06:31Z)
- MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation [21.32581390211547]
Motion-Aware Loss is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods.
MAL reduces depth estimation errors by up to 4.2% and 10.8% on the KITTI and Cityscapes benchmarks, respectively.
arXiv Detail & Related papers (2024-02-18T08:34:15Z)
- Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z)
- New metrics for analyzing continual learners [27.868967961503962]
Continual Learning (CL) poses challenges to standard learning algorithms, most notably the stability-plasticity dilemma.
This dilemma remains central to CL, and multiple metrics have been proposed to measure stability and plasticity separately.
We propose new metrics that account for the task's increasing difficulty.
arXiv Detail & Related papers (2023-09-01T13:53:33Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption for training networks; however, this assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model to generate a single-image depth prior.
Our model can predict sharp and accurate depth maps, even when trained on monocular videos of highly dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Learn to Adapt for Monocular Depth Estimation [17.887575611570394]
We propose an adversarial depth estimation task and train the model in a meta-learning pipeline.
Our method adapts well to new datasets after a few training steps during the test procedure.
arXiv Detail & Related papers (2022-03-26T06:49:22Z)
- Occlusion-Aware Self-Supervised Monocular 6D Object Pose Estimation [88.8963330073454]
We propose a novel monocular 6D pose estimation approach by means of self-supervised learning.
We leverage current trends in noisy student training and differentiable rendering to further self-supervise the model.
Our proposed self-supervision outperforms all other methods relying on synthetic data.
arXiv Detail & Related papers (2022-03-19T15:12:06Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that predicts at-scale depth maps and egomotion; a minimal sketch of the kind of photometric spatio-temporal consistency objective underlying such methods follows this list.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
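Several entries above, like MonoDepthCL itself, hinge on the spatiotemporal consistency of video: predicted depth and relative camera pose should let an adjacent frame be warped into the current view, and the photometric error of that warp serves as the training signal. The sketch below shows this view-synthesis objective in a minimal form; the tensor shapes, helper names, and the plain L1 error are illustrative assumptions (practical methods add SSIM terms, occlusion masking, and multi-scale losses), and the intrinsics `K` can themselves be predicted when unknown.

```python
import torch
import torch.nn.functional as F

def photometric_consistency_loss(depth_t, pose_t2s, K, frame_t, frame_s):
    """Warp the source frame into the target view and compare photometrically.

    depth_t : (B, 1, H, W) predicted depth for the target frame
    pose_t2s: (B, 4, 4) predicted relative pose, target -> source
    K       : (B, 3, 3) camera intrinsics (possibly predicted when unknown)
    frame_t : (B, 3, H, W) target image
    frame_s : (B, 3, H, W) source (adjacent) image
    """
    B, _, H, W = depth_t.shape
    # Pixel grid in homogeneous coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=depth_t.dtype),
        torch.arange(W, dtype=depth_t.dtype),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0)
    pix = pix.reshape(1, 3, -1).expand(B, -1, -1)

    # Back-project to 3-D, transform into the source view, re-project.
    cam = torch.linalg.inv(K) @ pix * depth_t.reshape(B, 1, -1)
    cam_h = torch.cat([cam, torch.ones_like(cam[:, :1])], dim=1)
    src = (pose_t2s @ cam_h)[:, :3]
    src_pix = K @ src
    src_pix = src_pix[:, :2] / src_pix[:, 2:].clamp(min=1e-6)

    # Normalize to [-1, 1] for grid_sample and warp the source frame.
    u = 2 * src_pix[:, 0] / (W - 1) - 1
    v = 2 * src_pix[:, 1] / (H - 1) - 1
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    warped = F.grid_sample(frame_s, grid, align_corners=True,
                           padding_mode="border")

    # L1 photometric error; minimizing it trains depth and pose jointly.
    return (warped - frame_t).abs().mean()
```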