MAL: Motion-Aware Loss with Temporal and Distillation Hints for
Self-Supervised Depth Estimation
- URL: http://arxiv.org/abs/2402.11507v1
- Date: Sun, 18 Feb 2024 08:34:15 GMT
- Authors: Yue-Jiang Dong, Fang-Lue Zhang, Song-Hai Zhang
- Abstract summary: Motion-Aware Loss is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods.
MAL leads to a reduction in depth estimation errors by up to 4.2% and 10.8% on KITTI and CityScapes benchmarks, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth perception is crucial for a wide range of robotic applications.
Multi-frame self-supervised depth estimation methods have gained research
interest due to their ability to leverage large-scale, unlabeled real-world
data. However, the self-supervised methods often rely on the assumption of a
static scene and their performance tends to degrade in dynamic environments. To
address this issue, we present Motion-Aware Loss, which leverages the temporal
relation among consecutive input frames and a novel distillation scheme between
the teacher and student networks in the multi-frame self-supervised depth
estimation methods. Specifically, we associate the spatial locations of moving
objects with the temporal order of input frames to eliminate errors induced by
object motion. Meanwhile, we enhance the original distillation scheme in
multi-frame methods to better exploit the knowledge from a teacher network. MAL
is a novel, plug-and-play module designed for seamless integration into
multi-frame self-supervised monocular depth estimation methods. Adding MAL into
previous state-of-the-art methods leads to a reduction in depth estimation
errors by up to 4.2% and 10.8% on KITTI and CityScapes benchmarks,
respectively.
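The distillation hint described in the abstract — a teacher network guiding the student in regions where the teacher is more reliable — can be sketched as a masked per-pixel loss. The exact MAL formulation is not given in this summary; the function name, the log-depth difference, and the reliability mask below are illustrative assumptions, not the paper's actual rule.

```python
import numpy as np

def masked_distillation_loss(student_depth, teacher_depth,
                             student_err, teacher_err):
    """Illustrative teacher-student distillation term for self-supervised
    depth. The student imitates the teacher's depth only at pixels where
    the teacher's photometric reprojection error is lower, i.e. where the
    teacher is presumed more reliable. All inputs are (H, W) arrays."""
    # Reliability mask: 1 where the teacher beats the student photometrically.
    mask = (teacher_err < student_err).astype(np.float32)
    # Scale-robust depth discrepancy in log space.
    diff = np.abs(np.log(student_depth) - np.log(teacher_depth))
    denom = mask.sum()
    return float((mask * diff).sum() / denom) if denom > 0 else 0.0
```

In multi-frame pipelines this kind of masked term is typically added to the photometric loss with a small weight, so the teacher only supervises pixels (e.g. on moving objects) where multi-frame matching fails.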
Related papers
- Continual Learning of Unsupervised Monocular Depth from Videos [19.43053045216986]
We introduce a framework that captures the challenges of continual unsupervised depth estimation (CUDE).
We propose a rehearsal-based dual-memory method, MonoDepthCL, which utilizes spatiotemporal consistency for continual learning in depth estimation.
arXiv Detail & Related papers (2023-11-04T12:36:07Z)
- Sparse Depth-Guided Attention for Accurate Depth Completion: A Stereo-Assisted Monitored Distillation Approach [7.902840502973506]
We introduce a stereo-based model as a teacher model to improve the accuracy of the student model for depth completion.
To provide self-supervised information, we also employ multi-view depth consistency and multi-scale minimum reprojection.
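The "minimum reprojection" mentioned above is the per-pixel minimum of photometric errors over the available source frames, a standard trick for discounting occlusions in self-supervised depth. A minimal sketch, assuming precomputed per-source error maps (the multi-scale part is omitted):

```python
import numpy as np

def min_reprojection_loss(errors):
    """Per-pixel minimum reprojection loss. `errors` holds the photometric
    error of the synthesized target image against each source frame, with
    shape (num_sources, H, W). Taking the minimum over sources discounts
    pixels that are occluded or out of view in some of the frames."""
    errors = np.asarray(errors, dtype=np.float32)
    per_pixel_min = errors.min(axis=0)  # keep the best-matching source per pixel
    return float(per_pixel_min.mean())
```

In practice the error maps combine SSIM and L1 terms and the loss is averaged over scales; the simple mean over a raw error stack here is only meant to show the per-pixel minimum.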
arXiv Detail & Related papers (2023-03-28T09:23:19Z)
- Dyna-DepthFormer: Multi-frame Transformer for Self-Supervised Depth Estimation in Dynamic Scenes [19.810725397641406]
We propose a novel Dyna-Depthformer framework, which predicts scene depth and 3D motion field jointly.
Our contributions are two-fold. First, we leverage multi-view correlation through a series of self- and cross-attention layers in order to obtain enhanced depth feature representation.
Second, we propose a warping-based Motion Network to estimate the motion field of dynamic objects without using semantic prior.
arXiv Detail & Related papers (2023-01-14T09:43:23Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption for training networks; however, that assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer [53.413305467674434]
We introduce open-source RGB data to support spike depth estimation, leveraging its annotations and spatial information.
We propose a cross-modality cross-domain (BiCross) framework to realize unsupervised spike depth estimation.
Our method achieves state-of-the-art (SOTA) performances, compared with RGB-oriented unsupervised depth estimation methods.
arXiv Detail & Related papers (2022-08-26T09:35:20Z)
- Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics [74.1720528573331]
Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years.
We propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics.
We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2022-07-11T07:50:22Z)
- Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth [37.021579239596164]
Existing dynamic-object-focused methods only partially solved the mismatch problem at the training loss level.
We propose a novel multi-frame monocular depth prediction method to solve these problems at both the prediction and supervision loss levels.
Our method, called DynamicDepth, is a new framework trained via a self-supervised cycle consistent learning scheme.
arXiv Detail & Related papers (2022-03-29T01:36:11Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Self-Supervised Joint Learning Framework of Depth Estimation via Implicit Cues [24.743099160992937]
We propose a novel self-supervised joint learning framework for depth estimation.
The proposed framework outperforms the state-of-the-art (SOTA) on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2020-06-17T13:56:59Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.