Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors
- URL: http://arxiv.org/abs/2503.20211v1
- Date: Wed, 26 Mar 2025 04:12:54 GMT
- Title: Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors
- Authors: Weilong Yan, Ming Li, Haipeng Li, Shuwei Shao, Robby T. Tan,
- Abstract summary: We present the first synthetic-to-real robust depth estimation framework, incorporating motion and structure priors to capture real-world knowledge effectively.<n>We achieve improvements of 7.5% and 4.3% in AbsRel and RMSE on average for nuScenes and Robotcar datasets (daytime, nighttime, rain)<n>In zero-shot evaluation of DrivingStereo (rain, fog), our method generalizes better than the previous ones.
- Score: 22.831281986234988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised depth estimation from monocular cameras in diverse outdoor conditions, such as daytime, rain, and nighttime, is challenging due to the difficulty of learning universal representations and the severe lack of labeled real-world adverse data. Previous methods either rely on synthetic inputs and pseudo-depth labels or directly apply daytime strategies to adverse conditions, resulting in suboptimal results. In this paper, we present the first synthetic-to-real robust depth estimation framework, incorporating motion and structure priors to capture real-world knowledge effectively. In the synthetic adaptation, we transfer motion-structure knowledge inside cost volumes for better robust representation, using a frozen daytime model to train a depth estimator in synthetic adverse conditions. In the innovative real adaptation, which targets to fix synthetic-real gaps, models trained earlier identify the weather-insensitive regions with a designed consistency-reweighting strategy to emphasize valid pseudo-labels. We introduce a new regularization by gathering explicit depth distributions to constrain the model when facing real-world data. Experiments show that our method outperforms the state-of-the-art across diverse conditions in multi-frame and single-frame evaluations. We achieve improvements of 7.5% and 4.3% in AbsRel and RMSE on average for nuScenes and Robotcar datasets (daytime, nighttime, rain). In zero-shot evaluation of DrivingStereo (rain, fog), our method generalizes better than the previous ones.
Related papers
- DepthFM: Fast Monocular Depth Estimation with Flow Matching [22.206355073676082]
Current discriminative depth estimation methods often produce blurry artifacts, while generative approaches suffer from slow sampling due to curvatures in the noise-to-depth transport.<n>Our method addresses these challenges by framing depth estimation as a direct transport between image and depth distributions.<n>Our approach achieves competitive zero-shot performance on standard benchmarks of complex natural scenes while improving sampling efficiency and only requiring minimal synthetic data for training.
arXiv Detail & Related papers (2024-03-20T17:51:53Z) - Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing textit1st place in the transfer learning leaderboard of the textitWeather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z) - A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z) - WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions [42.99525455786019]
We propose WeatherDepth, a self-supervised robust depth estimation model with curriculum contrastive learning.
The proposed solution is proven to be easily incorporated into various architectures and demonstrates state-of-the-art (SoTA) performance on both synthetic and real weather datasets.
arXiv Detail & Related papers (2023-10-09T09:26:27Z) - Robust Monocular Depth Estimation under Challenging Conditions [81.57697198031975]
State-of-the-art monocular depth estimation approaches are highly unreliable under challenging illumination and weather conditions.
We tackle these safety-critical issues with md4all: a simple and effective solution that works reliably under both adverse and ideal conditions.
arXiv Detail & Related papers (2023-08-18T17:59:01Z) - PointNorm-Net: Self-Supervised Normal Prediction of 3D Point Clouds via Multi-Modal Distribution Estimation [29.582507073730913]
PointNorm-Net is the first self-supervised deep learning framework to tackle this challenge.
Our method achieves superior generalization and outperforms state-of-the-art conventional and deep learning approaches across three real-world datasets.
arXiv Detail & Related papers (2023-04-10T22:11:13Z) - Vision in adverse weather: Augmentation using CycleGANs with various
object detectors for robust perception in autonomous racing [70.16043883381677]
In autonomous racing, the weather can change abruptly, causing significant degradation in perception, resulting in ineffective manoeuvres.
In order to improve detection in adverse weather, deep-learning-based models typically require extensive datasets captured in such conditions.
We introduce an approach of using synthesised adverse condition datasets in autonomous racing (generated using CycleGAN) to improve the performance of four out of five state-of-the-art detectors.
arXiv Detail & Related papers (2022-01-10T10:02:40Z) - Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of
Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars.
They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR)
arXiv Detail & Related papers (2021-07-14T21:10:47Z) - Semi-Supervised Video Deraining with Dynamic Rain Generator [59.71640025072209]
This paper proposes a new semi-supervised video deraining method, in which a dynamic rain generator is employed to fit the rain layer.
Specifically, such dynamic generator consists of one emission model and one transition model to simultaneously encode the spatially physical structure and temporally continuous changes of rain streaks.
Various prior formats are designed for the labeled synthetic and unlabeled real data, so as to fully exploit the common knowledge underlying them.
arXiv Detail & Related papers (2021-03-14T14:28:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.