Related papers: Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

URL: http://arxiv.org/abs/2503.20211v1
Date: Wed, 26 Mar 2025 04:12:54 GMT
Title: Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors
Authors: Weilong Yan, Ming Li, Haipeng Li, Shuwei Shao, Robby T. Tan
Abstract summary: We present the first synthetic-to-real robust depth estimation framework, incorporating motion and structure priors to capture real-world knowledge effectively. We achieve improvements of 7.5% and 4.3% in AbsRel and RMSE on average for nuScenes and Robotcar datasets (daytime, nighttime, rain). In zero-shot evaluation of DrivingStereo (rain, fog), our method generalizes better than the previous ones.
Score: 22.831281986234988
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-supervised depth estimation from monocular cameras in diverse outdoor conditions, such as daytime, rain, and nighttime, is challenging due to the difficulty of learning universal representations and the severe lack of labeled real-world adverse data. Previous methods either rely on synthetic inputs and pseudo-depth labels or directly apply daytime strategies to adverse conditions, resulting in suboptimal results. In this paper, we present the first synthetic-to-real robust depth estimation framework, incorporating motion and structure priors to capture real-world knowledge effectively. In the synthetic adaptation, we transfer motion-structure knowledge inside cost volumes for better robust representation, using a frozen daytime model to train a depth estimator in synthetic adverse conditions. In the innovative real adaptation, which aims to close the synthetic-to-real gap, models trained earlier identify the weather-insensitive regions with a designed consistency-reweighting strategy to emphasize valid pseudo-labels. We introduce a new regularization by gathering explicit depth distributions to constrain the model when facing real-world data. Experiments show that our method outperforms the state-of-the-art across diverse conditions in multi-frame and single-frame evaluations. We achieve improvements of 7.5% and 4.3% in AbsRel and RMSE on average for nuScenes and Robotcar datasets (daytime, nighttime, rain). In zero-shot evaluation of DrivingStereo (rain, fog), our method generalizes better than the previous ones.

Related papers

SSSUMO: Real-Time Semi-Supervised Submovement Decomposition [0.6499759302108926]
Submovement analysis offers valuable insights into motor control. Existing methods struggle with reconstruction accuracy, computational cost, and validation. We address these challenges using a semi-supervised learning framework.
arXiv Detail & Related papers (2025-07-08T21:26:25Z)
Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining [73.5575992346396]
We propose a dual-branch-temporal state-space model to enhance rain streak removal in video sequences. To improve multi-frame feature fusion, we derive a dynamic filter stacking, which adaptively approximates statistical filters for pixel-wise feature refinement. To further explore the capacity of deraining models in supporting other vision-based tasks in rainy environments, we introduce a novel real-world benchmark.
arXiv Detail & Related papers (2025-05-22T15:50:00Z)
DepthFM: Fast Monocular Depth Estimation with Flow Matching [22.206355073676082]
Current discriminative depth estimation methods often produce blurry artifacts, while generative approaches suffer from slow sampling due to curvatures in the noise-to-depth transport. Our method addresses these challenges by framing depth estimation as a direct transport between image and depth distributions. Our approach achieves competitive zero-shot performance on standard benchmarks of complex natural scenes while improving sampling efficiency and only requiring minimal synthetic data for training.
arXiv Detail & Related papers (2024-03-20T17:51:53Z)
Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies. Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z)
A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies. Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance. Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions [42.99525455786019]
We propose WeatherDepth, a self-supervised robust depth estimation model with curriculum contrastive learning. The proposed solution is proven to be easily incorporated into various architectures and demonstrates state-of-the-art (SoTA) performance on both synthetic and real weather datasets.
arXiv Detail & Related papers (2023-10-09T09:26:27Z)
Robust Monocular Depth Estimation under Challenging Conditions [81.57697198031975]
State-of-the-art monocular depth estimation approaches are highly unreliable under challenging illumination and weather conditions. We tackle these safety-critical issues with md4all: a simple and effective solution that works reliably under both adverse and ideal conditions.
arXiv Detail & Related papers (2023-08-18T17:59:01Z)
PointNorm-Net: Self-Supervised Normal Prediction of 3D Point Clouds via Multi-Modal Distribution Estimation [29.582507073730913]
PointNorm-Net is the first self-supervised deep learning framework to tackle this challenge. Our method achieves superior generalization and outperforms state-of-the-art conventional and deep learning approaches across three real-world datasets.
arXiv Detail & Related papers (2023-04-10T22:11:13Z)
Vision in adverse weather: Augmentation using CycleGANs with various object detectors for robust perception in autonomous racing [70.16043883381677]
In autonomous racing, the weather can change abruptly, causing significant degradation in perception, resulting in ineffective manoeuvres. In order to improve detection in adverse weather, deep-learning-based models typically require extensive datasets captured in such conditions. We introduce an approach of using synthesised adverse condition datasets in autonomous racing (generated using CycleGAN) to improve the performance of four out of five state-of-the-art detectors.
arXiv Detail & Related papers (2022-01-10T10:02:40Z)
Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars. They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR).
arXiv Detail & Related papers (2021-07-14T21:10:47Z)
Semi-Supervised Video Deraining with Dynamic Rain Generator [59.71640025072209]
This paper proposes a new semi-supervised video deraining method, in which a dynamic rain generator is employed to fit the rain layer. Specifically, such dynamic generator consists of one emission model and one transition model to simultaneously encode the spatially physical structure and temporally continuous changes of rain streaks. Various prior formats are designed for the labeled synthetic and unlabeled real data, so as to fully exploit the common knowledge underlying them.
arXiv Detail & Related papers (2021-03-14T14:28:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.