GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth
- URL: http://arxiv.org/abs/2409.14850v1
- Date: Mon, 23 Sep 2024 09:30:27 GMT
- Title: GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth
- Authors: Aurélien Cecille, Stefan Duffner, Franck Davoine, Thibault Neveu, Rémi Agier
- Abstract summary: We propose a novel constraint on ground areas designed specifically for the self-supervised paradigm.
This mechanism not only allows the scale to be recovered accurately but also ensures coherence between the depth prediction and the ground prior.
- Score: 2.805351469151152
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Monocular depth estimation has greatly improved in recent years, but models predicting metric depth still struggle to generalize across diverse camera poses and datasets. While recent supervised methods mitigate this issue by leveraging ground prior information at inference, their adaptability to self-supervised settings is limited due to the additional challenge of scale recovery. Addressing this gap, we propose in this paper a novel constraint on ground areas designed specifically for the self-supervised paradigm. This mechanism not only allows the scale to be recovered accurately but also ensures coherence between the depth prediction and the ground prior. Experimental results show that our method surpasses existing scale recovery techniques on the KITTI benchmark and significantly enhances model generalization capabilities. This improvement is reflected in more robust performance across diverse camera rotations and adaptability in zero-shot conditions to previously unseen driving datasets such as DDAD.
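The abstract describes the method only at a high level. As a rough illustration of the two ingredients such a ground constraint needs, the numpy sketch below builds an analytic ground-depth map from a known camera height and pitch under a flat-ground assumption, and then scores a prediction against that prior on ground pixels. The coordinate conventions, function names, and the log-difference penalty are assumptions for illustration, not the authors' implementation; in the paper the constraint is applied during self-supervised training so that predictions come out metric, whereas this sketch only shows the geometric prior it relies on.

```python
# Minimal sketch (not the authors' code): a flat-ground depth prior from known
# camera height/pitch, plus an illustrative constraint on ground pixels.
import numpy as np

def ground_depth_from_plane(K, height, pitch, H, W):
    """Analytic depth of a flat ground plane for every pixel.

    Conventions (assumed): x right, y down, z forward; `height` is the camera
    height above the ground in metres; `pitch` (radians) rotates rays about x.
    Pixels whose ray never reaches the ground get NaN.
    """
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    rays = np.linalg.inv(K) @ pix
    c, s = np.cos(pitch), np.sin(pitch)
    rays = np.array([[1, 0, 0], [0, c, -s], [0, s, c]]) @ rays  # undo camera pitch
    # Ray/plane intersection: t * ray_y = height (ground lies `height` below camera).
    t = np.where(rays[1] > 1e-6, height / np.maximum(rays[1], 1e-6), np.nan)
    return (t * rays[2]).reshape(H, W)  # depth along the optical axis

def ground_constraint(pred_depth, ground_prior, ground_mask):
    """Illustrative penalty: log-depth discrepancy on pixels labelled as ground."""
    d = np.abs(np.log(pred_depth[ground_mask]) - np.log(ground_prior[ground_mask]))
    return np.nanmean(d)
```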
Related papers
- TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs [5.6168844664788855]
This work presents TanDepth, a practical, online scale recovery method for obtaining metric depth results from relative depth estimates at inference time.
Tailored for Unmanned Aerial Vehicle (UAV) applications, our method leverages sparse measurements from Global Digital Elevation Models (GDEM) by projecting them to the camera view.
An adaptation of the Cloth Simulation Filter is presented, which selects ground points from the estimated depth map and correlates them with the projected reference points.
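For a concrete picture of the inference-time scale recovery described above, the sketch below aligns a relative depth map to sparse metric reference points (e.g., GDEM samples projected into the image). The robust median ratio stands in for TanDepth's actual correlation step, the Cloth Simulation Filter ground selection is omitted, and all names are illustrative.

```python
# Illustrative only: scale a relative depth map using sparse metric references
# sampled at known pixel locations. Not TanDepth's actual procedure.
import numpy as np

def scale_from_sparse_references(rel_depth, ref_uv, ref_depth):
    """rel_depth: (H, W) relative depth; ref_uv: (N, 2) integer pixel coords (u, v);
    ref_depth: (N,) metric depths of the projected reference points."""
    u, v = ref_uv[:, 0], ref_uv[:, 1]
    sampled = rel_depth[v, u]                      # relative depth at reference pixels
    valid = (sampled > 0) & np.isfinite(ref_depth)
    scale = np.median(ref_depth[valid] / sampled[valid])
    return scale * rel_depth                       # metric depth estimate
```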
arXiv Detail & Related papers (2024-09-08T15:54:43Z) - Advancing Cross-Domain Generalizability in Face Anti-Spoofing: Insights, Design, and Metrics [10.631157315662607]
This paper presents a novel perspective for enhancing anti-spoofing performance in zero-shot data domain generalization.
Going one step beyond previous frame-wise spoofing prediction, we introduce a nuanced metric calculation that aggregates frame-level probabilities into a video-wise prediction.
Our final model outperforms existing state-of-the-art methods across the datasets.
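The summary mentions aggregating frame-level probabilities into a video-wise prediction but does not spell out the rule; the tiny sketch below uses a plain mean with a decision threshold purely as a placeholder for whatever aggregation the paper defines.

```python
# Placeholder aggregation: mean of per-frame spoof probabilities, then threshold.
import numpy as np

def video_prediction(frame_probs, threshold=0.5):
    """frame_probs: 1-D array of per-frame spoof probabilities in [0, 1]."""
    score = float(np.mean(frame_probs))
    return score, score >= threshold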
arXiv Detail & Related papers (2024-06-18T04:15:22Z) - Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
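To make "scale-and-shift coefficients" concrete, here is the generic closed-form least-squares alignment between a predicted and a reference depth map. It is the standard alignment used when evaluating relative depth and is shown only under that assumption, not as the paper's actual loss.

```python
# Generic least-squares scale-and-shift alignment (illustration, not the paper's loss).
import numpy as np

def align_scale_shift(pred, target, mask):
    """Closed-form (s, t) minimising || s * pred + t - target ||^2 over masked pixels."""
    x, y = pred[mask].ravel(), target[mask].ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t
```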
arXiv Detail & Related papers (2023-09-18T12:36:39Z) - Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation [10.620856690388376]
We show that the potential capacity of self-supervised monocular depth estimation can be excavated without increasing computational cost.
Our contributions can bring significant performance improvement to the baseline with even less computational overhead.
Our model, named EPCDepth, surpasses the previous state-of-the-art methods even those supervised by additional constraints.
arXiv Detail & Related papers (2021-09-26T03:40:56Z) - Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that, by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and without relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z) - DESC: Domain Adaptation for Depth Estimation via Semantic Consistency [24.13837264978472]
We propose a domain adaptation approach to train a monocular depth estimation model.
We bridge the domain gap by leveraging semantic predictions and low-level edge features.
Our approach is evaluated on standard domain adaptation benchmarks for monocular depth estimation.
arXiv Detail & Related papers (2020-09-03T10:54:05Z) - Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume [19.785343302320918]
We propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction.
We show that extending the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas allows us to design a model that produces the best results in the field on KITTI 2015 and Make3D.
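As a sketch of what discrete disparity prediction means in practice: the network emits per-pixel scores over K candidate disparities, and the continuous disparity is their softmax expectation. The bin range and names below are assumptions for illustration, not values taken from the paper.

```python
# Soft-argmax over a discrete disparity volume (illustrative bin range and names).
import numpy as np

def disparity_from_logits(logits, d_min=0.01, d_max=0.3):
    """logits: (K, H, W) per-pixel scores over K discrete disparity candidates."""
    K = logits.shape[0]
    bins = np.linspace(d_min, d_max, K).reshape(K, 1, 1)       # candidate disparities
    probs = np.exp(logits - logits.max(axis=0, keepdims=True))
    probs /= probs.sum(axis=0, keepdims=True)                  # softmax over bins
    return (probs * bins).sum(axis=0)                          # expected disparity map
```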
arXiv Detail & Related papers (2020-03-31T04:48:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.