NeRF-Supervised Deep Stereo
- URL: http://arxiv.org/abs/2303.17603v1
- Date: Thu, 30 Mar 2023 17:59:58 GMT
- Title: NeRF-Supervised Deep Stereo
- Authors: Fabio Tosi, Alessio Tonioni, Daniele De Gregorio, Matteo Poggi
- Abstract summary: We introduce a novel framework for training deep stereo networks effortlessly and without any ground-truth.
By leveraging state-of-the-art neural rendering solutions, we generate stereo training data from image sequences collected with a single handheld camera.
On top of them, a NeRF-supervised training procedure is carried out, from which we exploit rendered stereo triplets to compensate for occlusions and depth maps as proxy labels.
- Score: 33.54504171850584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel framework for training deep stereo networks effortlessly
and without any ground-truth. By leveraging state-of-the-art neural rendering
solutions, we generate stereo training data from image sequences collected with
a single handheld camera. On top of them, a NeRF-supervised training procedure
is carried out, from which we exploit rendered stereo triplets to compensate
for occlusions and depth maps as proxy labels. This results in stereo networks
capable of predicting sharp and detailed disparity maps. Experimental results
show that models trained under this regime yield a 30-40% improvement over
existing self-supervised methods on the challenging Middlebury dataset, filling
the gap to supervised models and, most times, outperforming them at zero-shot
generalization.
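The abstract describes two supervision signals distilled from the neural-rendering stage: photometric consistency across rendered views (triplets, to handle occlusions) and rendered depth used as a proxy disparity label. The PyTorch sketch below is only a rough illustration of how such terms could be combined, not the authors' exact formulation: it warps a single rendered pair rather than a triplet, and the loss weights, confidence threshold, and warping helper are assumptions.

```python
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp):
    """Backward-warp the right image into the left view with a left-view disparity map.
    right: (B, 3, H, W), disp: (B, 1, H, W) in pixels (non-negative)."""
    b, _, h, w = right.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=right.device, dtype=right.dtype),
        torch.arange(w, device=right.device, dtype=right.dtype),
        indexing="ij",
    )
    # Shift x-coordinates left by the disparity, then normalise to [-1, 1] for grid_sample.
    x_src = xs.unsqueeze(0) - disp.squeeze(1)
    grid = torch.stack(
        (2.0 * x_src / (w - 1) - 1.0,
         2.0 * ys.unsqueeze(0).expand_as(x_src) / (h - 1) - 1.0),
        dim=-1,
    )
    return F.grid_sample(right, grid, mode="bilinear", padding_mode="border", align_corners=True)

def nerf_supervised_loss(pred_disp, left, right, proxy_disp, confidence,
                         photo_w=1.0, proxy_w=0.1, conf_thresh=0.5):
    """Hypothetical combination of an image-reconstruction term and a proxy-disparity term.
    proxy_disp and confidence are assumed to come from the neural-rendering stage."""
    recon = warp_right_to_left(right, pred_disp)
    photo_loss = (recon - left).abs().mean()
    mask = (confidence > conf_thresh).float()
    proxy_loss = (mask * (pred_disp - proxy_disp).abs()).sum() / mask.sum().clamp(min=1.0)
    return photo_w * photo_loss + proxy_w * proxy_loss
```

The `confidence` input stands in for whatever rendering-quality signal is used to decide where the proxy depth is trustworthy; in this sketch it simply gates the proxy term per pixel.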
Related papers
- Self-Assessed Generation: Trustworthy Label Generation for Optical Flow and Stereo Matching in Real-world [24.251352190100135]
We propose a unified self-supervised generalization framework for optical flow and stereo tasks: Self-Assessed Generation (SAG).
Unlike previous self-supervised methods, SAG is data-driven, using advanced reconstruction techniques to construct a reconstruction field from RGB images and generate datasets based on it.
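The summary leaves the self-assessment criterion unspecified; purely as an illustration of the general idea of keeping generated labels only where a self-computed quality score is high, the sketch below masks pseudo-labels by a photometric test. The error metric and threshold are placeholders, not SAG's actual mechanism.

```python
import torch

def self_assessed_mask(left, reconstruction, err_thresh=0.1):
    """Per-pixel mask marking pseudo-labels whose reconstruction looks trustworthy.
    The photometric criterion and threshold are assumptions, not SAG's actual test."""
    photometric_err = (left - reconstruction).abs().mean(dim=1, keepdim=True)  # (B,1,H,W)
    return photometric_err < err_thresh

if __name__ == "__main__":
    left = torch.rand(1, 3, 64, 64)
    reconstruction = left + 0.05 * torch.randn_like(left)  # stand-in for a rendered view
    mask = self_assessed_mask(left, reconstruction)
    print("kept fraction of pixels:", mask.float().mean().item())
```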
arXiv Detail & Related papers (2024-10-14T12:46:17Z)
- MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling [18.02254687807291]
Although Transformer-based stereo models have been studied recently, their performance still lags behind CNN-based stereo models due to the inherent data scarcity of the stereo matching task.
We propose the Masked Image Modeling Distilled Stereo matching model, termed MaDis-Stereo, which enhances locality inductive bias by leveraging Masked Image Modeling (MIM) when training a Transformer-based stereo model.
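As a minimal sketch of the masked-image-modelling ingredient only (not MaDis-Stereo's architecture, masking schedule, or distillation target), the snippet below corrupts random patches and penalises reconstruction on the masked regions; the patch size, mask ratio, and choice of target are assumptions.

```python
import torch

def random_patch_mask(images, patch=16, mask_ratio=0.4):
    """Zero out a random subset of non-overlapping patches (MIM-style corruption).
    Assumes H and W are divisible by `patch`; settings are illustrative."""
    b, c, h, w = images.shape
    keep = torch.rand(b, 1, h // patch, w // patch, device=images.device) > mask_ratio
    mask = keep.repeat_interleave(patch, dim=2).repeat_interleave(patch, dim=3).float()
    return images * mask, mask

def mim_reconstruction_loss(student_recon, target, mask):
    """Reconstruction penalised only on masked regions; the target could be raw pixels
    or a frozen teacher's prediction, depending on the distillation setup."""
    masked = 1.0 - mask
    return ((student_recon - target).abs() * masked).sum() / masked.sum().clamp(min=1.0)
```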
arXiv Detail & Related papers (2024-09-04T16:17:45Z)
- Deceptive-NeRF/3DGS: Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction [60.52716381465063]
We introduce Deceptive-NeRF/3DGS to enhance sparse-view reconstruction with only a limited set of input images.
Specifically, we propose a deceptive diffusion model that turns noisy images rendered from few-view reconstructions into high-quality pseudo-observations.
Our system progressively incorporates diffusion-generated pseudo-observations into the training image sets, ultimately densifying the sparse input observations by 5 to 10 times.
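The control flow described above, fitting a sparse reconstruction, rendering extra views, cleaning them with a diffusion model, and folding them back into the training set, can be sketched as follows. Every callable passed in is a placeholder, not an API from the paper or from any library.

```python
def progressive_densification(train_images, fit_reconstruction, render_novel_views,
                              diffusion_enhance, rounds=3, views_per_round=None):
    """Illustrative loop only; the callables are hypothetical stand-ins:
    fit a sparse-view reconstruction, render extra (noisy) views, clean them with a
    diffusion model, and add them back as pseudo-observations."""
    images = list(train_images)
    if views_per_round is None:
        views_per_round = 2 * len(images)  # grow the input set several-fold overall
    for _ in range(rounds):
        model = fit_reconstruction(images)                 # e.g. a NeRF or 3DGS fit
        noisy_views = render_novel_views(model, views_per_round)
        pseudo_obs = [diffusion_enhance(view) for view in noisy_views]
        images.extend(pseudo_obs)                          # progressively densify training data
    return fit_reconstruction(images)
```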
arXiv Detail & Related papers (2023-05-24T14:00:32Z)
- Stereo Matching by Self-supervision of Multiscopic Vision [65.38359887232025]
We propose a new self-supervised framework for stereo matching utilizing multiple images captured at aligned camera positions.
A cross photometric loss, an uncertainty-aware mutual-supervision loss, and a new smoothness loss are introduced to optimize the network.
Our model obtains better disparity maps than previous unsupervised methods on the KITTI dataset.
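A loose sketch of how three such terms might be combined into a single objective; the concrete definitions, especially of the uncertainty-aware mutual-supervision term, and the weights are stand-ins rather than the paper's losses.

```python
import torch

def edge_aware_smoothness(disp, image):
    """Standard edge-aware smoothness: penalise disparity gradients less near image edges."""
    dx_d = (disp[:, :, :, 1:] - disp[:, :, :, :-1]).abs()
    dy_d = (disp[:, :, 1:, :] - disp[:, :, :-1, :]).abs()
    dx_i = (image[:, :, :, 1:] - image[:, :, :, :-1]).abs().mean(1, keepdim=True)
    dy_i = (image[:, :, 1:, :] - image[:, :, :-1, :]).abs().mean(1, keepdim=True)
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()

def multiscopic_style_loss(pred_a, pred_b, warped_pairs, image,
                           w_photo=1.0, w_mutual=0.5, w_smooth=0.1):
    """Weighted sum of (i) photometric error over warped view pairs, (ii) an
    uncertainty-weighted agreement term between two predictions, (iii) smoothness."""
    photo = sum((a - b).abs().mean() for a, b in warped_pairs) / max(len(warped_pairs), 1)
    uncertainty = (pred_a - pred_b).abs().detach()          # crude per-pixel confidence proxy
    mutual = ((pred_a - pred_b).abs() * torch.exp(-uncertainty)).mean()
    smooth = edge_aware_smoothness(pred_a, image)
    return w_photo * photo + w_mutual * mutual + w_smooth * smooth
```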
arXiv Detail & Related papers (2021-04-09T02:58:59Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
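The entry gives no implementation detail, but frameworks of this kind typically minimise a photometric loss over view synthesis driven by predicted depth, egomotion, and, in this case, intrinsics. The generic reprojection warp below illustrates that synthesis step only; nothing in it is taken from the paper.

```python
import torch
import torch.nn.functional as F

def reproject(depth, pose, K, src_image):
    """Warp a source frame into the target view from target depth, relative pose, intrinsics.
    depth: (B,1,H,W), pose: (B,4,4) target->source, K: (B,3,3), src_image: (B,3,H,W)."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=depth.dtype, device=depth.device),
                            torch.arange(w, dtype=depth.dtype, device=depth.device),
                            indexing="ij")
    pix = torch.stack((xs, ys, torch.ones_like(xs)), dim=0).reshape(1, 3, -1).expand(b, -1, -1)
    cam = torch.linalg.inv(K) @ pix * depth.reshape(b, 1, -1)        # back-project to 3D
    cam_h = torch.cat((cam, torch.ones(b, 1, h * w, dtype=depth.dtype, device=depth.device)), dim=1)
    src_cam = (pose @ cam_h)[:, :3]                                  # move into the source frame
    src_pix = K @ src_cam
    src_pix = src_pix[:, :2] / src_pix[:, 2:3].clamp(min=1e-6)       # perspective divide
    grid = torch.stack((2.0 * src_pix[:, 0] / (w - 1) - 1.0,
                        2.0 * src_pix[:, 1] / (h - 1) - 1.0), dim=-1).reshape(b, h, w, 2)
    return F.grid_sample(src_image, grid, mode="bilinear", padding_mode="border", align_corners=True)
```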
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation [51.714092199995044]
In many fields, self-supervised learning solutions are rapidly evolving and filling the gap with supervised approaches.
We propose a novel self-supervised paradigm reversing the link between the two.
In order to train deep stereo networks, we distill knowledge through a monocular completion network.
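A minimal sketch of the distillation idea, assuming a frozen monocular (completion) network that supplies proxy disparities for unlabeled stereo pairs; the plain L1 objective and the function names are illustrative, and the paper's pipeline refines and filters the proxies rather than using them raw.

```python
import torch

def distillation_step(stereo_net, mono_proxy_net, left, right, optimizer):
    """One training step: a frozen monocular branch provides proxy disparities that
    supervise the stereo network (names and the L1 objective are placeholders)."""
    with torch.no_grad():
        proxy = mono_proxy_net(left)          # proxy disparity from the monocular branch
    pred = stereo_net(left, right)
    loss = (pred - proxy).abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```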
arXiv Detail & Related papers (2020-08-17T07:40:22Z)
- Learning Stereo from Single Images [41.32821954097483]
Supervised deep networks are among the best methods for finding correspondences in stereo image pairs.
We propose that it is unnecessary to have such a high reliance on ground truth depths or even corresponding stereo pairs.
Inspired by recent progress in monocular depth estimation, we generate plausible disparity maps from single images. In turn, we use those flawed disparity maps in a carefully designed pipeline to generate stereo training pairs.
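The core step, fabricating a right view by forward-warping the left image with a disparity map estimated from it, can be sketched as below. This deliberately naive version skips the collision handling, occlusion in-painting, and disparity sharpening that the paper's carefully designed pipeline addresses.

```python
import torch

def synthesize_right_view(left, disp):
    """Naively forward-warp the left image with a disparity map to fabricate a right view:
    each pixel moves `disp` pixels to the left; on collisions the last writer wins and
    disoccluded regions stay black. left: (B,3,H,W), disp: (B,1,H,W), non-negative pixels."""
    b, c, h, w = left.shape
    xs = torch.arange(w, device=left.device).view(1, 1, 1, w).expand(b, 1, h, w)
    new_x = (xs - disp.round().long()).clamp(0, w - 1)      # target column for each pixel
    right = torch.zeros_like(left)
    right.scatter_(3, new_x.expand(b, c, h, w), left)
    return right
```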
arXiv Detail & Related papers (2020-08-04T12:22:21Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
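As an illustration of removing relative rotation between frames, the snippet below applies the textbook rotation-compensating homography K R^T K^-1; the rotation convention is an assumption, and the paper's actual pre-processing and rotation estimation are not reproduced here.

```python
import torch
import torch.nn.functional as F

def remove_rotation(image, R, K):
    """Warp an image by K @ R^T @ K^-1 so that a pure camera rotation R is cancelled and
    consecutive frames look translation-only. Assumes R maps reference-camera coordinates
    to this frame's camera coordinates and that the rotation is modest.
    image: (B,3,H,W), R: (B,3,3), K: (B,3,3)."""
    b, _, h, w = image.shape
    H_mat = K @ R.transpose(1, 2) @ torch.linalg.inv(K)      # rotation-cancelling homography
    ys, xs = torch.meshgrid(torch.arange(h, dtype=image.dtype, device=image.device),
                            torch.arange(w, dtype=image.dtype, device=image.device),
                            indexing="ij")
    pix = torch.stack((xs, ys, torch.ones_like(xs)), dim=0).reshape(1, 3, -1).expand(b, -1, -1)
    # Backward mapping: for each output pixel, find where to sample in the input image.
    src = torch.linalg.inv(H_mat) @ pix
    src = src[:, :2] / src[:, 2:3].clamp(min=1e-6)
    grid = torch.stack((2.0 * src[:, 0] / (w - 1) - 1.0,
                        2.0 * src[:, 1] / (h - 1) - 1.0), dim=-1).reshape(b, h, w, 2)
    return F.grid_sample(image, grid, mode="bilinear", padding_mode="zeros", align_corners=True)
```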
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
- Identity Enhanced Residual Image Denoising [61.75610647978973]
We learn a fully-convolutional network that consists of a Chain of Identity Mapping Modules and a residual-on-the-residual architecture for image denoising.
The proposed network produces markedly higher numerical accuracy and better visual image quality than classical state-of-the-art and CNN-based algorithms.
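A compact sketch of the two ideas named above, a chain of identity-mapping (pre-activation residual) modules plus a global residual that predicts the noise to subtract from the input; the channel counts and depth are illustrative, not the paper's configuration.

```python
import torch.nn as nn

class IdentityMappingModule(nn.Module):
    """Pre-activation residual block: the input passes through untouched (identity) and a
    small convolutional branch adds a learned correction."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReLU(inplace=True), nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True), nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class ResidualOnResidualDenoiser(nn.Module):
    """Chain of identity-mapping modules with a global skip: the network predicts the noise
    residual, which is subtracted from the noisy input (a common denoising design, used
    here as a stand-in rather than the paper's exact architecture)."""
    def __init__(self, channels=64, n_modules=8):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.chain = nn.Sequential(*[IdentityMappingModule(channels) for _ in range(n_modules)])
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, noisy):
        noise = self.tail(self.chain(self.head(noisy)))
        return noisy - noise   # residual on the residual: output = input minus predicted noise
```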
arXiv Detail & Related papers (2020-04-26T04:52:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.