Active-Passive SimStereo -- Benchmarking the Cross-Generalization
Capabilities of Deep Learning-based Stereo Methods
- URL: http://arxiv.org/abs/2209.08305v1
- Date: Sat, 17 Sep 2022 10:30:32 GMT
- Title: Active-Passive SimStereo -- Benchmarking the Cross-Generalization
Capabilities of Deep Learning-based Stereo Methods
- Authors: Laurent Jospin and Allen Antony and Lian Xu and Hamid Laga and Farid
Boussaid and Mohammed Bennamoun
- Abstract summary: Self-similar or bland regions can make it difficult to match patches between two images.
Active stereo-based methods mitigate this problem by projecting a pseudo-random pattern on the scene.
If this pattern acts as a form of adversarial noise, it could negatively impact the performance of deep learning-based methods.
- Score: 26.662129158141763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In stereo vision, self-similar or bland regions can make it difficult to
match patches between two images. Active stereo-based methods mitigate this
problem by projecting a pseudo-random pattern on the scene so that each patch
of an image pair can be identified without ambiguity. However, the projected
pattern significantly alters the appearance of the image. If this pattern acts
as a form of adversarial noise, it could negatively impact the performance of
deep learning-based methods, which are now the de-facto standard for dense
stereo vision. In this paper, we propose the Active-Passive SimStereo dataset
and a corresponding benchmark to evaluate the performance gap between passive
and active stereo images for stereo matching algorithms. Using the proposed
benchmark and an additional ablation study, we show that the feature extraction
and matching modules of twenty selected deep learning-based
stereo matching methods generalize to active stereo without a problem. However,
the disparity refinement modules of three of the twenty architectures (ACVNet,
CascadeStereo, and StereoNet) are negatively affected by the active stereo
patterns due to their reliance on the appearance of the input images.
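As a rough illustration of what a "performance gap between passive and active stereo images" could mean in practice, the sketch below compares two predicted disparity maps against the same ground truth. The abstract does not say which metrics the benchmark uses; mean absolute disparity error and a bad-pixel rate are assumed here because they are common in stereo evaluation, and the names `disparity_errors` and `active_passive_gap` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# A disparity map as rows of values; None marks a pixel without ground truth.
DisparityMap = Sequence[Sequence[Optional[float]]]


def disparity_errors(
    pred: DisparityMap, gt: DisparityMap, threshold: float = 1.0
) -> Tuple[float, float]:
    """Return (mean absolute error, fraction of pixels with error > threshold).

    Assumed metrics, not taken from the paper. Pixels whose ground truth
    is None are skipped.
    """
    total_error = 0.0
    bad_pixels = 0
    valid_pixels = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g is None:
                continue
            error = abs(p - g)
            total_error += error
            if error > threshold:
                bad_pixels += 1
            valid_pixels += 1
    if valid_pixels == 0:
        raise ValueError("no valid ground-truth pixels")
    return total_error / valid_pixels, bad_pixels / valid_pixels


def active_passive_gap(
    pred_passive: DisparityMap,
    pred_active: DisparityMap,
    gt: DisparityMap,
    threshold: float = 1.0,
) -> Tuple[float, float]:
    """Return (active - passive) differences in MAE and bad-pixel rate.

    Positive values mean the method did worse on the active
    (pattern-projected) image pair than on the passive one.
    """
    mae_passive, bad_passive = disparity_errors(pred_passive, gt, threshold)
    mae_active, bad_active = disparity_errors(pred_active, gt, threshold)
    return mae_active - mae_passive, bad_active - bad_passive
```

Calling `active_passive_gap(pred_passive, pred_active, gt)` with the outputs of one method on the passive and active versions of the same scene gives a signed gap per metric. A method whose refinement stage relies on image appearance would be expected to show a positive gap under this convention.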
Related papers
- MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling [18.02254687807291]
Although Transformer-based stereo models have been studied recently, their performance still lags behind CNN-based stereo models due to the inherent data scarcity issue in the stereo matching task.
We propose a Masked Image Modeling Distilled Stereo matching model, termed MaDis-Stereo, that enhances locality inductive bias by leveraging Masked Image Modeling (MIM) when training Transformer-based stereo models.
arXiv Detail & Related papers (2024-09-04T16:17:45Z)
- UniTT-Stereo: Unified Training of Transformer for Enhanced Stereo Matching [18.02254687807291]
UniTT-Stereo is a method to maximize the potential of Transformer-based stereo architectures.
State-of-the-art performance of UniTT-Stereo is validated on various benchmarks such as ETH3D, KITTI 2012, and KITTI 2015 datasets.
arXiv Detail & Related papers (2024-09-04T09:02:01Z)
- Modeling Stereo-Confidence Out of the End-to-End Stereo-Matching Network
via Disparity Plane Sweep [31.261772846687297]
The proposed stereo-confidence method is built upon the idea that any shift in a stereo-image pair should be reflected by a corresponding shift in the disparity map.
By comparing the desirable and predicted disparity profiles, we can quantify the level of matching ambiguity between left and right images for confidence measurement.
arXiv Detail & Related papers (2024-01-22T14:52:08Z)
- Single-View View Synthesis with Self-Rectified Pseudo-Stereo [49.946151180828465]
We leverage the reliable and explicit stereo prior to generate a pseudo-stereo viewpoint.
We propose a self-rectified stereo synthesis to amend erroneous regions in an identify-rectify manner.
Our method outperforms state-of-the-art single-view view synthesis methods and stereo synthesis methods.
arXiv Detail & Related papers (2023-04-19T09:36:13Z)
- Anomalous Sound Detection using Audio Representation with Machine ID
based Contrastive Learning Pretraining [52.191658157204856]
This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample.
The proposed two-stage method uses contrastive learning to pretrain the audio representation model.
Experiments show that our method outperforms the state-of-the-art methods using contrastive learning or self-supervised classification.
arXiv Detail & Related papers (2023-04-07T11:08:31Z)
- Bayesian Learning for Disparity Map Refinement for Semi-Dense Active
Stereo Vision [30.330599857204344]
We propose a new learning strategy to train neural networks to estimate high-quality subpixel disparity maps for semi-dense active stereo vision.
We demonstrate that the proposed method outperforms the current state-of-the-art active stereo models.
arXiv Detail & Related papers (2022-09-12T08:33:40Z)
- Revisiting Domain Generalized Stereo Matching Networks from a Feature
Consistency Perspective [65.37571681370096]
We propose a simple pixel-wise contrastive learning across the viewpoints.
A stereo selective whitening loss is introduced to better preserve the stereo feature consistency across domains.
Our method achieves superior performance over several state-of-the-art networks.
arXiv Detail & Related papers (2022-03-21T11:21:41Z)
- Polka Lines: Learning Structured Illumination and Reconstruction for
Active Stereo [52.68109922159688]
We introduce a novel differentiable image formation model for active stereo, relying on both wave and geometric optics, and a novel trinocular reconstruction network.
The jointly optimized pattern, which we dub "Polka Lines," together with the reconstruction network, achieves state-of-the-art active-stereo depth estimates across imaging conditions.
arXiv Detail & Related papers (2020-11-26T04:02:43Z)
- Parallax Attention for Unsupervised Stereo Correspondence Learning [46.035892564279564]
Stereo image pairs encode 3D scene cues into stereo correspondences between the left and right images.
Recent CNN-based methods commonly use cost volume techniques to capture stereo correspondence over large disparities.
We propose a generic parallax-attention mechanism (PAM) to capture stereo correspondence regardless of disparity variations.
arXiv Detail & Related papers (2020-09-16T01:30:13Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)