CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
- URL: http://arxiv.org/abs/2104.04314v1
- Date: Fri, 9 Apr 2021 11:38:59 GMT
- Title: CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
- Authors: Zhelun Shen, Yuchao Dai, Zhibo Rao
- Abstract summary: We propose CFNet, a Cascade and Fused cost volume based network to improve the robustness of the stereo matching network.
We employ a variance-based uncertainty estimation to adaptively adjust the next stage disparity search space.
Our proposed method achieves the state-of-the-art overall performance and obtains the 1st place on the stereo task of Robust Vision Challenge 2020.
- Score: 27.313740022587442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the ever-increasing capacity of large-scale annotated datasets has
led to profound progress in stereo matching. However, most of these successes
are limited to a specific dataset and cannot generalize well to other datasets.
The main difficulties lie in the large domain differences and unbalanced
disparity distribution across a variety of datasets, which greatly limit the
real-world applicability of current deep stereo matching models. In this paper,
we propose CFNet, a Cascade and Fused cost volume based network to improve the
robustness of the stereo matching network. First, we propose a fused cost
volume representation to deal with the large domain difference. By fusing
multiple low-resolution dense cost volumes to enlarge the receptive field, we
can extract robust structural representations for initial disparity estimation.
Second, we propose a cascade cost volume representation to alleviate the
unbalanced disparity distribution. Specifically, we employ a variance-based
uncertainty estimation to adaptively adjust the next stage disparity search
space, thereby driving the network to progressively prune out the space of
unlikely correspondences. By iteratively narrowing down the disparity search
space and improving the cost volume resolution, the disparity estimation is
gradually refined in a coarse-to-fine manner. When trained on the same training
images and evaluated on KITTI, ETH3D, and Middlebury datasets with the fixed
model parameters and hyperparameters, our proposed method achieves the
state-of-the-art overall performance and obtains the 1st place on the stereo
task of Robust Vision Challenge 2020. The code will be available at
https://github.com/gallenszl/CFNet.
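The variance-based search-space adjustment described in the abstract can be illustrated for a single pixel. This is a minimal sketch and not the authors' implementation: the function names, the softmax over matching scores, and the constants `alpha` and `beta` are assumptions introduced here for illustration. The idea is that a sharply peaked disparity distribution has low variance and yields a narrow next-stage search range, while a flat (ambiguous) distribution yields a wide one.

```python
import math


def softmax(scores):
    """Convert matching scores (higher = better match) into probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_search_range(candidates, scores, alpha=1.0, beta=1.0):
    """Estimate disparity, its variance, and a next-stage search range.

    candidates: disparity hypotheses for one pixel.
    scores: matching score per hypothesis.
    alpha, beta: hypothetical constants that scale and pad the range.
    Returns (d_hat, variance, d_min, d_max).
    """
    probs = softmax(scores)
    # Expected disparity under the predicted distribution.
    d_hat = sum(d * p for d, p in zip(candidates, probs))
    # Variance of the distribution, used as the uncertainty measure.
    variance = sum(p * (d - d_hat) ** 2 for d, p in zip(candidates, probs))
    half_width = alpha * math.sqrt(variance) + beta
    d_min = max(0.0, d_hat - half_width)  # disparities are non-negative
    d_max = d_hat + half_width
    return d_hat, variance, d_min, d_max


if __name__ == "__main__":
    candidates = list(range(8))
    peaked = [0, 0, 0, 10, 0, 0, 0, 0]  # confident match at disparity 3
    flat = [0] * 8                      # ambiguous, e.g. a textureless region
    for name, scores in (("peaked", peaked), ("flat", flat)):
        d_hat, var, lo, hi = next_search_range(candidates, scores)
        print(f"{name}: d_hat={d_hat:.3f} var={var:.3f} range=[{lo:.3f}, {hi:.3f}]")
```

In a cascade, the returned range would be resampled into a fresh set of disparity candidates at the next, higher-resolution stage, so confident pixels spend their candidates on a small interval while uncertain pixels keep a broader one.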
Related papers
- Rectified Iterative Disparity for Stereo Matching [2.8908326904081334]
We propose Cost volume-based disparity Uncertainty Estimation (UEC).
Based on the rich similarity information in the cost volume coming from the image pairs, the proposed UEC can achieve competitive performance with low computational cost.
We present a high-performance stereo architecture, DR Stereo, which is a combination of the proposed methods.
arXiv Detail & Related papers (2024-06-16T13:47:40Z)
- ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo Matching [7.39503547452922]
We present a novel cost volume construction method which generates attention weights from correlation clues to suppress redundant information.
To generate reliable attention weights, we propose multi-level adaptive patch matching to improve the distinctiveness of the matching cost.
The proposed cost volume is named attention concatenation volume (ACV) which can be seamlessly embedded into most stereo matching networks.
arXiv Detail & Related papers (2022-03-04T06:28:58Z)
- AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
- Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation [65.83008812026635]
We construct Guided Cost volume Excitation (GCE) and show that simple channel excitation of cost volume guided by image can improve performance considerably.
We present an end-to-end network that we call Correlate-and-Excite (CoEx).
arXiv Detail & Related papers (2021-08-12T14:32:26Z)
- SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures.
Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities.
We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
- ADCPNet: Adaptive Disparity Candidates Prediction Network for Efficient Real-Time Stereo Matching [8.046317778069325]
The coarse-to-fine method has largely relieved the memory constraints and speed limitations of large-scale network models.
Previous coarse-to-fine designs employ constant offsets and three or more stages to progressively refine the coarse disparity map.
This paper claims that the coarse matching errors can be corrected efficiently with fewer stages as long as more accurate disparity candidates can be provided.
arXiv Detail & Related papers (2020-11-18T01:18:52Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
- JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method [92.15895515035795]
We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations.
We propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation.
arXiv Detail & Related papers (2020-04-07T14:59:35Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume [32.41293572426403]
Deep learning has been shown to be effective for depth inference in multi-view stereo (MVS).
However, scalability and accuracy remain open problems in this domain.
Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume.
arXiv Detail & Related papers (2019-12-26T01:40:44Z)