SCV-Stereo: Learning Stereo Matching from a Sparse Cost Volume
- URL: http://arxiv.org/abs/2107.08187v1
- Date: Sat, 17 Jul 2021 05:45:44 GMT
- Title: SCV-Stereo: Learning Stereo Matching from a Sparse Cost Volume
- Authors: Hengli Wang, Rui Fan, Ming Liu
- Abstract summary: Convolutional neural network (CNN)-based stereo matching approaches generally require a dense cost volume (DCV) for disparity estimation.
We propose SCV-Stereo, a novel CNN architecture, capable of learning dense stereo matching from sparse cost volume representations.
Our inspiration is derived from the fact that DCV representations are somewhat redundant and can be replaced with SCV representations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural network (CNN)-based stereo matching approaches generally
require a dense cost volume (DCV) for disparity estimation. However, generating
such cost volumes is computationally-intensive and memory-consuming, hindering
CNN training and inference efficiency. To address this problem, we propose
SCV-Stereo, a novel CNN architecture, capable of learning dense stereo matching
from sparse cost volume (SCV) representations. Our inspiration is derived from
the fact that DCV representations are somewhat redundant and can be replaced
with SCV representations. Benefiting from these SCV representations, our
SCV-Stereo can update disparity estimations in an iterative fashion for
accurate and efficient stereo matching. Extensive experiments carried out on
the KITTI Stereo benchmarks demonstrate that our SCV-Stereo can significantly
minimize the trade-off between accuracy and efficiency for stereo matching. Our
project page is https://sites.google.com/view/scv-stereo.
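The contrast between dense and sparse cost volumes described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the sparse volume is built, so keeping the top-k correlation values per pixel is an assumed sparsification rule, and the names `dense_cost_volume` and `sparsify_top_k` are hypothetical.

```python
import numpy as np


def dense_cost_volume(left, right, max_disp):
    """Dense correlation cost volume of shape (max_disp, H, W).

    left, right: feature maps of shape (C, H, W).
    Entry [d, y, x] is the dot product of left[:, y, x] and right[:, y, x - d].
    Entries with x - d < 0 have no match candidate and stay at -inf.
    """
    _, h, w = left.shape
    volume = np.full((max_disp, h, w), -np.inf, dtype=np.float32)
    for d in range(max_disp):
        volume[d, :, d:] = (left[:, :, d:] * right[:, :, : w - d]).sum(axis=0)
    return volume


def sparsify_top_k(volume, k):
    """Assumed sparsification: keep the k largest costs per pixel.

    Returns (values, disparity_indices), each of shape (k, H, W).
    """
    idx = np.argpartition(-volume, k - 1, axis=0)[:k]
    vals = np.take_along_axis(volume, idx, axis=0)
    return vals, idx


# Toy data: unit-norm random features, left view shifted by a true disparity of 3.
rng = np.random.default_rng(0)
right = rng.standard_normal((8, 4, 16)).astype(np.float32)
right /= np.linalg.norm(right, axis=0, keepdims=True)
left = np.roll(right, 3, axis=2)

volume = dense_cost_volume(left, right, max_disp=8)  # stores max_disp values per pixel
vals, idx = sparsify_top_k(volume, k=2)              # stores k values + k indices per pixel

# Disparity of the strongest surviving match at each pixel.
best = np.take_along_axis(idx, vals.argmax(axis=0)[None], axis=0)[0]
```

In this toy setting the dense volume holds `max_disp` values per pixel while the sparse one holds `k` values plus their disparity indices, which is the kind of redundancy reduction the abstract refers to. Pixels near the left border have fewer than `k` valid candidates, so some of their retained entries are `-inf`.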
Related papers
- RomniStereo: Recurrent Omnidirectional Stereo Matching [6.153793254880079]
We propose a recurrent omnidirectional stereo matching (RomniStereo) algorithm.
Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline.
When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples.
arXiv Detail & Related papers (2024-01-09T04:06:01Z)
- Audio classification with Dilated Convolution with Learnable Spacings [10.89964981012741]
Dilated convolution with learnable spacings (DCLS) is a recent convolution method in which the positions of the kernel elements are learned throughout training by backpropagation.
Here we show that DCLS is also useful for audio tagging using the AudioSet classification benchmark.
arXiv Detail & Related papers (2023-09-25T09:09:54Z)
- Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation based stereo matching has achieved outstanding performance.
Current methods with a fixed model do not work uniformly well across various datasets.
This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
- Accurate and Efficient Stereo Matching via Attention Concatenation Volume [33.615312186946866]
We present a novel cost volume construction method, named attention concatenation volume (ACV).
ACV generates attention weights from correlation clues to suppress redundant information and enhance matching-related information in the concatenation volume.
We further design a fast version of ACV to enable real-time performance, named Fast-ACV, which generates high likelihood disparity hypotheses.
arXiv Detail & Related papers (2022-09-23T08:14:30Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo Matching [7.39503547452922]
We present a novel cost volume construction method which generates attention weights from correlation clues to suppress redundant information.
To generate reliable attention weights, we propose multi-level adaptive patch matching to improve the distinctiveness of the matching cost.
The proposed cost volume is named attention concatenation volume (ACV) which can be seamlessly embedded into most stereo matching networks.
arXiv Detail & Related papers (2022-03-04T06:28:58Z)
- Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation [65.83008812026635]
We construct Guided Cost volume Excitation (GCE) and show that simple channel excitation of cost volume guided by image can improve performance considerably.
We present an end-to-end network that we call Correlate-and-Excite (CoEx).
arXiv Detail & Related papers (2021-08-12T14:32:26Z)
- Efficient Training of Visual Transformers with Small-Size Datasets [64.60765211331697]
Visual Transformers (VTs) are emerging as an architectural paradigm alternative to Convolutional networks (CNNs).
We show that, despite having a comparable accuracy when trained on ImageNet, their performance on smaller datasets can be largely different.
We propose a self-supervised task which can extract additional information from images with only a negligible computational overhead.
arXiv Detail & Related papers (2021-06-07T16:14:06Z)
- PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching [14.603116313499648]
We propose a robust and effective self-supervised stereo matching approach, consisting of a pyramid voting module (PVM) and a novel DCNN architecture, referred to as OptStereo.
Specifically, our OptStereo first builds multi-scale cost volumes, and then adopts a recurrent unit to iteratively update disparity estimations at high resolution.
We publish the HKUST-Drive dataset, a large-scale synthetic stereo dataset, collected under different illumination and weather conditions for research purposes.
arXiv Detail & Related papers (2021-03-12T05:27:14Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
- On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)