ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo
Matching
- URL: http://arxiv.org/abs/2203.02146v1
- Date: Fri, 4 Mar 2022 06:28:58 GMT
- Title: ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo
Matching
- Authors: Gangwei Xu, Junda Cheng, Peng Guo, Xin Yang
- Abstract summary: We present a novel cost volume construction method which generates attention weights from correlation clues to suppress redundant information.
To generate reliable attention weights, we propose multi-level adaptive patch matching to improve the distinctiveness of the matching cost.
The proposed cost volume is named attention concatenation volume (ACV) which can be seamlessly embedded into most stereo matching networks.
- Score: 7.39503547452922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo matching is a fundamental building block for many vision and robotics
applications. An informative and concise cost volume representation is vital
for stereo matching of high accuracy and efficiency. In this paper, we present
a novel cost volume construction method which generates attention weights from
correlation clues to suppress redundant information and enhance
matching-related information in the concatenation volume. To generate reliable
attention weights, we propose multi-level adaptive patch matching to improve
the distinctiveness of the matching cost at different disparities even for
textureless regions. The proposed cost volume, named attention concatenation
volume (ACV), can be seamlessly embedded into most stereo matching
networks. The resulting networks can use a more lightweight aggregation network
while achieving higher accuracy; for example, with ACV, GwcNet achieves higher
accuracy using only 1/25 of the parameters of its aggregation network. Furthermore, we
design a highly accurate network (ACVNet) based on our ACV, which achieves
state-of-the-art performance on several benchmarks.
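The core idea in the abstract, using correlation-derived attention weights to filter a concatenation volume, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the plain channel-averaged correlation, and the softmax over disparity are assumptions made for illustration, and the sketch omits the multi-level adaptive patch matching and any learned layers described in the paper.

```python
import numpy as np

def concat_volume(left, right, max_disp):
    """Stack left features with right features shifted by each disparity.

    left, right: feature maps of shape (C, H, W).
    Returns a volume of shape (2C, D, H, W); out-of-view positions stay zero.
    """
    C, H, W = left.shape
    vol = np.zeros((2 * C, max_disp, H, W), dtype=float)
    for d in range(max_disp):
        # Left pixel x is paired with right pixel x - d.
        vol[:C, d, :, d:] = left[:, :, d:]
        vol[C:, d, :, d:] = right[:, :, :W - d]
    return vol

def correlation_volume(left, right, max_disp):
    """Channel-averaged correlation per disparity, shape (D, H, W).

    Positions with no valid match (x < d) are set to -inf so that they
    receive zero weight after the softmax.
    """
    C, H, W = left.shape
    vol = np.full((max_disp, H, W), -np.inf)
    for d in range(max_disp):
        vol[d, :, d:] = (left[:, :, d:] * right[:, :, :W - d]).mean(axis=0)
    return vol

def attention_weights(left, right, max_disp):
    """Softmax of the correlation volume over the disparity axis (assumption)."""
    corr = correlation_volume(left, right, max_disp)
    # d = 0 is valid at every pixel, so the per-pixel maximum is finite.
    e = np.exp(corr - corr.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def attention_concat_volume(left, right, max_disp):
    """Weight every channel of the concatenation volume by the attention."""
    attn = attention_weights(left, right, max_disp)   # (D, H, W)
    concat = concat_volume(left, right, max_disp)     # (2C, D, H, W)
    return attn[None] * concat                        # broadcast over channels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal((4, 3, 8))
    right = rng.standard_normal((4, 3, 8))
    acv = attention_concat_volume(left, right, max_disp=4)
    print(acv.shape)
```

In this sketch the attention has one value per disparity and pixel and is broadcast across all feature channels, so disparities with weak correlation are suppressed in the volume that a downstream aggregation network would consume.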
Related papers
- Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
- DCVSMNet: Double Cost Volume Stereo Matching Network [0.0]
DCVSMNet is a fast stereo matching network with a 67 ms inference time and strong generalization ability.
Results on several benchmark datasets show that DCVSMNet achieves better accuracy than methods such as CGI-Stereo and BGNet at the cost of greater inference time.
arXiv Detail & Related papers (2024-02-26T10:42:25Z)
- Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
arXiv Detail & Related papers (2023-07-04T02:50:44Z)
- Accurate and Efficient Stereo Matching via Attention Concatenation Volume [33.615312186946866]
We present a novel cost volume construction method, named attention concatenation volume (ACV).
ACV generates attention weights from correlation clues to suppress redundant information and enhance matching-related information in the concatenation volume.
We further design a fast version of ACV to enable real-time performance, named Fast-ACV, which generates high likelihood disparity hypotheses.
arXiv Detail & Related papers (2022-09-23T08:14:30Z)
- AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
- Multi-View Stereo Network with attention thin volume [0.0]
We propose an efficient multi-view stereo (MVS) network for inferring depth values from multiple RGB images.
We introduce the self-attention mechanism to fully aggregate the dominant information from input images.
We also introduce the group-wise correlation to feature aggregation, which greatly reduces the memory and calculation burden.
arXiv Detail & Related papers (2021-10-16T11:51:23Z)
- Multi-Scale Cost Volumes Cascade Network for Stereo Matching [9.440848600106797]
We propose MSCVNet, which combines traditional methods and CNN to improve the quality of cost volume.
Our network achieves a big improvement in accuracy, demonstrating the effectiveness of our proposed method.
arXiv Detail & Related papers (2021-02-03T08:40:17Z)
- Bilateral Grid Learning for Stereo Matching Networks [22.92443311789097]
We present a novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid.
The slicing layer is parameter-free, which allows us to obtain a high quality cost volume of high resolution.
We design a real-time network based on this module, which outperforms existing published real-time deep stereo matching networks.
arXiv Detail & Related papers (2021-01-01T09:08:01Z)
- AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching [50.06646151004375]
A novel domain-adaptive pipeline called AdaStereo aims to align multi-level representations for deep stereo matching networks.
Our AdaStereo models achieve state-of-the-art cross-domain performance on multiple stereo benchmarks, including KITTI, Middlebury, ETH3D, and DrivingStereo.
arXiv Detail & Related papers (2020-04-09T16:15:13Z)
- Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.