Multi-Scale Cost Volumes Cascade Network for Stereo Matching
- URL: http://arxiv.org/abs/2102.01940v1
- Date: Wed, 3 Feb 2021 08:40:17 GMT
- Title: Multi-Scale Cost Volumes Cascade Network for Stereo Matching
- Authors: Xiaogang Jia, Wei Chen, Zhengfa Liang, Yusong Tan, Mingfei Wu
- Abstract summary: We propose MSCVNet, which combines traditional methods and CNN to improve the quality of cost volume.
Our network achieves a big improvement in accuracy, demonstrating the effectiveness of our proposed method.
- Score: 9.440848600106797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stereo matching is essential for robot navigation. However, the accuracy of
widely used traditional methods is low, while CNN-based methods incur high
computational cost and long running times. The design of the cost
volume plays a crucial role in balancing speed and accuracy. Thus we propose
MSCVNet, which combines traditional methods and CNN to improve the quality of
cost volume. Concretely, our network first generates multiple 3D cost volumes
with different resolutions and then uses 2D convolutions to construct a novel
cascade hourglass network for cost aggregation. Meanwhile, we design an
algorithm to distinguish and calculate the loss for discontinuous areas of
the disparity result. According to the KITTI official website, our network is much
faster than most top-performing methods (24× faster than CSPN, 44× faster than GANet, etc.).
Meanwhile, compared to traditional methods (SPS-St, SGM) and other real-time
stereo matching networks (Fast DS-CS, DispNetC, RTSNet, etc.), our network
achieves a large improvement in accuracy, demonstrating the effectiveness of our
proposed method.
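As a rough illustration of what "multiple 3D cost volumes with different resolutions" can look like, the sketch below builds one volume per scale, indexed as volume[d][y][x]. It is not the MSCVNet implementation: the absolute intensity difference stands in for a learned matching cost, average pooling stands in for the feature pyramid, and all function names (`downsample`, `cost_volume`, `multi_scale_volumes`, `winner_take_all`) are hypothetical.

```python
def downsample(img, factor):
    """Average-pool a 2D list-of-lists image by an integer factor."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]


def cost_volume(left, right, max_disp):
    """Build a 3D cost volume indexed as volume[d][y][x].

    The cost is |left[y][x] - right[y][x - d]| (an assumed stand-in for a
    learned matching cost). Positions with no valid match get infinity.
    """
    h, w = len(left), len(left[0])
    inf = float("inf")
    return [
        [
            [abs(left[y][x] - right[y][x - d]) if x - d >= 0 else inf
             for x in range(w)]
            for y in range(h)
        ]
        for d in range(max_disp)
    ]


def multi_scale_volumes(left, right, max_disp, factors=(1, 2, 4)):
    """Return {factor: cost volume}; the disparity range shrinks with the scale."""
    return {
        f: cost_volume(downsample(left, f), downsample(right, f),
                       max(1, max_disp // f))
        for f in factors
    }


def winner_take_all(volume):
    """Pick, per pixel, the disparity index with the lowest cost."""
    max_disp, h, w = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [min(range(max_disp), key=lambda d: volume[d][y][x]) for x in range(w)]
        for y in range(h)
    ]
```

The disparity range is divided by the scale factor because a shift of d pixels at full resolution corresponds to roughly d / f pixels at 1/f resolution, which is why coarse volumes are much cheaper to build and aggregate. In a learned pipeline the winner-take-all step would typically be replaced by an aggregation network followed by a differentiable disparity regression.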
Related papers
- ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo
Matching [7.39503547452922]
We present a novel cost volume construction method which generates attention weights from correlation clues to suppress redundant information.
To generate reliable attention weights, we propose multi-level adaptive patch matching to improve the distinctiveness of the matching cost.
The proposed cost volume is named attention concatenation volume (ACV) which can be seamlessly embedded into most stereo matching networks.
arXiv Detail & Related papers (2022-03-04T06:28:58Z)
- Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching [13.76996108304056]
This paper presents an iterative multi-scale coarse-to-fine refinement (iCFR) framework to bridge this gap.
We use multi-scale warped features to estimate disparity residuals and push the disparity search range in the cost volume to a minimum limit.
Finally, we apply a refinement network to recover the loss of precision which is inherent in multi-scale approaches.
arXiv Detail & Related papers (2021-10-25T09:54:17Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and
Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++) by input-dependently adjusting filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
arXiv Detail & Related papers (2021-09-16T10:59:33Z)
- Greedy Network Enlarging [53.319011626986004]
We propose a greedy network enlarging method based on the reallocation of computations.
By modifying the computations on different stages step by step, the enlarged network will be equipped with optimal allocation and utilization of MACs.
With application of our method on GhostNet, we achieve state-of-the-art 80.9% and 84.3% ImageNet top-1 accuracies.
arXiv Detail & Related papers (2021-07-31T08:36:30Z)
- MutualNet: Adaptive ConvNet via Mutual Learning from Different Model
Configurations [51.85020143716815]
We propose MutualNet to train a single network that can run at a diverse set of resource constraints.
Our method trains a cohort of model configurations with various network widths and input resolutions.
MutualNet is a general training methodology that can be applied to various network structures.
arXiv Detail & Related papers (2021-05-14T22:30:13Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed using expensive operations and the lower-frequency part is assigned cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- ES-Net: An Efficient Stereo Matching Network [4.8986598953553555]
Existing stereo matching networks typically use slow and computationally expensive 3D convolutions to improve the performance.
We propose the Efficient Stereo Network (ESNet), which achieves high performance and efficient inference at the same time.
arXiv Detail & Related papers (2021-03-05T20:11:39Z)
- Bilateral Grid Learning for Stereo Matching Networks [22.92443311789097]
We present a novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid.
The slicing layer is parameter-free, which allows us to obtain a high quality cost volume of high resolution.
We design a real-time network based on this module, which outperforms existing published real-time deep stereo matching networks.
arXiv Detail & Related papers (2021-01-01T09:08:01Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- AANet: Adaptive Aggregation Network for Efficient Stereo Matching [33.39794232337985]
Current state-of-the-art stereo models are mostly based on costly 3D convolutions.
We propose a sparse points based intra-scale cost aggregation method to alleviate the edge-fattening issue.
We also approximate the traditional cross-scale cost aggregation algorithm with neural network layers to handle large textureless regions.
arXiv Detail & Related papers (2020-04-20T18:07:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.