Bilateral Grid Learning for Stereo Matching Networks
- URL: http://arxiv.org/abs/2101.01601v2
- Date: Mon, 29 Mar 2021 02:43:10 GMT
- Title: Bilateral Grid Learning for Stereo Matching Networks
- Authors: Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo
- Abstract summary: We present a novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid.
The slicing layer is parameter-free, which allows us to obtain a high-quality, high-resolution cost volume.
We design a real-time network based on this module, which outperforms existing published real-time deep stereo matching networks.
- Score: 22.92443311789097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time performance of stereo matching networks is important for many
applications, such as automatic driving, robot navigation and augmented reality
(AR). Although significant progress has been made in stereo matching networks
in recent years, it is still challenging to balance real-time performance and
accuracy. In this paper, we present a novel edge-preserving cost volume
upsampling module based on the slicing operation in the learned bilateral grid.
The slicing layer is parameter-free, which allows us to efficiently obtain a
high-quality, high-resolution cost volume from a low-resolution cost volume
under the guidance of a learned guidance map. The proposed cost volume
upsampling module can be seamlessly embedded into many existing stereo matching
networks, such as GCNet, PSMNet, and GANet. The resulting networks are
accelerated several times while maintaining comparable accuracy. Furthermore,
we design a real-time network (named BGNet) based on this module, which
outperforms existing published real-time deep stereo matching networks, as well
as some complex networks on the KITTI stereo datasets. The code is available at
https://github.com/YuhuaXu/BGNet.
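
As a rough illustration of the slicing idea described in the abstract, the NumPy sketch below upsamples a low-resolution bilateral grid of matching costs to full resolution by interpolating along the two spatial axes and the guidance axis. It is not the authors' implementation (the linked repository is the reference for that). The function name `slice_bilateral_grid`, the `(D, G, Hl, Wl)` tensor layout, the assumption that the guidance map lies in [0, 1], and the choice to leave the disparity axis un-interpolated are all assumptions made for illustration.

```python
import numpy as np


def slice_bilateral_grid(grid, guide):
    """Slice a bilateral grid into a full-resolution cost volume (sketch).

    grid:  array of shape (D, G, Hl, Wl); D disparity levels, G guidance bins,
           Hl x Wl low-resolution spatial size (hypothetical layout).
    guide: array of shape (H, W) with values assumed to lie in [0, 1].
    Returns an array of shape (D, H, W).
    """
    D, G, Hl, Wl = grid.shape
    H, W = guide.shape

    # Continuous grid coordinates of every full-resolution pixel.
    ys = np.linspace(0.0, Hl - 1, H)[:, None] * np.ones((1, W))
    xs = np.ones((H, 1)) * np.linspace(0.0, Wl - 1, W)[None, :]
    # The guidance value selects a (fractional) bin along the guidance axis.
    gs = np.clip(guide, 0.0, 1.0) * (G - 1)

    def corners(coord, size):
        # Lower/upper integer neighbours and the fractional offset between them.
        lo = np.clip(np.floor(coord).astype(int), 0, size - 1)
        hi = np.minimum(lo + 1, size - 1)
        return lo, hi, coord - lo

    y0, y1, wy = corners(ys, Hl)
    x0, x1, wx = corners(xs, Wl)
    g0, g1, wg = corners(gs, G)

    # Trilinear interpolation: accumulate the 8 neighbouring grid cells.
    out = np.zeros((D, H, W), dtype=np.float64)
    for gi, wgi in ((g0, 1.0 - wg), (g1, wg)):
        for yi, wyi in ((y0, 1.0 - wy), (y1, wy)):
            for xi, wxi in ((x0, 1.0 - wx), (x1, wx)):
                out += grid[:, gi, yi, xi] * (wgi * wyi * wxi)
    return out
```

The function has no learnable parameters, which matches the abstract's description of the slicing layer as parameter-free. In this sketch the edge-preserving behaviour comes from the guidance lookup: two neighbouring pixels with very different guidance values read from different bins of the grid, so their costs are not blended across the edge.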
Related papers
- ACVNet: Attention Concatenation Volume for Accurate and Efficient Stereo
Matching [7.39503547452922]
We present a novel cost volume construction method which generates attention weights from correlation clues to suppress redundant information.
To generate reliable attention weights, we propose multi-level adaptive patch matching to improve the distinctiveness of the matching cost.
The proposed cost volume is named attention concatenation volume (ACV) which can be seamlessly embedded into most stereo matching networks.
arXiv Detail & Related papers (2022-03-04T06:28:58Z)
- Dynamic Slimmable Denoising Network [64.77565006158895]
Dynamic slimmable denoising network (DDSNet) is a general method to achieve good denoising quality with less computational complexity.
DDSNet is empowered with the ability of dynamic inference by a dynamic gate.
Our experiments demonstrate that DDSNet consistently outperforms the state-of-the-art individually trained static denoising networks.
arXiv Detail & Related papers (2021-10-17T22:45:33Z) - AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network [8.127449025802436]
We present a novel recurrent multi-view stereo network based on long short-term memory (LSTM) with adaptive aggregation, namely AA-RMVSNet.
We first introduce an intra-view aggregation module to adaptively extract image features by using context-aware convolution and multi-scale aggregation.
We propose an inter-view cost volume aggregation module for adaptive pixel-wise view aggregation, which is able to preserve better-matched pairs among all views.
arXiv Detail & Related papers (2021-08-09T06:10:48Z) - MutualNet: Adaptive ConvNet via Mutual Learning from Different Model
Configurations [51.85020143716815]
We propose MutualNet to train a single network that can run at a diverse set of resource constraints.
Our method trains a cohort of model configurations with various network widths and input resolutions.
MutualNet is a general training methodology that can be applied to various network structures.
arXiv Detail & Related papers (2021-05-14T22:30:13Z) - Bit-Mixer: Mixed-precision networks with runtime bit-width selection [72.32693989093558]
Bit-Mixer is the first method to train a meta-quantized network in which, at test time, any layer can change its bit-width without affecting the overall network's ability to perform highly accurate inference.
We show that our method can result in mixed precision networks that exhibit the desirable flexibility properties for on-device deployment without compromising accuracy.
arXiv Detail & Related papers (2021-03-31T17:58:47Z) - Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed using expensive operations and the lower-frequency part is assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z) - ES-Net: An Efficient Stereo Matching Network [4.8986598953553555]
Existing stereo matching networks typically use slow and computationally expensive 3D convolutions to improve the performance.
We propose the Efficient Stereo Network (ESNet), which achieves high performance and efficient inference at the same time.
arXiv Detail & Related papers (2021-03-05T20:11:39Z) - Multi-Scale Cost Volumes Cascade Network for Stereo Matching [9.440848600106797]
We propose MSCVNet, which combines traditional methods with CNNs to improve the quality of the cost volume.
Our network achieves a big improvement in accuracy, demonstrating the effectiveness of our proposed method.
arXiv Detail & Related papers (2021-02-03T08:40:17Z) - Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency
Checking [54.58791377183574]
Our novel hybrid recurrent multi-view stereo net consists of two core modules: 1) a light DRENet (Dense Reception Expanded) module to extract dense feature maps of original size with multi-scale context information, 2) a HU-LSTM (Hybrid U-LSTM) to regularize 3D matching volume into predicted depth map.
Our method exhibits performance competitive with the state-of-the-art method while dramatically reducing memory consumption, using only 19.4% of the memory consumed by R-MVSNet.
arXiv Detail & Related papers (2020-07-21T14:59:59Z) - Continual Adaptation for Deep Stereo [52.181067640300014]
We propose a continual adaptation paradigm for deep stereo networks designed to deal with challenging and ever-changing environments.
In our paradigm, the learning signals needed to continuously adapt models online can be sourced from self-supervision via right-to-left image warping or from traditional stereo algorithms.
Our network architecture and adaptation algorithms realize the first real-time self-adaptive deep stereo system.
arXiv Detail & Related papers (2020-07-10T08:15:58Z) - AANet: Adaptive Aggregation Network for Efficient Stereo Matching [33.39794232337985]
Current state-of-the-art stereo models are mostly based on costly 3D convolutions.
We propose a sparse-points-based intra-scale cost aggregation method to alleviate the edge-fattening issue.
We also approximate the traditional cross-scale cost aggregation algorithm with neural network layers to handle large textureless regions.
arXiv Detail & Related papers (2020-04-20T18:07:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.