StereoVAE: A lightweight stereo matching system through embedded GPUs
- URL: http://arxiv.org/abs/2305.11566v2
- Date: Thu, 25 May 2023 05:24:43 GMT
- Title: StereoVAE: A lightweight stereo matching system through embedded GPUs
- Authors: Qiong Chang, Xiang Li, Xin Xu, Xin Liu, Yun Li and Miyazaki Jun
- Abstract summary: We present a lightweight system for stereo matching through embedded GPUs.
It breaks the trade-off between accuracy and processing speed in stereo matching, enabling our embedded system to further improve the matching accuracy while ensuring real-time processing.
- Score: 13.338765413730743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a lightweight system for stereo matching through embedded GPUs. It
breaks the trade-off between accuracy and processing speed in stereo matching,
enabling our embedded system to further improve the matching accuracy while
ensuring real-time processing. The main idea of our method is to construct a
tiny neural network based on variational auto-encoder (VAE) to upsample and
refinement a small size of coarse disparity map, which is first generated by a
traditional matching method. The proposed hybrid structure cannot only bring
the advantage of traditional methods in terms of computational complexity, but
also ensure the matching accuracy under the impact of neural network. Extensive
experiments on the KITTI 2015 benchmark demonstrate that our tiny system
exhibits high robustness in improving the accuracy of the coarse disparity maps
generated by different algorithms, while also running in real-time on embedded
GPUs.
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - SySMOL: Co-designing Algorithms and Hardware for Neural Networks with Heterogeneous Precisions [20.241671088121144]
Recent quantization techniques have enabled heterogeneous precisions at very fine granularity.
These networks require additional hardware to decode the precision settings for individual variables, align the variables, and provide fine-grained mixed-precision compute capabilities.
We present an end-to-end co-design approach to efficiently execute networks with fine-grained heterogeneous precisions.
arXiv Detail & Related papers (2023-11-23T17:20:09Z) - Design and Prototyping Distributed CNN Inference Acceleration in Edge
Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing.
Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16.
It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z) - Neural Disparity Refinement for Arbitrary Resolution Stereo [67.55946402652778]
We introduce a novel architecture for neural disparity refinement aimed at facilitating deployment of 3D computer vision on cheap and widespread consumer devices.
Our approach relies on a continuous formulation that enables to estimate a refined disparity map at any arbitrary output resolution.
arXiv Detail & Related papers (2021-10-28T18:00:00Z) - End-to-end Neural Video Coding Using a Compound Spatiotemporal
Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches.
Specifically, we generate a compoundtemporal representation (STR) through a recurrent information aggregation (RIA) module.
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z) - Lightweight Convolutional Neural Network with Gaussian-based Grasping
Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors are difficult to strike a balance between high accuracy and fast inference speed.
We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation.
The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z) - Adaptive Deconvolution-based stereo matching Net for Local Stereo
Matching [11.214543038438055]
In deep learning-based local stereo matching methods, larger image patches usually bring better stereo matching accuracy.
We propose an efficient CNN based structure, namely Adaptive Deconvolution-based disparity matching Net (ADSM net)
Experimental results on the KITTI 2012 and 2015 datasets demonstrate that the proposed method can achieve a good trade-off between accuracy and complexity.
arXiv Detail & Related papers (2021-01-01T12:18:53Z) - Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy.
But their inference time is typically slow, on the order of seconds for a pair of 540p images.
We propose a emphdisplacement-invariant cost module to compute the matching costs without needing a 4D feature volume.
arXiv Detail & Related papers (2020-12-01T23:58:16Z) - Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $ell1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
arXiv Detail & Related papers (2020-08-21T19:35:54Z) - Depthwise Non-local Module for Fast Salient Object Detection Using a
Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z) - An Image Enhancing Pattern-based Sparsity for Real-time Inference on
Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.