Related papers: StereoVAE: A lightweight stereo matching system through embedded GPUs

URL: http://arxiv.org/abs/2305.11566v2
Date: Thu, 25 May 2023 05:24:43 GMT
Title: StereoVAE: A lightweight stereo matching system through embedded GPUs
Authors: Qiong Chang, Xiang Li, Xin Xu, Xin Liu, Yun Li and Miyazaki Jun
Abstract summary: We present a lightweight system for stereo matching through embedded GPUs. It breaks the trade-off between accuracy and processing speed in stereo matching, enabling our embedded system to further improve the matching accuracy while ensuring real-time processing.
Score: 13.338765413730743
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a lightweight system for stereo matching through embedded GPUs. It breaks the trade-off between accuracy and processing speed in stereo matching, enabling our embedded system to further improve the matching accuracy while ensuring real-time processing. The main idea of our method is to construct a tiny neural network based on a variational auto-encoder (VAE) to upsample and refine a small coarse disparity map, which is first generated by a traditional matching method. The proposed hybrid structure not only retains the advantage of traditional methods in terms of computational complexity, but also ensures matching accuracy through the neural network. Extensive experiments on the KITTI 2015 benchmark demonstrate that our tiny system exhibits high robustness in improving the accuracy of the coarse disparity maps generated by different algorithms, while also running in real-time on embedded GPUs.
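The hybrid structure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`coarse_disparity`, `upsample_disparity`, `hybrid_pipeline`) are hypothetical, the traditional matcher is assumed to be a simple winner-takes-all search on absolute differences, and the VAE is represented only by an optional `refine` hook, with nearest-neighbour upsampling standing in for the learned upsampling step.

```python
import numpy as np

def coarse_disparity(left, right, max_disp):
    """Winner-takes-all matching on absolute differences.

    Assumed stand-in for the 'traditional matching method'; pixel x in the
    left image is compared with pixel x - d in the right image.
    """
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return cost.argmin(axis=0).astype(np.float32)

def upsample_disparity(disp, factor):
    """Nearest-neighbour upsampling (placeholder for the learned upsampling).

    Disparity values are multiplied by `factor` because they are measured
    in pixels of the image width.
    """
    block = np.ones((factor, factor), dtype=disp.dtype)
    return np.kron(disp, block) * factor

def hybrid_pipeline(left, right, max_disp, factor=4, refine=None):
    """Coarse matching at low resolution, then upsampling and optional refinement."""
    small_left = left[::factor, ::factor]
    small_right = right[::factor, ::factor]
    coarse = coarse_disparity(small_left, small_right, max_disp // factor)
    upsampled = upsample_disparity(coarse, factor)
    # `refine` is a hypothetical hook where a small network would go.
    return refine(upsampled, left) if refine is not None else upsampled
```

Matching at the reduced resolution shrinks both the image area and the disparity search range, which is where the computational saving of the traditional stage comes from in this sketch; the refinement stage is then responsible for recovering detail lost by the downsampling.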

Related papers

Federated Learning of Low-Rank One-Shot Image Detection Models in Edge Devices with Scalable Accuracy and Compute Complexity [5.820612543019548]
LoRa-FL is designed for training low-rank one-shot image detection models deployed on edge devices. By incorporating low-rank adaptation techniques into one-shot detection architectures, our method significantly reduces both computational and communication overhead.
arXiv Detail & Related papers (2025-04-23T08:40:44Z)
Robust and Real-time Surface Normal Estimation from Stereo Disparities using Affine Transformations [6.322193856514675]
This work introduces a novel method for surface normal estimation from rectified stereo image pairs. We develop a custom algorithm inspired by convolutional operations, tailored to process disparity data efficiently. Our method is validated using both simulated environments and real-world stereo images from the Middlebury and Cityscapes datasets.
arXiv Detail & Related papers (2025-04-21T14:19:00Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
SySMOL: Co-designing Algorithms and Hardware for Neural Networks with Heterogeneous Precisions [20.241671088121144]
Recent quantization techniques have enabled heterogeneous precisions at very fine granularity. These networks require additional hardware to decode the precision settings for individual variables, align the variables, and provide fine-grained mixed-precision compute capabilities. We present an end-to-end co-design approach to efficiently execute networks with fine-grained heterogeneous precisions.
arXiv Detail & Related papers (2023-11-23T17:20:09Z)
Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing. Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16. It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z)
Neural Disparity Refinement for Arbitrary Resolution Stereo [67.55946402652778]
We introduce a novel architecture for neural disparity refinement aimed at facilitating deployment of 3D computer vision on cheap and widespread consumer devices. Our approach relies on a continuous formulation that enables estimating a refined disparity map at any arbitrary output resolution.
arXiv Detail & Related papers (2021-10-28T18:00:00Z)
End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches. Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module. We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z)
Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors struggle to strike a balance between high accuracy and fast inference speed. We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation. The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z)
Adaptive Deconvolution-based stereo matching Net for Local Stereo Matching [11.214543038438055]
In deep learning-based local stereo matching methods, larger image patches usually bring better stereo matching accuracy. We propose an efficient CNN-based structure, namely Adaptive Deconvolution-based disparity matching Net (ADSM net). Experimental results on the KITTI 2012 and 2015 datasets demonstrate that the proposed method can achieve a good trade-off between accuracy and complexity.
arXiv Detail & Related papers (2021-01-01T12:18:53Z)
Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy. But their inference time is typically slow, on the order of seconds for a pair of 540p images. We propose a displacement-invariant cost module to compute the matching costs without needing a 4D feature volume.
arXiv Detail & Related papers (2020-12-01T23:58:16Z)
Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step. Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
arXiv Detail & Related papers (2020-08-21T19:35:54Z)
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection. The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and is both highly accurate and hardware friendly. Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.