Design Space Exploration of Low-Bit Quantized Neural Networks for Visual
Place Recognition
- URL: http://arxiv.org/abs/2312.09028v1
- Date: Thu, 14 Dec 2023 15:24:42 GMT
- Title: Design Space Exploration of Low-Bit Quantized Neural Networks for Visual
Place Recognition
- Authors: Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn
and Shoaib Ehsan
- Abstract summary: Visual Place Recognition (VPR) is a critical task for performing global re-localization in visual perception systems.
Recently, new works have focused on the recall@1 metric as a performance measure, with limited focus on resource utilization.
This has resulted in methods that use deep learning models too large to deploy on low-powered edge devices.
We study the impact of compact convolutional network architecture design in combination with full-precision and mixed-precision post-training quantization on VPR performance.
- Score: 26.213493552442102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Place Recognition (VPR) is a critical task for performing global
re-localization in visual perception systems. It requires the ability to
accurately recognize a previously visited location under variations such as
illumination, occlusion, appearance and viewpoint. In the case of robotic
systems and augmented reality, the target devices for deployment are battery
powered edge devices. Therefore, whilst the accuracy of VPR methods is
important, so too are memory consumption and latency. Recently, new works have
focused on the recall@1 metric as a performance measure, with limited focus on
resource utilization. This has resulted in methods that use deep learning
models too large to deploy on low-powered edge devices. We hypothesize that
these large
models are highly over-parameterized and can be optimized to satisfy the
constraints of a low-powered embedded system whilst maintaining high recall
performance. Our work studies the impact of compact convolutional network
architecture design in combination with full-precision and mixed-precision
post-training quantization on VPR performance. Importantly, we not only measure
performance via the recall@1 score but also measure memory consumption and
latency. We characterize the design implications on memory, latency and recall
scores and provide a number of design recommendations for VPR systems under
these resource limitations.
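
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of one plausible realization under stated assumptions: a compact convolutional descriptor network compressed with post-training INT8 quantization, then measured for model size and inference latency. The TinyVPRBackbone, calibration batches, and measurement helpers are hypothetical stand-ins, not the authors' code; uniform INT8 is shown for brevity, whereas the paper's mixed-precision setting would assign different qconfigs (bit-widths) per layer.

# Hedged sketch: post-training static INT8 quantization of a small CNN
# descriptor extractor, plus simple memory/latency measurement. The backbone,
# calibration data, and helper names are illustrative assumptions.
import os
import time
import torch
import torch.nn as nn

class TinyVPRBackbone(nn.Module):
    """Hypothetical compact CNN mapping an image to a global descriptor."""
    def __init__(self, descriptor_dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, descriptor_dim)
        # Quant/dequant stubs mark the region converted to INT8.
        self.quant = torch.ao.quantization.QuantStub()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.features(x).flatten(1)
        x = self.head(x)
        return self.dequant(x)

def quantize_ptq(model: nn.Module, calib_batches) -> nn.Module:
    """Post-training static quantization: calibrate on a few batches, then convert."""
    model.eval()
    model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")  # x86 CPU backend
    prepared = torch.ao.quantization.prepare(model)
    with torch.no_grad():
        for batch in calib_batches:          # observers record activation ranges
            prepared(batch)
    return torch.ao.quantization.convert(prepared)

def model_size_mb(model: nn.Module) -> float:
    """Serialized weight size in MB, a proxy for memory consumption."""
    torch.save(model.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return size

def latency_ms(model: nn.Module, x: torch.Tensor, iters: int = 50) -> float:
    """Average single-image CPU inference latency in milliseconds."""
    with torch.no_grad():
        for _ in range(5):                   # warm-up
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
    return (time.perf_counter() - start) / iters * 1e3

if __name__ == "__main__":
    fp32 = TinyVPRBackbone()
    dummy = torch.randn(1, 3, 224, 224)
    calib = [torch.randn(8, 3, 224, 224) for _ in range(4)]   # placeholder calibration data
    int8 = quantize_ptq(fp32, calib)
    print(f"fp32: {model_size_mb(fp32):.2f} MB, {latency_ms(fp32, dummy):.1f} ms")
    print(f"int8: {model_size_mb(int8):.2f} MB, {latency_ms(int8, dummy):.1f} ms")

In this setting, recall@1 would be computed separately by extracting descriptors for query and reference images with the quantized model and checking whether the nearest reference descriptor corresponds to the correct place.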
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge [6.643376250301589]
Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly.
Despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices.
This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices.
arXiv Detail & Related papers (2024-10-15T13:25:43Z) - Structured Pruning for Efficient Visual Place Recognition [24.433604332415204]
Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices.
Our work introduces a novel structured pruning method to streamline common VPR architectures.
This dual focus significantly enhances the efficiency of the system, reducing both map and model memory requirements and decreasing feature extraction and retrieval latencies.
arXiv Detail & Related papers (2024-09-12T08:32:25Z) - HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models [96.76995840807615]
HiRes-LLaVA is a novel framework designed to process any size of high-resolution input without altering the original contextual and geometric information.
HiRes-LLaVA comprises two innovative components: (i) a SliceRestore adapter that reconstructs sliced patches into their original form, efficiently extracting both global and local features via down-up-sampling and convolution layers, and (ii) a Self-Mining Sampler to compress the vision tokens based on themselves.
arXiv Detail & Related papers (2024-07-11T17:42:17Z) - Compressing the Backward Pass of Large-Scale Neural Architectures by
Structured Activation Pruning [0.0]
Sparsity in Deep Neural Networks (DNNs) has gained attention as a solution.
This work focuses on ephemeral sparsity, aiming to reduce memory consumption during training.
We report the effectiveness of activation pruning by evaluating training speed, accuracy, and memory usage of large-scale neural architectures.
arXiv Detail & Related papers (2023-11-28T15:31:31Z) - LGC-Net: A Lightweight Gyroscope Calibration Network for Efficient
Attitude Estimation [10.468378902106613]
We present a calibration neural network model for denoising low-cost microelectromechanical system (MEMS) gyroscopes and estimating the attitude of a robot in real time.
The key idea is to extract local and global features from a time window of inertial measurement unit (IMU) measurements and dynamically regress the output compensation components for the gyroscope.
The proposed algorithm is evaluated on the EuRoC and TUM-VI datasets and achieves state-of-the-art performance on the (unseen) test sequences with a more lightweight model structure.
arXiv Detail & Related papers (2022-09-19T08:03:03Z) - Incremental Online Learning Algorithms Comparison for Gesture and Visual
Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z) - Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern
Recognition on Neuromorphic Hardware [50.380319968947035]
Recent deep learning approaches have reached high accuracy in such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive.
We propose a new benchmark for computing tactile pattern recognition at the edge through letter reading.
We trained and compared feed-forward and recurrent spiking neural networks (SNNs) offline using back-propagation through time with surrogate gradients, then deployed them on the Intel Loihi neuromorphic chip for efficient inference.
Our results show that the LSTM outperforms the recurrent SNN in terms of accuracy by 14%. However, the recurrent SNN on Loihi is 237 times more energy efficient.
arXiv Detail & Related papers (2022-05-30T14:30:45Z) - Improving Computational Efficiency in Visual Reinforcement Learning via
Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
arXiv Detail & Related papers (2021-02-11T12:00:24Z) - Binary Neural Networks for Memory-Efficient and Effective Visual Place
Recognition in Changing Environments [24.674034243725455]
Visual place recognition (VPR) is a robot's ability to determine whether a place was visited before using visual data.
CNN-based approaches are unsuitable for resource-constrained platforms, such as small robots and drones.
We propose a new class of highly compact models that drastically reduces the memory requirements and computational effort (see the sketch after this list).
arXiv Detail & Related papers (2020-10-01T22:59:34Z)
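
The binary-network entry above is closest to the low-bit theme of the main paper. Purely as an illustration, not the cited authors' implementation, the following minimal sketch shows the usual building block of such models: a convolution whose weights are binarized with the sign function, trained through a straight-through estimator (STE), with a per-filter scaling factor to preserve dynamic range. All class and variable names here are hypothetical.

# Hedged sketch: binarized convolution with a straight-through estimator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Forward: sign(x) in {-1, +1}. Backward: pass gradients through where |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

class BinaryConv2d(nn.Module):
    """Conv layer with 1-bit weights; latent full-precision weights are kept for training."""
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.stride, self.padding = stride, padding

    def forward(self, x):
        # Per-filter scaling factor (mean absolute value) preserves dynamic range.
        alpha = self.weight.abs().mean(dim=(1, 2, 3), keepdim=True)
        w_bin = BinarizeSTE.apply(self.weight) * alpha
        return F.conv2d(x, w_bin, stride=self.stride, padding=self.padding)

if __name__ == "__main__":
    layer = BinaryConv2d(3, 16, 3, padding=1)
    out = layer(torch.randn(2, 3, 64, 64))
    out.sum().backward()                     # gradients reach the latent weights via the STE
    print(out.shape, layer.weight.grad is not None)

Storing only the weight signs plus one scale per filter gives up to roughly a 32x reduction in weight storage relative to 32-bit floats, which is what makes such models attractive for small robots and drones.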