Design Space Exploration of Low-Bit Quantized Neural Networks for Visual
Place Recognition
- URL: http://arxiv.org/abs/2312.09028v1
- Date: Thu, 14 Dec 2023 15:24:42 GMT
- Title: Design Space Exploration of Low-Bit Quantized Neural Networks for Visual
Place Recognition
- Authors: Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn
and Shoaib Ehsan
- Abstract summary: Visual Place Recognition (VPR) is a critical task for performing global re-localization in visual perception systems.
Recently, new works have focused on the recall@1 metric as a performance measure, with limited focus on resource utilization.
This has resulted in methods that use deep learning models too large to deploy on low-powered edge devices.
We study the impact of compact convolutional network architecture design in combination with full-precision and mixed-precision post-training quantization on VPR performance.
- Score: 26.213493552442102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Place Recognition (VPR) is a critical task for performing global
re-localization in visual perception systems. It requires the ability to
accurately recognize a previously visited location under variations such as
illumination, occlusion, appearance and viewpoint. In the case of robotic
systems and augmented reality, the target devices for deployment are battery
powered edge devices. Therefore, whilst the accuracy of VPR methods is
important, so too are memory consumption and latency. Recently, new works have
focused on the recall@1 metric as a performance measure, with limited focus on
resource utilization. This has resulted in methods that use deep learning
models too large to deploy on low-powered edge devices. We hypothesize that
these large
models are highly over-parameterized and can be optimized to satisfy the
constraints of a low-powered embedded system whilst maintaining high recall
performance. Our work studies the impact of compact convolutional network
architecture design in combination with full-precision and mixed-precision
post-training quantization on VPR performance. Importantly, we not only measure
performance via the recall@1 score but also measure memory consumption and
latency. We characterize the design implications on memory, latency and recall
scores and provide a number of design recommendations for VPR systems under
these resource limitations.
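
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of one plausible realization under stated assumptions: a compact convolutional descriptor network compressed with post-training INT8 quantization, then measured for model size and inference latency. The TinyVPRBackbone, calibration batches, and measurement helpers are hypothetical stand-ins, not the authors' code; uniform INT8 is shown for brevity, whereas the paper's mixed-precision setting would assign different qconfigs (bit-widths) per layer.

# Hedged sketch: post-training static INT8 quantization of a small CNN
# descriptor extractor, plus simple memory/latency measurement. The backbone,
# calibration data, and helper names are illustrative assumptions.
import os
import time
import torch
import torch.nn as nn

class TinyVPRBackbone(nn.Module):
    """Hypothetical compact CNN mapping an image to a global descriptor."""
    def __init__(self, descriptor_dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, descriptor_dim)
        # Quant/dequant stubs mark the region converted to INT8.
        self.quant = torch.ao.quantization.QuantStub()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.features(x).flatten(1)
        x = self.head(x)
        return self.dequant(x)

def quantize_ptq(model: nn.Module, calib_batches) -> nn.Module:
    """Post-training static quantization: calibrate on a few batches, then convert."""
    model.eval()
    model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")  # x86 CPU backend
    prepared = torch.ao.quantization.prepare(model)
    with torch.no_grad():
        for batch in calib_batches:          # observers record activation ranges
            prepared(batch)
    return torch.ao.quantization.convert(prepared)

def model_size_mb(model: nn.Module) -> float:
    """Serialized weight size in MB, a proxy for memory consumption."""
    torch.save(model.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return size

def latency_ms(model: nn.Module, x: torch.Tensor, iters: int = 50) -> float:
    """Average single-image CPU inference latency in milliseconds."""
    with torch.no_grad():
        for _ in range(5):                   # warm-up
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
    return (time.perf_counter() - start) / iters * 1e3

if __name__ == "__main__":
    fp32 = TinyVPRBackbone()
    dummy = torch.randn(1, 3, 224, 224)
    calib = [torch.randn(8, 3, 224, 224) for _ in range(4)]   # placeholder calibration data
    int8 = quantize_ptq(fp32, calib)
    print(f"fp32: {model_size_mb(fp32):.2f} MB, {latency_ms(fp32, dummy):.1f} ms")
    print(f"int8: {model_size_mb(int8):.2f} MB, {latency_ms(int8, dummy):.1f} ms")

In this setting, recall@1 would be computed separately by extracting descriptors for query and reference images with the quantized model and checking whether the nearest reference descriptor corresponds to the correct place.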
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge [6.643376250301589]
Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly.
Despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices.
This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices.
arXiv Detail & Related papers (2024-10-15T13:25:43Z) - Structured Pruning for Efficient Visual Place Recognition [24.433604332415204]
Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices.
Our work introduces a novel structured pruning method to streamline common VPR architectures.
This dual focus significantly enhances the efficiency of the system, reducing both map and model memory requirements and decreasing feature extraction and retrieval latencies.
arXiv Detail & Related papers (2024-09-12T08:32:25Z) - HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models [96.76995840807615]
HiRes-LLaVA is a novel framework designed to process any size of high-resolution input without altering the original contextual and geometric information.
HiRes-LLaVA comprises two innovative components: (i) a SliceRestore adapter that reconstructs sliced patches into their original form, efficiently extracting both global and local features via down-up-sampling and convolution layers, and (ii) a Self-Mining Sampler to compress the vision tokens based on themselves.
arXiv Detail & Related papers (2024-07-11T17:42:17Z) - Compressing the Backward Pass of Large-Scale Neural Architectures by
Structured Activation Pruning [0.0]
Sparsity in Deep Neural Networks (DNNs) has gained attention as a solution.
This work focuses on ephemeral sparsity, aiming to reduce memory consumption during training.
We report the effectiveness of activation pruning by evaluating training speed, accuracy, and memory usage of large-scale neural architectures.
arXiv Detail & Related papers (2023-11-28T15:31:31Z) - LGC-Net: A Lightweight Gyroscope Calibration Network for Efficient
Attitude Estimation [10.468378902106613]
We present a calibration neural network model for denoising low-cost microelectromechanical system (MEMS) gyroscopes and estimating the attitude of a robot in real time.
The key idea is to extract local and global features from a time window of inertial measurement unit (IMU) measurements and dynamically regress the output compensation components for the gyroscope.
The proposed algorithm is evaluated on the EuRoC and TUM-VI datasets and achieves state-of-the-art performance on the (unseen) test sequences with a more lightweight model structure.
arXiv Detail & Related papers (2022-09-19T08:03:03Z) - Incremental Online Learning Algorithms Comparison for Gesture and Visual
Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z) - Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern
Recognition on Neuromorphic Hardware [50.380319968947035]
Recent deep learning approaches have reached high accuracy in such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive.
We propose a new benchmark for computing tactile pattern recognition at the edge through letter reading.
We trained and compared feed-forward and recurrent spiking neural networks (SNNs) offline using back-propagation through time with surrogate gradients, then deployed them on the Intel Loihi neuromorphic chip for efficient inference.
Our results show that the LSTM outperforms the recurrent SNN in terms of accuracy by 14%. However, the recurrent SNN on Loihi is 237 times more energy efficient.
arXiv Detail & Related papers (2022-05-30T14:30:45Z) - Improving Computational Efficiency in Visual Reinforcement Learning via
Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
arXiv Detail & Related papers (2021-02-11T12:00:24Z) - Binary Neural Networks for Memory-Efficient and Effective Visual Place
Recognition in Changing Environments [24.674034243725455]
Visual place recognition (VPR) is a robot's ability to determine whether a place was visited before using visual data.
CNN-based approaches are unsuitable for resource-constrained platforms, such as small robots and drones.
We propose a new class of highly compact models that drastically reduces the memory requirements and computational effort (see the sketch after this list).
arXiv Detail & Related papers (2020-10-01T22:59:34Z)
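
The binary-network entry above is closest to the low-bit theme of the main paper. Purely as an illustration, not the cited authors' implementation, the following minimal sketch shows the usual building block of such models: a convolution whose weights are binarized with the sign function, trained through a straight-through estimator (STE), with a per-filter scaling factor to preserve dynamic range. All class and variable names here are hypothetical.

# Hedged sketch: binarized convolution with a straight-through estimator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Forward: sign(x) in {-1, +1}. Backward: pass gradients through where |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

class BinaryConv2d(nn.Module):
    """Conv layer with 1-bit weights; latent full-precision weights are kept for training."""
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.stride, self.padding = stride, padding

    def forward(self, x):
        # Per-filter scaling factor (mean absolute value) preserves dynamic range.
        alpha = self.weight.abs().mean(dim=(1, 2, 3), keepdim=True)
        w_bin = BinarizeSTE.apply(self.weight) * alpha
        return F.conv2d(x, w_bin, stride=self.stride, padding=self.padding)

if __name__ == "__main__":
    layer = BinaryConv2d(3, 16, 3, padding=1)
    out = layer(torch.randn(2, 3, 64, 64))
    out.sum().backward()                     # gradients reach the latent weights via the STE
    print(out.shape, layer.weight.grad is not None)

Storing only the weight signs plus one scale per filter gives up to roughly a 32x reduction in weight storage relative to 32-bit floats, which is what makes such models attractive for small robots and drones.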