Merging Classification Predictions with Sequential Information for
Lightweight Visual Place Recognition in Changing Environments
- URL: http://arxiv.org/abs/2210.00834v1
- Date: Mon, 3 Oct 2022 11:42:08 GMT
- Authors: Bruno Arcanjo, Bruno Ferrarini, Michael Milford, Klaus D.
McDonald-Maier and Shoaib Ehsan
- Abstract summary: Low-overhead visual place recognition (VPR) is a highly active research topic.
Mobile robotics applications often run on low-end hardware, and even more capable systems can still benefit from freeing up onboard system resources for other navigation tasks.
This work addresses lightweight VPR by proposing a novel system based on the combination of binary-weighted classifier networks with a one-dimensional convolutional network, dubbed merger.
- Score: 22.58641358408613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-overhead visual place recognition (VPR) is a highly active research
topic. Mobile robotics applications often run on low-end hardware, and
even more capable systems can still benefit from freeing up onboard
system resources for other navigation tasks. This work addresses lightweight
VPR by proposing a novel system based on the combination of binary-weighted
classifier networks with a one-dimensional convolutional network, dubbed
merger. Recent work in fusing multiple VPR techniques has mainly focused on
increasing VPR performance, with computational efficiency not being highly
prioritized. In contrast, we design our technique prioritizing low inference
times, taking inspiration from the machine learning literature where the
efficient combination of classifiers is a heavily researched topic. Our
experiments show that the merger achieves inference times as low as 1
millisecond, being significantly faster than other well-established lightweight
VPR techniques, while achieving comparable or superior VPR performance under
several visual changes such as seasonal variations and lateral viewpoint
shifts.
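The abstract describes merging the predictions of several classifiers with a one-dimensional convolution. The sketch below is only one plausible reading of that idea, not the authors' implementation: it assumes each classifier outputs one score per reference place, that consecutive place indices are consecutive locations along a traverse, and that the convolution runs along the place axis with one input channel per classifier. The function name `merge_scores`, the kernel shape, the uniform kernel weights, and the toy scores are all hypothetical; in a real system the kernel would be learned.

```python
import numpy as np


def merge_scores(scores, kernel):
    """Merge per-classifier place scores with a 1D convolution.

    scores: array of shape (n_classifiers, n_places).
    kernel: array of shape (n_classifiers, width), width odd.
    Returns an array of shape (n_places,) with one merged score per place.
    As in most neural-network libraries, the kernel is applied without
    flipping (cross-correlation), with zero padding at both ends.
    """
    scores = np.asarray(scores, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_clf, _ = scores.shape
    k_clf, width = kernel.shape
    if k_clf != n_clf:
        raise ValueError("kernel needs one row per classifier")
    if width % 2 == 0:
        raise ValueError("kernel width must be odd")

    pad = width // 2
    padded = np.pad(scores, ((0, 0), (pad, pad)), mode="constant")
    # windows[c, p, :] holds the scores of classifier c around place p.
    windows = np.lib.stride_tricks.sliding_window_view(padded, width, axis=1)
    # Sum over classifiers (c) and window offsets (w) for every place (p).
    return np.einsum("cpw,cw->p", windows, kernel)


# Toy example (made-up numbers): three classifiers, five reference places.
# The second classifier on its own would pick place 1 instead of place 2.
scores = np.array([
    [0.1, 0.2, 0.9, 0.3, 0.1],
    [0.2, 0.8, 0.7, 0.1, 0.0],
    [0.0, 0.3, 0.8, 0.4, 0.1],
])
kernel = np.full((3, 3), 1.0 / 9.0)  # hypothetical uniform weights
merged = merge_scores(scores, kernel)
best_place = int(np.argmax(merged))
print(best_place)
```

With these toy scores the merged prediction settles on place 2, because the neighbouring places and the other two classifiers outweigh the single disagreeing classifier. The cost is a small fixed number of multiply-adds per place, which is in the spirit of the low inference times the abstract emphasizes.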
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- ViR: Towards Efficient Vision Retention Backbones [97.93707844681893]
We propose a new class of computer vision models, dubbed Vision Retention Networks (ViR).
ViR has dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance.
We have validated the effectiveness of ViR through extensive experiments with different dataset sizes and various image resolutions.
arXiv Detail & Related papers (2023-10-30T16:55:50Z)
- Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder [58.523884148942166]
We propose two novel techniques to improve audio-visual speech recognition (AVSR) under a pre-training and fine-tuning training framework.
First, we explore the correlation between lip shapes and syllable-level subword units in Mandarin to establish good frame-level syllable boundaries from lip shapes.
Next, we propose an audio-guided cross-modal fusion encoder (CMFE) neural network to utilize main training parameters for multiple cross-modal attention layers.
arXiv Detail & Related papers (2023-08-14T08:19:24Z)
- Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
- A-MuSIC: An Adaptive Ensemble System For Visual Place Recognition In Changing Environments [22.58641358408613]
Visual place recognition (VPR) is an essential component of robot navigation and localization systems.
No single VPR technique excels in every environmental condition.
The paper presents an adaptive VPR system dubbed Adaptive Multi-Self Identification and Correction (A-MuSIC).
A-MuSIC matches or beats state-of-the-art VPR performance across all tested benchmark datasets.
arXiv Detail & Related papers (2023-03-24T19:25:22Z)
- MixVPR: Feature Mixing for Visual Place Recognition [3.6739949215165164]
Visual Place Recognition (VPR) is a crucial part of mobile robotics and autonomous driving.
We introduce MixVPR, a new holistic feature aggregation technique that takes feature maps from pre-trained backbones as a set of global features.
We demonstrate the effectiveness of our technique through extensive experiments on multiple large-scale benchmarks.
arXiv Detail & Related papers (2023-03-03T19:24:03Z)
- SwitchHit: A Probabilistic, Complementarity-Based Switching System for Improved Visual Place Recognition in Changing Environments [20.917586014941033]
There is no universal VPR technique that can work in all types of environments.
Running multiple VPR techniques in parallel may be prohibitive for resource-constrained embedded platforms.
This paper presents a probabilistic complementarity based switching VPR system, SwitchHit.
arXiv Detail & Related papers (2022-03-01T16:23:22Z)
- Video Coding for Machine: Compact Visual Representation Compression for Intelligent Collaborative Analytics [101.35754364753409]
Video Coding for Machines (VCM) aims to bridge the somewhat separate research tracks of video/image compression and feature compression.
This paper summarizes VCM methodology and philosophy based on existing academic and industrial efforts.
arXiv Detail & Related papers (2021-10-18T12:42:13Z)
- An Efficient and Scalable Collection of Fly-inspired Voting Units for Visual Place Recognition in Changing Environments [20.485491385050615]
Low-overhead VPR techniques would enable platforms equipped with low-end, cheap hardware.
Our goal is to provide an algorithm of extreme compactness and efficiency while achieving state-of-the-art robustness to appearance changes and small point-of-view variations.
arXiv Detail & Related papers (2021-09-22T19:01:20Z)
- Real-Time Visual Object Tracking via Few-Shot Learning [107.39695680340877]
Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).
We propose a two-stage framework that is capable of employing a large variety of FSL algorithms while presenting faster adaptation speed.
Experiments on the major benchmarks, VOT2018, OTB2015, NFS, UAV123, TrackingNet, and GOT-10k are conducted, demonstrating a desirable performance gain and a real-time speed.
arXiv Detail & Related papers (2021-03-18T10:02:03Z)
- VPR-Bench: An Open-Source Visual Place Recognition Evaluation Framework with Quantifiable Viewpoint and Appearance Change [25.853640977526705]
VPR research has grown rapidly as a field over the past decade due to improving camera hardware and its potential for deep learning-based techniques.
This growth has led to fragmentation and a lack of standardisation in the field, especially concerning performance evaluation.
In this paper, we address these gaps through a new comprehensive open-source framework for assessing the performance of VPR techniques, dubbed "VPR-Bench".
arXiv Detail & Related papers (2020-05-17T00:27:53Z)