HM4: Hidden Markov Model with Memory Management for Visual Place
Recognition
- URL: http://arxiv.org/abs/2011.00450v1
- Date: Sun, 1 Nov 2020 08:49:24 GMT
- Title: HM4: Hidden Markov Model with Memory Management for Visual Place
Recognition
- Authors: Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Ian Reid
- Abstract summary: We develop a Hidden Markov Model approach for visual place recognition in autonomous driving.
Our algorithm, dubbed HM$4$, exploits temporal look-ahead to transfer promising candidate images between passive storage and active memory.
We show that this allows constant time and space inference for a fixed coverage area.
- Score: 54.051025148533554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual place recognition needs to be robust against appearance variability
due to natural and man-made causes. Training data collection should thus be an
ongoing process to allow continuous appearance changes to be recorded. However,
this creates an unboundedly-growing database that poses time and memory
scalability challenges for place recognition methods. To tackle the scalability
issue for visual place recognition in autonomous driving, we develop a Hidden
Markov Model approach with a two-tiered memory management. Our algorithm,
dubbed HM$^4$, exploits temporal look-ahead to transfer promising candidate
images between passive storage and active memory when needed. The inference
process takes into account both promising images and a coarse representation
of the full database. We show that this allows constant time and space
inference for a fixed coverage area. The coarse representations can also be
updated incrementally to absorb new data. To further reduce the memory
requirements, we derive a compact image representation inspired by Locality
Sensitive Hashing (LSH). Through experiments on real world data, we demonstrate
the excellent scalability and accuracy of the approach under appearance changes
and provide comparisons against state-of-the-art techniques.
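The two ingredients named in the abstract, an LSH-inspired compact binary descriptor and HMM forward inference restricted to an active subset of the database, can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the function names, dimensions, the random-hyperplane hashing, and the band-diagonal transition model are all assumptions made here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- LSH-inspired compact codes: sign of random projections ---
def binary_codes(features, planes):
    """Map D-dim descriptors to K-bit binary codes via random hyperplanes."""
    return (features @ planes.T > 0).astype(np.uint8)

def hamming_similarity(code, codes):
    """Similarity in [0, 1]: fraction of matching bits against each database code."""
    return 1.0 - np.mean(code != codes, axis=1)

# --- HMM forward step over the "active memory" subset only ---
def forward_step(belief, active, transition, likelihood):
    """One forward-recursion update, touching only the active place indices."""
    b = belief[active]
    b = transition[np.ix_(active, active)].T @ b   # transition (motion) model
    b = b * likelihood[active]                     # observation model
    new_belief = np.zeros_like(belief)
    new_belief[active] = b / (b.sum() + 1e-12)     # renormalize over active set
    return new_belief

# Toy setup: N database places with D-dim descriptors, compressed to K bits
N, D, K = 200, 64, 32
planes = rng.standard_normal((K, D))
db = rng.standard_normal((N, D))
db_codes = binary_codes(db, planes)

# Query observed near place 50; likelihood from Hamming similarity of codes
query = db[50] + 0.1 * rng.standard_normal(D)
q_code = binary_codes(query[None, :], planes)[0]
lik = hamming_similarity(q_code, db_codes)

# Band-diagonal transition matrix: a vehicle can only move to nearby places
T = np.zeros((N, N))
for i in range(N):
    for j in range(max(0, i - 2), min(N, i + 3)):
        T[i, j] = 1.0
T /= T.sum(axis=1, keepdims=True)

belief = np.full(N, 1.0 / N)
active = np.arange(40, 61)  # look-ahead window held in active memory
belief = forward_step(belief, active, T, lik)
print(belief.argmax())      # most likely place index
```

Because each update only touches the active window and the compact codes, the per-query cost stays fixed regardless of how large the full (passively stored) database grows, which is the constant time-and-space property the abstract claims.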
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Register assisted aggregation for Visual Place Recognition [4.5476780843439535]
Visual Place Recognition (VPR) refers to the process of using computer vision to recognize the position of the current query image.
Previous methods often discarded useless features, but in doing so they also uncontrollably discarded features that help improve recognition accuracy.
We propose a new feature aggregation method to address this issue. Specifically, in order to obtain global and local features that contain discriminative place information, we added some registers on top of the original image tokens.
arXiv Detail & Related papers (2024-05-19T11:36:52Z)
- Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings [22.338328674283062]
We introduce an innovative image transformation technique that renders facial images unrecognizable to the eye while maintaining their identifiability by neural network models.
The proposed methodology can be used in various artificial intelligence applications to distort the visual data and keep the derived features close.
We show that it is possible to build the distortion that changes the image content by more than 70% while maintaining the same recognition accuracy.
arXiv Detail & Related papers (2024-01-26T18:20:53Z)
- Rethinking Exemplars for Continual Semantic Segmentation in Endoscopy Scenes: Entropy-based Mini-Batch Pseudo-Replay [18.383604936008744]
Endoscopy is a widely used technique for the early detection of diseases and for robotic-assisted minimally invasive surgery (RMIS).
Existing deep learning (DL) models may suffer from catastrophic forgetting.
Data privacy and storage issues may lead to the unavailability of old data when updating the model.
We propose an Endoscopy Continual Semantic Segmentation (EndoCSS) framework that avoids the storage and privacy issues of retaining old data.
arXiv Detail & Related papers (2023-08-27T13:07:44Z)
- Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory [59.51934126717572]
Black-box unsupervised domain adaptation (UDA) learns with source predictions of target data without accessing either source data or source models during training.
We propose BiMem, a bi-directional memorization mechanism that learns to remember useful and representative information to correct noisy pseudo labels on the fly.
BiMem achieves superior domain adaptation performance consistently across various visual recognition tasks such as image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2023-08-25T08:06:48Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracies in ImageNet-LT, Places-LT and Webvision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- Unsupervised Feature Learning for Event Data: Direct vs Inverse Problem Formulation [53.850686395708905]
Event-based cameras record an asynchronous stream of per-pixel brightness changes.
In this paper, we focus on single-layer architectures for representation learning from event data.
We show improvements of up to 9% in recognition accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:40:03Z)
- Learning Invariant Representations for Reinforcement Learning without Reconstruction [98.33235415273562]
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
Bisimulation metrics quantify behavioral similarity between states in continuous MDPs.
We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks.
arXiv Detail & Related papers (2020-06-18T17:59:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.