HM4: Hidden Markov Model with Memory Management for Visual Place
Recognition
- URL: http://arxiv.org/abs/2011.00450v1
- Date: Sun, 1 Nov 2020 08:49:24 GMT
- Title: HM4: Hidden Markov Model with Memory Management for Visual Place
Recognition
- Authors: Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Ian Reid
- Abstract summary: We develop a Hidden Markov Model approach for visual place recognition in autonomous driving.
Our algorithm, dubbed HM$4$, exploits temporal look-ahead to transfer promising candidate images between passive storage and active memory.
We show that this allows constant time and space inference for a fixed coverage area.
- Score: 54.051025148533554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual place recognition needs to be robust against appearance variability
due to natural and man-made causes. Training data collection should thus be an
ongoing process to allow continuous appearance changes to be recorded. However,
this creates an unboundedly-growing database that poses time and memory
scalability challenges for place recognition methods. To tackle the scalability
issue for visual place recognition in autonomous driving, we develop a Hidden
Markov Model approach with a two-tiered memory management. Our algorithm,
dubbed HM$^4$, exploits temporal look-ahead to transfer promising candidate
images between passive storage and active memory when needed. The inference
process takes into account both promising images and a coarse representation
of the full database. We show that this allows constant time and space
inference for a fixed coverage area. The coarse representations can also be
updated incrementally to absorb new data. To further reduce the memory
requirements, we derive a compact image representation inspired by Locality
Sensitive Hashing (LSH). Through experiments on real world data, we demonstrate
the excellent scalability and accuracy of the approach under appearance changes
and provide comparisons against state-of-the-art techniques.
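The two ingredients named in the abstract, an LSH-inspired compact binary descriptor and HMM forward inference restricted to an active subset of the database, can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the function names, dimensions, the random-hyperplane hashing, and the band-diagonal transition model are all assumptions made here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- LSH-inspired compact codes: sign of random projections ---
def binary_codes(features, planes):
    """Map D-dim descriptors to K-bit binary codes via random hyperplanes."""
    return (features @ planes.T > 0).astype(np.uint8)

def hamming_similarity(code, codes):
    """Similarity in [0, 1]: fraction of matching bits against each database code."""
    return 1.0 - np.mean(code != codes, axis=1)

# --- HMM forward step over the "active memory" subset only ---
def forward_step(belief, active, transition, likelihood):
    """One forward-recursion update, touching only the active place indices."""
    b = belief[active]
    b = transition[np.ix_(active, active)].T @ b   # transition (motion) model
    b = b * likelihood[active]                     # observation model
    new_belief = np.zeros_like(belief)
    new_belief[active] = b / (b.sum() + 1e-12)     # renormalize over active set
    return new_belief

# Toy setup: N database places with D-dim descriptors, compressed to K bits
N, D, K = 200, 64, 32
planes = rng.standard_normal((K, D))
db = rng.standard_normal((N, D))
db_codes = binary_codes(db, planes)

# Query observed near place 50; likelihood from Hamming similarity of codes
query = db[50] + 0.1 * rng.standard_normal(D)
q_code = binary_codes(query[None, :], planes)[0]
lik = hamming_similarity(q_code, db_codes)

# Band-diagonal transition matrix: a vehicle can only move to nearby places
T = np.zeros((N, N))
for i in range(N):
    for j in range(max(0, i - 2), min(N, i + 3)):
        T[i, j] = 1.0
T /= T.sum(axis=1, keepdims=True)

belief = np.full(N, 1.0 / N)
active = np.arange(40, 61)  # look-ahead window held in active memory
belief = forward_step(belief, active, T, lik)
print(belief.argmax())      # most likely place index
```

Because each update only touches the active window and the compact codes, the per-query cost stays fixed regardless of how large the full (passively stored) database grows, which is the constant time-and-space property the abstract claims.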
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Register assisted aggregation for Visual Place Recognition [4.5476780843439535]
Visual Place Recognition (VPR) refers to the process of using computer vision to recognize the position of the current query image.
Previous methods often discarded useless features, but in doing so they also uncontrollably discarded features that help improve recognition accuracy.
We propose a new feature aggregation method to address this issue. Specifically, in order to obtain global and local features that contain discriminative place information, we added some registers on top of the original image tokens.
arXiv Detail & Related papers (2024-05-19T11:36:52Z)
- Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings [22.338328674283062]
We introduce an innovative image transformation technique that renders facial images unrecognizable to the eye while maintaining their identifiability by neural network models.
The proposed methodology can be used in various artificial intelligence applications to distort the visual data and keep the derived features close.
We show that it is possible to build the distortion that changes the image content by more than 70% while maintaining the same recognition accuracy.
arXiv Detail & Related papers (2024-01-26T18:20:53Z)
- Rethinking Exemplars for Continual Semantic Segmentation in Endoscopy Scenes: Entropy-based Mini-Batch Pseudo-Replay [18.383604936008744]
Endoscopy is a widely used technique for the early detection of diseases and for robotic-assisted minimally invasive surgery (RMIS).
Existing deep learning (DL) models may suffer from catastrophic forgetting.
Data privacy and storage issues may lead to the unavailability of old data when updating the model.
We propose an Endoscopy Continual Semantic Segmentation (EndoCSS) framework that avoids the storage and privacy issues of retaining old data.
arXiv Detail & Related papers (2023-08-27T13:07:44Z)
- Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory [59.51934126717572]
Black-box unsupervised domain adaptation (UDA) learns with source predictions of target data without accessing either source data or source models during training.
We propose BiMem, a bi-directional memorization mechanism that learns to remember useful and representative information to correct noisy pseudo labels on the fly.
BiMem achieves superior domain adaptation performance consistently across various visual recognition tasks such as image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2023-08-25T08:06:48Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracies in ImageNet-LT, Places-LT and Webvision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- Unsupervised Feature Learning for Event Data: Direct vs Inverse Problem Formulation [53.850686395708905]
Event-based cameras record an asynchronous stream of per-pixel brightness changes.
In this paper, we focus on single-layer architectures for representation learning from event data.
We show improvements of up to 9% in recognition accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:40:03Z)
- Learning Invariant Representations for Reinforcement Learning without Reconstruction [98.33235415273562]
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
Bisimulation metrics quantify behavioral similarity between states in continuous MDPs.
We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks.
arXiv Detail & Related papers (2020-06-18T17:59:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.