Unsupervised Complementary-aware Multi-process Fusion for Visual Place
Recognition
- URL: http://arxiv.org/abs/2112.04701v1
- Date: Thu, 9 Dec 2021 04:57:33 GMT
- Title: Unsupervised Complementary-aware Multi-process Fusion for Visual Place
Recognition
- Authors: Stephen Hausler, Tobias Fischer and Michael Milford
- Abstract summary: We propose an unsupervised algorithm that finds the most robust set of VPR techniques to use in the current deployment environment.
The proposed dynamic multi-process fusion (Dyn-MPF) has superior VPR performance compared to a variety of challenging competitive methods.
- Score: 28.235055888073855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A recent approach to the Visual Place Recognition (VPR) problem has been to
fuse the place recognition estimates of multiple complementary VPR techniques
simultaneously. However, selecting the optimal set of techniques to use in a
specific deployment environment a-priori is a difficult and unresolved
challenge. Further, to the best of our knowledge, no method exists which can
select a set of techniques on a frame-by-frame basis in response to
image-to-image variations. In this work, we propose an unsupervised algorithm
that finds the most robust set of VPR techniques to use in the current
deployment environment, on a frame-by-frame basis. The selection of techniques
is determined by an analysis of the similarity scores between the current query
image and the collection of database images and does not require ground-truth
information. We demonstrate our approach on a wide variety of datasets and VPR
techniques and show that the proposed dynamic multi-process fusion (Dyn-MPF)
has superior VPR performance compared to a variety of challenging competitive
methods, some of which are given an unfair advantage through access to the
ground-truth information.
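The abstract does not spell out the exact score-analysis criterion, so the following is only a minimal sketch of the frame-by-frame idea: for each query, every candidate technique's similarity scores against the database are inspected without ground truth, the techniques whose best match stands out most clearly are kept, and their normalised scores are fused. The best-vs-second-best ratio and the sum-of-normalised-scores fusion rule below are illustrative assumptions, not the criterion defined in the paper.

```python
import numpy as np

def select_and_fuse(similarity_matrix, top_k=2):
    """Sketch of per-frame technique selection and fusion (not the paper's exact rule).

    similarity_matrix: (num_techniques, num_database_images) array of similarity
    scores between the current query image and every database image, one row per
    VPR technique.
    """
    num_techniques = similarity_matrix.shape[0]

    # 1. Score each technique's confidence on this query only, without ground
    #    truth: how much its best match stands out from the runner-up.
    confidences = np.empty(num_techniques)
    for t in range(num_techniques):
        sorted_scores = np.sort(similarity_matrix[t])[::-1]
        confidences[t] = sorted_scores[0] / (sorted_scores[1] + 1e-9)

    # 2. Keep the top_k most confident techniques for this frame.
    selected = np.argsort(confidences)[::-1][:top_k]

    # 3. Min-max normalise each selected technique's scores and sum them,
    #    a common multi-process fusion rule (an assumption here).
    fused = np.zeros(similarity_matrix.shape[1])
    for t in selected:
        scores = similarity_matrix[t]
        fused += (scores - scores.min()) / (scores.max() - scores.min() + 1e-9)

    return int(np.argmax(fused)), selected
```

With, say, four techniques and a 1000-image database, `select_and_fuse(S, top_k=2)` returns the fused best-match index and the technique subset chosen for that frame; the subset can change from frame to frame as the score statistics change.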
Related papers
- A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding [76.44979557843367]
We propose a novel multi-view stereo (MVS) framework that removes the need for a depth-range prior.
We introduce a Multi-view Disparity Attention (MDA) module to aggregate long-range context information.
We explicitly estimate the quality of the current pixel corresponding to sampled points on the epipolar line of the source image.
arXiv Detail & Related papers (2024-11-04T08:50:16Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
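As a rough illustration of the "correlate multiple images within a batch" idea, the sketch below runs self-attention across the per-image descriptors of a batch so that each descriptor is refined using the others. The module name, descriptor size, residual/normalisation layout, and use of PyTorch's built-in MultiheadAttention are assumptions for illustration, not CricaVPR's actual architecture.

```python
import torch
import torch.nn as nn

class BatchCrossImageAttention(nn.Module):
    """Illustrative module: treat the B per-image descriptors in a batch as a
    length-B sequence and let self-attention exchange information between the
    images, yielding correlation-aware descriptors (all dimensions assumed)."""

    def __init__(self, dim=512, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, descriptors):
        # descriptors: (B, dim) global descriptors, one per image in the batch.
        x = descriptors.unsqueeze(0)        # (1, B, dim): the batch as a sequence
        attended, _ = self.attn(x, x, x)    # each image attends to the others
        out = self.norm(x + attended)       # residual + layer norm
        return out.squeeze(0)               # back to (B, dim)

# Example: 16 images with 512-D descriptors produce 16 correlation-aware descriptors.
features = torch.randn(16, 512)
refined = BatchCrossImageAttention()(features)
```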
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Multi-Technique Sequential Information Consistency For Dynamic Visual Place Recognition In Changing Environments [23.33092172788319]
Visual place recognition (VPR) is an essential component of robot navigation and localization systems.
No single VPR technique excels in every environmental condition.
We propose a VPR system dubbed Multi-Sequential Information Consistency (MuSIC).
arXiv Detail & Related papers (2024-01-16T10:35:01Z)
- Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z)
- Learning from Multi-Perception Features for Real-World Image Super-resolution [87.71135803794519]
We propose a novel SR method called MPF-Net that leverages multiple perceptual features of input images.
Our method incorporates a Multi-Perception Feature Extraction (MPFE) module to extract diverse perceptual information.
We also introduce a contrastive regularization term (CR) that improves the model's learning capability.
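The summary only names a contrastive regularization term, so the snippet below shows one common form such a term takes in restoration work: the restored image is pulled toward the ground-truth image and pushed away from the degraded input in a feature space. The L1 feature distances, the ratio formulation, and the `feat_extractor` argument are assumptions; the exact CR used in MPF-Net may differ.

```python
import torch.nn.functional as F

def contrastive_regularization(sr, hr, lr_upsampled, feat_extractor):
    """Generic contrastive regularisation for super-resolution (a sketch only).

    feat_extractor: any frozen feature network, e.g. a few VGG layers
    (an assumption -- the paper's choice of feature space is not given here).
    """
    f_sr = feat_extractor(sr)            # features of the restored image
    f_hr = feat_extractor(hr)            # positive: ground-truth HR image
    f_lr = feat_extractor(lr_upsampled)  # negative: upsampled degraded input

    pos = F.l1_loss(f_sr, f_hr)          # distance to the positive
    neg = F.l1_loss(f_sr, f_lr)          # distance to the negative
    return pos / (neg + 1e-7)            # small when close to HR and far from LR
```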
arXiv Detail & Related papers (2023-05-26T07:35:49Z)
- A-MuSIC: An Adaptive Ensemble System For Visual Place Recognition In Changing Environments [22.58641358408613]
Visual place recognition (VPR) is an essential component of robot navigation and localization systems.
No single VPR technique excels in every environmental condition.
We propose an adaptive VPR system dubbed Adaptive Multi-Self Identification and Correction (A-MuSIC).
A-MuSIC matches or beats state-of-the-art VPR performance across all tested benchmark datasets.
arXiv Detail & Related papers (2023-03-24T19:25:22Z)
- Boosting Performance of a Baseline Visual Place Recognition Technique by Predicting the Maximally Complementary Technique [25.916992891359055]
One recent promising approach to the Visual Place Recognition problem has been to fuse the place recognition estimates of multiple complementary VPR techniques.
These approaches require all potential VPR methods to be brute-force run before they are selectively fused.
Here we propose an alternative approach that instead starts with a known single base VPR technique, and learns to predict the most complementary additional VPR technique to fuse with it.
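A minimal sketch of that stated idea, assuming the predictor is a simple classifier over the base technique's query descriptor: it is trained offline on examples of which second technique fused best with the base, and at run time only the predicted technique would additionally be run and fused. The logistic-regression choice, the min-max normalisation, and the summation fusion are illustrative assumptions, not the paper's design.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_complementarity_predictor(query_descriptors, best_partner):
    """query_descriptors: (N, D) base-technique descriptors of training queries.
    best_partner: (N,) integer index of the technique that fused best with the
    base for each training query (collected offline). Both are assumed inputs."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(query_descriptors, best_partner)
    return clf

def localise(query_descriptor, base_scores, extra_scores_by_technique, clf):
    """Run the base technique, predict one complementary technique, fuse both.
    In the actual approach only the predicted technique would then be executed;
    here all candidate score vectors are precomputed for simplicity."""
    predicted = int(clf.predict(query_descriptor.reshape(1, -1))[0])
    extra_scores = extra_scores_by_technique[predicted]

    def norm(s):  # simple min-max normalisation (an assumption)
        return (s - s.min()) / (s.max() - s.min() + 1e-9)

    fused = norm(base_scores) + norm(extra_scores)
    return int(np.argmax(fused)), predicted
```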
arXiv Detail & Related papers (2022-10-14T04:32:23Z)
- Improving Visual Place Recognition Performance by Maximising Complementarity [22.37892767050086]
This paper investigates the complementarity of state-of-the-art VPR methods systematically for the first time.
It identifies those combinations which can result in better performance.
Results are presented for eight state-of-the-art VPR methods on ten widely-used VPR datasets.
arXiv Detail & Related papers (2021-02-16T19:18:33Z)
- Intelligent Reference Curation for Visual Place Recognition via Bayesian Selective Fusion [24.612272323346144]
A key challenge in visual place recognition is recognizing places despite drastic visual appearance changes.
We propose a novel approach, dubbed Bayesian Selective Fusion, for actively selecting and fusing informative reference images.
Our approach is well suited for long-term robot autonomy where dynamic visual environments are commonplace.
arXiv Detail & Related papers (2020-10-19T05:17:35Z)
- Weakly supervised cross-domain alignment with optimal transport [102.8572398001639]
Cross-domain alignment between image objects and text sequences is key to many visual-language tasks.
This paper investigates a novel approach for the identification and optimization of fine-grained semantic similarities between image and text entities.
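To make the optimal-transport idea concrete, the sketch below computes an entropically regularised transport plan via Sinkhorn iterations between image-region features and word embeddings; the cosine-distance cost, uniform marginals, and hyperparameters are assumptions, not the paper's formulation.

```python
import numpy as np

def sinkhorn_alignment(image_feats, text_feats, epsilon=0.05, n_iters=100):
    """Entropic-OT transport plan between n image regions and m text tokens (a sketch)."""
    # Cosine-distance cost matrix of shape (n, m).
    a = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    b = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T

    n, m = cost.shape
    mu, nu = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform marginals (assumed)

    K = np.exp(-cost / epsilon)   # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):      # Sinkhorn scaling iterations
        v = nu / (K.T @ u)
        u = mu / (K @ v)

    # Soft alignment: row i is a distribution over text tokens for image region i.
    return np.diag(u) @ K @ np.diag(v)
```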
arXiv Detail & Related papers (2020-08-14T22:48:36Z)
- MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution [63.02785017714131]
Video super-resolution (VSR) aims to utilize multiple low-resolution frames to generate a high-resolution prediction for each frame.
Inter-frame and intra-frame correspondences are the key sources for exploiting temporal and spatial information.
We build an effective multi-correspondence aggregation network (MuCAN) for VSR.
arXiv Detail & Related papers (2020-07-23T05:41:27Z)