SeqNet: Learning Descriptors for Sequence-based Hierarchical Place
Recognition
- URL: http://arxiv.org/abs/2102.11603v2
- Date: Wed, 24 Feb 2021 01:52:08 GMT
- Title: SeqNet: Learning Descriptors for Sequence-based Hierarchical Place
Recognition
- Authors: Sourav Garg and Michael Milford
- Abstract summary: We present a novel hybrid system that creates a high performance initial match hypothesis generator.
Sequence descriptors are generated using a temporal convolutional network dubbed SeqNet.
We then perform selective sequential score aggregation using shortlisted single image learnt descriptors to produce an overall place match hypothesis.
- Score: 31.714928102950594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Place Recognition (VPR) is the task of matching current visual imagery
from a camera to images stored in a reference map of the environment. While
initial VPR systems used simple direct image methods or hand-crafted visual
features, recent work has focused on learning more powerful visual features and
further improving performance through either some form of sequential matcher /
filter or a hierarchical matching process. In both cases the performance of the
initial single-image based system is still far from perfect, putting
significant pressure on the sequence matching or (in the case of hierarchical
systems) pose refinement stages. In this paper we present a novel hybrid system
that creates a high performance initial match hypothesis generator using short
learnt sequential descriptors, which enable selective control of sequential score
aggregation using single image learnt descriptors. Sequential descriptors are
generated using a temporal convolutional network dubbed SeqNet, encoding short
image sequences using 1-D convolutions, which are then matched against the
corresponding temporal descriptors from the reference dataset to provide an
ordered list of place match hypotheses. We then perform selective sequential
score aggregation using shortlisted single image learnt descriptors from a
separate pipeline to produce an overall place match hypothesis. Comprehensive
experiments on challenging benchmark datasets demonstrate the proposed method
outperforming recent state-of-the-art methods using the same amount of
sequential information. Source code and supplementary material can be found at
https://github.com/oravus/seqNet.
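The pipeline described above can be sketched in a few lines. This is an illustrative NumPy sketch, not the released SeqNet implementation: the kernel weights are random stand-ins for learned parameters, and the descriptor dimensions are arbitrary. It only shows the shape of the idea, a 1-D temporal convolution over a short sequence of single-image descriptors, pooled into one sequence descriptor and matched against reference descriptors to produce an ordered list of place hypotheses.

```python
import numpy as np

def seq_descriptor(frames: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Encode a short sequence of single-image descriptors (S x D) into
    one sequence descriptor: 1-D temporal convolution, average pooling,
    L2 normalisation."""
    k = w.shape[0]                       # temporal kernel size
    S, D = frames.shape
    # valid 1-D convolution along the time axis, weights shared across dims
    out = np.stack([(w[:, None] * frames[t:t + k]).sum(axis=0)
                    for t in range(S - k + 1)])
    desc = out.mean(axis=0)              # temporal average pooling
    return desc / (np.linalg.norm(desc) + 1e-12)

def rank_hypotheses(query: np.ndarray, refs: np.ndarray) -> np.ndarray:
    """Order reference places by descriptor distance (best match first)."""
    return np.argsort(np.linalg.norm(refs - query, axis=1))

rng = np.random.default_rng(0)
w = rng.standard_normal(3)               # stand-in for learned conv weights
q = seq_descriptor(rng.standard_normal((5, 16)), w)
refs = np.stack([seq_descriptor(rng.standard_normal((5, 16)), w)
                 for _ in range(10)])
order = rank_hypotheses(q, refs)
print(order[:3])                         # top-3 place match hypotheses
```

In the full system this shortlist would then be re-scored by the selective sequential aggregation stage, using the single-image descriptors of the shortlisted candidates.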
Related papers
- Sentence-level Prompts Benefit Composed Image Retrieval [69.78119883060006]
Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption.
We propose to leverage pretrained V-L models, e.g., BLIP-2, to generate sentence-level prompts.
Our proposed method performs favorably against the state-of-the-art CIR methods on the Fashion-IQ and CIRR datasets.
arXiv Detail & Related papers (2023-10-09T07:31:44Z)
- Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor [9.402103660431791]
This paper proposes an efficient match pair retrieval method and implements an integrated workflow for parallel SfM reconstruction.
The proposed solution has been verified using three large-scale datasets.
arXiv Detail & Related papers (2023-07-10T12:41:55Z)
- Graph Convolution Based Efficient Re-Ranking for Visual Retrieval [29.804582207550478]
We present an efficient re-ranking method which refines initial retrieval results by updating features.
Specifically, we reformulate re-ranking based on Graph Convolution Networks (GCN) and propose a novel Graph Convolution based Re-ranking (GCR) for visual retrieval tasks via feature propagation.
In particular, the plain GCR is extended for cross-camera retrieval and an improved feature propagation formulation is presented to leverage affinity relationships across different cameras.
arXiv Detail & Related papers (2023-06-15T00:28:08Z)
- Learning Sequence Descriptor based on Spatio-Temporal Attention for Visual Place Recognition [16.380948630155476]
Visual Place Recognition (VPR) aims to retrieve frames from a tagged database that are located at the same place as the query frame.
To improve the robustness of VPR under perceptual aliasing, sequence-based VPR methods have been proposed.
We use a sliding window to control the temporal range of attention and use relative positional encoding to construct sequential relationships between different features.
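The windowed-attention idea in this summary can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's architecture: it uses a single unprojected attention head, random features, and a random relative-position bias standing in for a learned encoding. Each frame attends only to neighbours within `radius` steps, and a bias indexed by the relative offset injects sequential order.

```python
import numpy as np

def sliding_window_attention(x: np.ndarray, radius: int,
                             rel_bias: np.ndarray) -> np.ndarray:
    """Self-attention over a feature sequence (S x D) where frame t
    attends only to frames within `radius` steps; rel_bias[delta + radius]
    is a relative positional term for offset delta in [-radius, radius]."""
    S, D = x.shape
    out = np.zeros_like(x)
    for t in range(S):
        lo, hi = max(0, t - radius), min(S, t + radius + 1)
        scores = x[lo:hi] @ x[t] / np.sqrt(D)          # dot-product scores
        scores = scores + rel_bias[(np.arange(lo, hi) - t) + radius]
        a = np.exp(scores - scores.max())              # stable softmax
        a /= a.sum()
        out[t] = a @ x[lo:hi]                          # weighted neighbours
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 16))        # 8 frames, 16-dim features
bias = rng.standard_normal(2 * 2 + 1)   # radius = 2 -> 5 possible offsets
y = sliding_window_attention(x, 2, bias)
print(y.shape)
```

Shrinking or growing `radius` is what "controls the temporal range of attention" in the summary above.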
arXiv Detail & Related papers (2023-05-19T06:39:10Z)
- ASIC: Aligning Sparse in-the-wild Image Collections [86.66498558225625]
We present a method for joint alignment of sparse in-the-wild image collections of an object category.
We use pairwise nearest neighbors obtained from deep features of a pre-trained vision transformer (ViT) model as noisy and sparse keypoint matches.
Experiments on CUB and SPair-71k benchmarks demonstrate that our method can produce globally consistent and higher quality correspondences.
arXiv Detail & Related papers (2023-03-28T17:59:28Z)
- Reuse your features: unifying retrieval and feature-metric alignment [3.845387441054033]
DRAN is the first network able to produce the features for the three steps of visual localization.
It achieves competitive performance in terms of robustness and accuracy under challenging conditions in public benchmarks.
arXiv Detail & Related papers (2022-04-13T10:42:00Z)
- Reinforcement Learning Based Query Vertex Ordering Model for Subgraph Matching [58.39970828272366]
Subgraph matching algorithms enumerate all embeddings of a query graph in a data graph G.
The matching order plays a critical role in the time efficiency of these backtracking-based subgraph matching algorithms.
In this paper, for the first time, we apply Reinforcement Learning (RL) and Graph Neural Network (GNN) techniques to generate high-quality matching orders for subgraph matching algorithms.
arXiv Detail & Related papers (2022-01-25T00:10:03Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
arXiv Detail & Related papers (2021-10-26T06:20:31Z)
- Efficient image retrieval using multi neural hash codes and bloom filters [0.0]
This paper delivers an efficient and modified approach for image retrieval using multiple neural hash codes.
It also limits the number of queries using bloom filters by identifying false positives beforehand.
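The query-limiting mechanism rests on a standard property of Bloom filters: they never produce false negatives, so a negative answer lets the retrieval system skip a query that is guaranteed to miss, at the cost of occasional false positives. A minimal stdlib sketch (the stored hash codes and filter sizes here are hypothetical, not the paper's configuration):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives; false-positive rate
    tunable via the bit-array size m and number of hash functions k."""
    def __init__(self, m_bits: int = 1024, k_hashes: int = 4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits)

    def _positions(self, item: str):
        # derive k independent positions by salting one strong hash
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item: str) -> bool:
        # False => definitely absent; True => present or false positive
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
for code in ["a1f3", "9c0d", "77e2"]:   # hypothetical stored hash codes
    bf.add(code)
print(bf.might_contain("a1f3"))          # True: stored codes always hit
print(bf.might_contain("ffff"))          # a True here would be a rare false positive
```

In a retrieval pipeline, `might_contain` returning False means the expensive hash-code lookup for that query can be skipped entirely.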
arXiv Detail & Related papers (2020-11-06T08:46:31Z)
- Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z) - Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.