Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical
Place Recognition
- URL: http://arxiv.org/abs/2202.06470v1
- Date: Mon, 14 Feb 2022 03:20:39 GMT
- Title: Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical
Place Recognition
- Authors: Y. Shen, R. Wang, W. Zuo, N. Zheng
- Abstract summary: We propose a tightly coupled learning (TCL) strategy to train triplet models.
It combines global and local descriptors for joint optimization.
Our lightweight unified model is better than several state-of-the-art methods.
- Score: 0.09558392439655011
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual place recognition (VPR) is a key issue for robotics and autonomous
systems. To balance time and performance, most methods use
a coarse-to-fine hierarchical architecture, which retrieves
top-N candidates using global features and then re-ranks them with local
features. However, since the two types of features are usually processed
independently, re-ranking may harm global retrieval, termed re-ranking
confusion. Moreover, re-ranking is limited by global retrieval. In this paper,
we propose a tightly coupled learning (TCL) strategy to train triplet models.
Unlike the original triplet learning (OTL) strategy, it combines global and
local descriptors for joint optimization. In addition, a bidirectional search
dynamic time warping (BS-DTW) algorithm is proposed to mine local
spatial information tailored to VPR in re-ranking. The experimental results on
public benchmarks show that the models using TCL outperform the models using
OTL, and TCL can be used as a general strategy to improve performance for
weakly supervised ranking tasks. Further, our lightweight unified model
outperforms several state-of-the-art methods and is more than an order of magnitude
more computationally efficient, meeting the real-time requirements of robots.
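The coarse-to-fine pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Euclidean distance for the global stage and the classic dynamic time warping recurrence (not the proposed BS-DTW) for the local stage. The names `l2`, `dtw_cost`, and `hierarchical_search`, the dictionary layout, and the toy descriptors are all hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(seq_a, seq_b):
    """Classic DTW alignment cost between two sequences of local descriptors.

    Assumption: standard DTW stands in for the paper's BS-DTW, whose details
    are not given in the abstract.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = l2(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def hierarchical_search(query, database, top_n=2):
    """Retrieve top-N candidates by global distance, then re-rank them locally."""
    # Stage 1 (coarse): rank every database image by global descriptor distance.
    ranked = sorted(
        range(len(database)),
        key=lambda k: l2(query["global"], database[k]["global"]),
    )
    candidates = ranked[:top_n]
    # Stage 2 (fine): re-rank only the candidates by local alignment cost.
    return sorted(
        candidates,
        key=lambda k: dtw_cost(query["local"], database[k]["local"]),
    )


# Toy data (hypothetical): image 0 is closest globally but differs locally;
# image 1 matches locally; image 2 matches locally but is far globally.
local_match = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
database = [
    {"global": [0.1, 0.0], "local": [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]},
    {"global": [0.2, 0.0], "local": local_match},
    {"global": [5.0, 5.0], "local": local_match},
]
query = {"global": [0.0, 0.0], "local": local_match}

result = hierarchical_search(query, database, top_n=2)
print(result)
```

With `top_n=2`, image 2 never reaches the re-ranking stage even though its local descriptors match the query, which illustrates the abstract's point that re-ranking is limited by global retrieval.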
Related papers
- How to Make LLMs Strong Node Classifiers? [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, such as Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art (SOTA) GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
- AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition [48.043749855085025]
Visual place recognition (VPR) is one of the research hotspots in robotics, which uses visual information to locate robots.
We present a unified network capable of extracting global features for retrieving candidates via an aggregation module.
We also propose a Semi-hard Positive Sample Mining (ShPSM) strategy to select appropriate hard positive images for training more robust VPR networks.
arXiv Detail & Related papers (2023-10-08T14:46:11Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT).
Our GLoT surpasses previous state-of-the-art methods with the lowest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z)
- A Faster, Lighter and Stronger Deep Learning-Based Approach for Place Recognition [7.9400442516053475]
We propose a faster, lighter and stronger approach that can generate models with fewer parameters and can spend less time in the inference stage.
We design RepVGG-lite as the backbone network in our architecture; it is more discriminative than other general networks in the Place Recognition task.
Our system has 14 times fewer parameters than Patch-NetVLAD and 6.8 times lower theoretical FLOPs, and runs 21 and 33 times faster in feature extraction and feature matching, respectively.
arXiv Detail & Related papers (2022-11-27T15:46:53Z)
- Residual Local Feature Network for Efficient Super-Resolution [20.62809970985125]
In this work, we propose a novel Residual Local Feature Network (RLFN).
The main idea is using three convolutional layers for residual local feature learning to simplify feature aggregation.
In addition, we won first place in the runtime track of the NTIRE 2022 efficient super-resolution challenge.
arXiv Detail & Related papers (2022-05-16T08:46:34Z)
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)
- Sequential Place Learning: Heuristic-Free High-Performance Long-Term Place Recognition [24.70946979449572]
We develop a learning-based CNN+LSTM architecture, trainable via backpropagation through time, for viewpoint- and appearance-invariant place recognition.
Our model outperforms 15 classical methods while setting new state-of-the-art performance standards.
In addition, we show that SPL can be up to 70x faster to deploy than classical methods on a 729 km route.
arXiv Detail & Related papers (2021-03-02T22:57:43Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data on different devices without moving it to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)