Adversarial Learning of Hard Positives for Place Recognition
- URL: http://arxiv.org/abs/2205.03871v1
- Date: Sun, 8 May 2022 13:54:03 GMT
- Title: Adversarial Learning of Hard Positives for Place Recognition
- Authors: Wenxuan Fang, Kai Zhang, Yoli Shavit and Wensen Feng
- Abstract summary: We propose an adversarial method to guide the creation of hard positives for training image retrieval networks.
Our method achieves state-of-the-art recalls on the Pitts250k and Tokyo 24/7 benchmarks.
- Score: 5.142439069733352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image retrieval methods for place recognition learn global image descriptors
that are used for fetching geo-tagged images at inference time. Recent works
have suggested employing weak and self-supervision for mining hard positives
and hard negatives in order to improve localization accuracy and robustness to
visibility changes (e.g. in illumination or viewpoint). However, generating
hard positives, which is essential for obtaining robustness, is still limited
to hard-coded or global augmentations. In this work we propose an adversarial
method to guide the creation of hard positives for training image retrieval
networks. Our method learns local and global augmentation policies which will
increase the training loss, while the image retrieval network is forced to
learn more powerful features for discriminating increasingly difficult
examples. This approach allows the image retrieval network to generalize beyond
the hard examples presented in the data and learn features that are robust to a
wide range of variations. Our method achieves state-of-the-art recalls on the
Pitts250k and Tokyo 24/7 benchmarks and outperforms recent image retrieval
methods on the rOxford and rParis datasets by a noticeable margin.
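The min-max scheme the abstract describes, an augmentation step updated to increase the retrieval loss followed by a network step that descends on the resulting hard positive, can be sketched on a toy model. Everything below (the linear "network", an additive perturbation standing in for an augmentation policy, the step sizes, the eps ball) is an illustrative assumption, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, emb = 8, 4
W = rng.normal(size=(emb, dim)) * 0.1          # toy linear "retrieval network"
anchor, positive, negative = rng.normal(size=(3, dim))

def triplet_loss(W, a, p, n, margin=0.5):
    # squared-distance triplet objective (unhinged for simplicity)
    return np.sum((W @ a - W @ p) ** 2) - np.sum((W @ a - W @ n) ** 2) + margin

def adversarial_positive(W, a, p, n, eps=0.2, lr=0.1):
    # inner maximization: one gradient-ascent step on a perturbation of the
    # positive, then projection onto the eps-ball around the original positive
    e = W @ a - W @ p                  # embedding residual at zero perturbation
    grad = -2.0 * W.T @ e              # d(loss)/d(perturbation)
    delta = lr * grad
    norm = np.linalg.norm(delta)
    if norm > eps:
        delta *= eps / norm
    return p + delta

loss_clean = triplet_loss(W, anchor, positive, negative)
hard_pos = adversarial_positive(W, anchor, positive, negative)
loss_adv = triplet_loss(W, anchor, hard_pos, negative)   # adversary raised the loss

# outer minimization: the network descends on the adversarially hardened triplet
e_p = W @ (anchor - hard_pos)
e_n = W @ (anchor - negative)
grad_W = 2.0 * np.outer(e_p, anchor - hard_pos) - 2.0 * np.outer(e_n, anchor - negative)
W -= 0.05 * grad_W
loss_after = triplet_loss(W, anchor, hard_pos, negative)
```

In the paper the adversary learns local and global augmentation policies rather than a raw feature perturbation; only the loop structure (inner ascent producing hard positives, outer descent on the network) is the same.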
Related papers
- Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement [59.17372460692809]
This work proposes a mean-teacher-based semi-supervised low-light enhancement (Semi-LLIE) framework that integrates the unpaired data into model training.
We introduce a semantic-aware contrastive loss to faithfully transfer the illumination distribution, contributing to enhancing images with natural colors.
We also propose novel perceptive loss based on the large-scale vision-language Recognize Anything Model (RAM) to help generate enhanced images with richer textual details.
arXiv Detail & Related papers (2024-09-25T04:05:32Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
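The batch-level correlation mentioned above can be sketched as plain softmax attention over a batch of global descriptors. This is a hedged toy (random descriptors, no learned projections), not CricaVPR's actual architecture:

```python
import numpy as np

def cross_image_attention(desc):
    """Correlate the descriptors of all images in a batch via dot-product attention."""
    d = desc / np.linalg.norm(desc, axis=1, keepdims=True)   # L2-normalize
    scores = d @ d.T / np.sqrt(d.shape[1])                   # batch x batch similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)            # row-wise softmax
    refined = weights @ desc                                 # attention-weighted mix
    return refined, weights

batch = np.random.default_rng(1).normal(size=(4, 16))        # 4 images, 16-d descriptors
refined, w = cross_image_attention(batch)
```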
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Specialized Re-Ranking: A Novel Retrieval-Verification Framework for Cloth Changing Person Re-Identification [36.4001616893874]
Cloth-changing Re-ID can operate in more complicated scenarios, and with higher security, than standard Re-ID and biometric techniques.
We propose a novel retrieval-verification framework to handle similar images.
arXiv Detail & Related papers (2022-10-07T14:47:28Z)
- Digging Into Self-Supervised Learning of Feature Descriptors [14.47046413243358]
We propose a set of improvements that combined lead to powerful feature descriptors.
We show that increasing the search space from in-pair to in-batch for hard negative mining brings consistent improvement.
We demonstrate that a combination of synthetic homography transformation, color augmentation, and photorealistic image stylization produces useful representations.
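In-batch hard-negative mining, as described in the entry above, can be sketched with a pairwise distance matrix: for each anchor, mask out same-label descriptors and take the nearest remaining one. The descriptors and labels below are illustrative stand-ins:

```python
import numpy as np

def hardest_in_batch_negatives(desc, labels):
    """For each anchor, pick the index of the closest descriptor with a different label."""
    dist = np.linalg.norm(desc[:, None, :] - desc[None, :, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    dist[same] = np.inf            # mask out positives (and the anchor itself)
    return dist.argmin(axis=1)

rng = np.random.default_rng(0)
desc = rng.normal(size=(6, 8))                 # 6 descriptors in the batch
labels = np.array([0, 0, 1, 1, 2, 2])          # place / identity labels
hard_neg = hardest_in_batch_negatives(desc, labels)
```

Searching the whole batch instead of a fixed pair is exactly what makes the negatives harder: the minimum is taken over more candidates.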
arXiv Detail & Related papers (2021-10-10T12:22:44Z)
- Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z)
- Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images [79.34600869202373]
We show that inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes.
Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations.
Our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24%.
arXiv Detail & Related papers (2021-05-25T12:22:11Z)
- Unifying Remote Sensing Image Retrieval and Classification with Robust Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300.
We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z)
- Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Noisy Samples and Utilizing Hard Ones [60.07027312916081]
We propose a novel approach for removing irrelevant samples from real-world web images during training.
Our approach can alleviate the harmful effects of irrelevant noisy web images and hard examples to achieve better performance.
arXiv Detail & Related papers (2021-01-23T03:58:10Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Self-supervising Fine-grained Region Similarities for Large-scale Image Localization [43.1611420685653]
General public benchmarks only provide noisy GPS labels for learning image-to-image similarities.
We propose to self-supervise image-to-region similarities in order to fully explore the potential of difficult positive images alongside their sub-regions.
Our proposed self-enhanced image-to-region similarity labels effectively deal with the training bottleneck in the state-of-the-art pipelines.
arXiv Detail & Related papers (2020-06-06T17:31:52Z)
- Learning Test-time Augmentation for Content-based Image Retrieval [42.188013259368766]
Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks.
Existing image retrieval approaches require fine-tuning or modification of pre-trained networks to adapt to variations unique to the target data.
Our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented at test-time, with augmentations guided by a policy learned through reinforcement learning.
arXiv Detail & Related papers (2020-02-05T05:08:41Z)
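The test-time aggregation in the last entry can be sketched as follows. Here the "network" is a fixed random projection and the augmentations are hard-coded flips rather than a policy learned with reinforcement learning, so treat this as a structural sketch only:

```python
import numpy as np

def tta_descriptor(image, extractor, augmentations):
    """Aggregate L2-normalized features over a set of test-time augmentations."""
    feats = []
    for aug in augmentations:
        f = extractor(aug(image))
        feats.append(f / np.linalg.norm(f))
    pooled = np.mean(feats, axis=0)
    return pooled / np.linalg.norm(pooled)   # renormalize the pooled descriptor

# toy stand-ins: a random-projection "feature extractor" and flips as augmentations
rng = np.random.default_rng(0)
proj = rng.normal(size=(32, 64))
extractor = lambda img: proj @ img.ravel()
augs = [lambda x: x, lambda x: x[:, ::-1], lambda x: x[::-1, :]]
image = rng.normal(size=(8, 8))
desc = tta_descriptor(image, extractor, augs)
```

Because only inputs are augmented and features are pooled afterwards, the pre-trained extractor itself never needs fine-tuning, which is the point of the approach.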
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.