Self-supervising Fine-grained Region Similarities for Large-scale Image
Localization
- URL: http://arxiv.org/abs/2006.03926v2
- Date: Thu, 9 Jul 2020 06:21:08 GMT
- Title: Self-supervising Fine-grained Region Similarities for Large-scale Image
Localization
- Authors: Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, Hongsheng Li
- Abstract summary: General public benchmarks only provide noisy GPS labels for learning image-to-image similarities.
We propose to self-supervise image-to-region similarities in order to fully explore the potential of difficult positive images alongside their sub-regions.
Our proposed self-enhanced image-to-region similarity labels effectively deal with the training bottleneck in the state-of-the-art pipelines.
- Score: 43.1611420685653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of large-scale retrieval-based image localization is to estimate the
geographical location of a query image by recognizing its nearest reference
images from a city-scale dataset. However, the general public benchmarks only
provide noisy GPS labels associated with the training images, which act as weak
supervisions for learning image-to-image similarities. Such label noise
prevents deep neural networks from learning discriminative features for
accurate localization. To tackle this challenge, we propose to self-supervise
image-to-region similarities in order to fully explore the potential of
difficult positive images alongside their sub-regions. The estimated
image-to-region similarities can serve as extra training supervision for
improving the network in generations, which could in turn gradually refine the
fine-grained similarities to achieve optimal performance. Our proposed
self-enhanced image-to-region similarity labels effectively deal with the
training bottleneck in the state-of-the-art pipelines without any additional
parameters or manual annotations in both training and inference. Our method
outperforms state-of-the-art methods on the standard localization benchmarks by
noticeable margins and shows excellent generalization capability on multiple
image retrieval datasets.
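The retrieval step the abstract describes, and one possible form of the "image-to-region similarity" labels, can be sketched as follows. This is a minimal illustration only and not the authors' implementation: the descriptors are toy 3-d vectors, the function names (`cosine`, `localize`, `soft_labels`) and the `temperature` parameter are hypothetical, and using a temperature-scaled softmax to turn similarities into soft labels is an assumption that the abstract does not state.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def localize(query, references):
    """Return the GPS tag of the reference most similar to the query."""
    best = max(references, key=lambda ref: cosine(query, ref["descriptor"]))
    return best["gps"]

def soft_labels(query, region_descriptors, temperature=0.1):
    """Softmax over query-to-region similarities (assumed soft-label form)."""
    scaled = [cosine(query, r) / temperature for r in region_descriptors]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy reference set: each entry pairs a descriptor with a (lat, lon) tag.
references = [
    {"descriptor": [1.0, 0.0, 0.0], "gps": (35.0, 139.0)},
    {"descriptor": [0.0, 1.0, 0.0], "gps": (40.0, -80.0)},
]
query = [0.9, 0.1, 0.0]

estimated_gps = localize(query, references)
labels = soft_labels(query, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
```

In this sketch `estimated_gps` is the location tag of the nearest reference, and `labels` is a distribution over candidate regions that a later training round could use as soft targets instead of hard GPS-derived positives.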
Related papers
- Progressive Feature Self-reinforcement for Weakly Supervised Semantic Segmentation [55.69128107473125]
We propose a single-stage approach for Weakly Supervised Semantic Segmentation (WSSS) with image-level labels.
We adaptively partition the image content into deterministic regions (e.g., confident foreground and background) and uncertain regions (e.g., object boundaries and misclassified categories) for separate processing.
Building upon this, we introduce a complementary self-enhancement method that constrains the semantic consistency between these confident regions and an augmented image with the same class labels.
arXiv Detail & Related papers (2023-12-14T13:21:52Z)
- Mitigating Urban-Rural Disparities in Contrastive Representation Learning with Satellite Imagery [19.93324644519412]
We consider the risk of urban-rural disparities in identification of land-cover features.
We propose fair dense representation with contrastive learning (FairDCL) as a method for de-biasing the multi-level latent space of convolutional neural network models.
The obtained image representation mitigates downstream urban-rural prediction disparities and outperforms state-of-the-art baselines on real-world satellite images.
arXiv Detail & Related papers (2022-11-16T04:59:46Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Local contrastive loss with pseudo-label based self-training for semi-supervised medical image segmentation [13.996217500923413]
Semi/self-supervised learning-based approaches exploit unlabeled data along with limited annotated data.
Recent self-supervised learning methods use contrastive loss to learn good global level representations from unlabeled images.
We propose a local contrastive loss to learn good pixel level features useful for segmentation by exploiting semantic label information.
arXiv Detail & Related papers (2021-12-17T17:38:56Z)
- Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z)
- Spatially Consistent Representation Learning [12.120041613482558]
We propose a spatially consistent representation learning algorithm (SCRL) for multi-object and location-specific tasks.
We devise a novel self-supervised objective that tries to produce coherent spatial representations of a randomly cropped local region.
On various downstream localization tasks with benchmark datasets, the proposed SCRL shows significant performance improvements.
arXiv Detail & Related papers (2021-03-10T15:23:45Z)
- Re-rank Coarse Classification with Local Region Enhanced Features for Fine-Grained Image Recognition [22.83821575990778]
We re-rank the TopN classification results by using the local region enhanced embedding features to improve the Top1 accuracy.
To learn more effective semantic global features, we design a multi-level loss over an automatically constructed hierarchical category structure.
Our method achieves state-of-the-art performance on three benchmarks: CUB-200-2011, Stanford Cars, and FGVC Aircraft.
arXiv Detail & Related papers (2021-02-19T11:30:25Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.