Spatially Consistent Representation Learning
- URL: http://arxiv.org/abs/2103.06122v1
- Date: Wed, 10 Mar 2021 15:23:45 GMT
- Title: Spatially Consistent Representation Learning
- Authors: Byungseok Roh, Wuhyun Shin, Ildoo Kim, Sungwoong Kim
- Abstract summary: We propose a spatially consistent representation learning algorithm (SCRL) for multi-object and location-specific tasks.
We devise a novel self-supervised objective that tries to produce coherent spatial representations of a randomly cropped local region.
On various downstream localization tasks with benchmark datasets, the proposed SCRL shows significant performance improvements.
- Score: 12.120041613482558
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Self-supervised learning has been widely used to obtain transferrable
representations from unlabeled images. Especially, recent contrastive learning
methods have shown impressive performances on downstream image classification
tasks. While these contrastive methods mainly focus on generating invariant
global representations at the image-level under semantic-preserving
transformations, they are prone to overlook spatial consistency of local
representations and therefore have a limitation in pretraining for localization
tasks such as object detection and instance segmentation. Moreover,
aggressively cropped views used in existing contrastive methods can minimize
representation distances between the semantically different regions of a single
image.
In this paper, we propose a spatially consistent representation learning
algorithm (SCRL) for multi-object and location-specific tasks. In particular,
we devise a novel self-supervised objective that tries to produce coherent
spatial representations of a randomly cropped local region according to
geometric translations and zooming operations. On various downstream
localization tasks with benchmark datasets, the proposed SCRL shows significant
performance improvements over the image-level supervised pretraining as well as
the state-of-the-art self-supervised learning methods.
Related papers
- DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation [8.422110274212503]
Weakly supervised semantic segmentation approaches typically rely on class activation maps (CAMs) for initial seed generation.
We introduce DALNet, which leverages text embeddings to enhance the comprehensive understanding and precise localization of objects across different levels of granularity.
Our approach, in particular, allows for more efficient end-to-end process as a single-stage method.
arXiv Detail & Related papers (2024-09-24T06:51:49Z) - Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation [13.948425538725138]
We propose a Pixel-Level Domain Adaptation (PLDA) method to encourage the model in learning pixel-wise domain-invariant features.
We experimentally demonstrate the effectiveness of our approach under a wide range of settings.
arXiv Detail & Related papers (2024-08-04T14:14:54Z) - Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning [50.88504784466931]
Multi-task dense prediction involves semantic segmentation, depth estimation, and surface normal estimation.
Existing solutions typically rely on learning global image representations for global cross-task image matching.
Our proposal involves modeling region-wise representations using Gaussian Distributions.
arXiv Detail & Related papers (2024-03-15T12:41:30Z) - Localized Region Contrast for Enhancing Self-Supervised Learning in
Medical Image Segmentation [27.82940072548603]
We propose a novel contrastive learning framework that integrates Localized Region Contrast (LRC) to enhance existing self-supervised pre-training methods for medical image segmentation.
Our approach involves identifying Super-pixels by Felzenszwalb's algorithm and performing local contrastive learning using a novel contrastive sampling loss.
arXiv Detail & Related papers (2023-04-06T22:43:13Z) - Towards Effective Image Manipulation Detection with Proposal Contrastive
Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z) - Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.
Our method effectively learns the discriminative target feature by aligning the feature domain globally, and, at the meantime, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z) - LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of
Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even under drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z) - Self-supervised Contrastive Learning for Cross-domain Hyperspectral
Image Representation [26.610588734000316]
This paper introduces a self-supervised learning framework suitable for hyperspectral images that are inherently challenging to annotate.
The proposed framework architecture leverages cross-domain CNN, allowing for learning representations from different hyperspectral images.
The experimental results demonstrate the advantage of the proposed self-supervised representation over models trained from scratch or other transfer learning methods.
arXiv Detail & Related papers (2022-02-08T16:16:45Z) - Towards Fewer Annotations: Active Learning via Region Impurity and
Prediction Uncertainty for Domain Adaptive Semantic Segmentation [19.55572909866489]
We propose a region-based active learning approach for semantic segmentation under a domain shift.
Our algorithm, Active Learning via Region Impurity and Prediction Uncertainty (AL-RIPU), introduces a novel acquisition strategy characterizing the spatial adjacency of image regions.
Our method only requires very few annotations to almost reach the supervised performance and substantially outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-11-25T06:40:58Z) - Region Comparison Network for Interpretable Few-shot Image
Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning in variance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.