Semantics-Consistent Feature Search for Self-Supervised Visual
Representation Learning
- URL: http://arxiv.org/abs/2212.06486v1
- Date: Tue, 13 Dec 2022 11:13:59 GMT
- Title: Semantics-Consistent Feature Search for Self-Supervised Visual
Representation Learning
- Authors: Kaiyou Song, Shan Zhang, Zihao An, Zimeng Luo, Tong Wang, Jin Xie
- Abstract summary: The augmentation procedure inevitably constructs undesirable views that contain different semantic concepts.
Indiscriminately pulling these augmentations closer in the feature space damages the semantic consistency of the representation.
In this study, we introduce feature-level augmentation and propose a novel semantics-consistent feature search (SCFS) method to mitigate this negative effect.
- Score: 15.242064747740116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In contrastive self-supervised learning, the common way to learn
discriminative representation is to pull different augmented "views" of the
same image closer while pushing all other images further apart, which has been
proven to be effective. However, the augmentation procedure inevitably
constructs undesirable views that contain different semantic concepts, and
indiscriminately pulling these augmentations closer in the feature space
damages the semantic consistency of the representation. In this study, we
introduce feature-level augmentation and propose a novel semantics-consistent
feature search (SCFS) method to mitigate this negative effect. The main idea of
SCFS is to adaptively search semantics-consistent features to enhance the
contrast between semantics-consistent regions in different augmentations. Thus,
the trained model learns to focus on meaningful object regions, improving
its semantic representation ability. Extensive experiments conducted on
different datasets and tasks demonstrate that SCFS effectively improves the
performance of self-supervised learning and achieves state-of-the-art
performance on different downstream tasks.
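To make the pull/push objective and the feature-search idea concrete, here is a minimal PyTorch sketch. The cross-attention "search" is one plausible reading of "adaptively search semantics-consistent features"; the function names, shapes, and hyperparameters are hypothetical, not the authors' implementation.

```python
# Minimal sketch: an InfoNCE-style pull/push objective, plus a feature-level
# "search" that softly retrieves, for each region of one view, the most
# semantically similar features of the other view. Illustrative reading of
# SCFS only; names and shapes here are assumptions.
import torch
import torch.nn.functional as F

def search_consistent_features(query, key, tau=0.1):
    """Softly match each query region to semantically similar key regions.
    query, key: [B, N, D] patch/region features of two augmented views."""
    q = F.normalize(query, dim=-1)
    k = F.normalize(key, dim=-1)
    attn = torch.softmax(q @ k.transpose(1, 2) / tau, dim=-1)  # [B, N, N]
    return attn @ key  # view-2 features re-assembled to align with view 1

def info_nce(z1, z2, temperature=0.2):
    """Pull matched pairs (diagonal) together, push all other pairs apart."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature            # [B, B] similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# Toy usage: contrast view 1 against features *searched* from view 2, so
# only regions that find a semantically consistent match are pulled together.
B, N, D = 8, 49, 256
view1 = torch.randn(B, N, D)
view2 = torch.randn(B, N, D)
searched = search_consistent_features(view1, view2)
loss = info_nce(view1.mean(dim=1), searched.mean(dim=1))
print(loss.item())
```

Contrasting a view against features searched from the other view, rather than against the raw view, is what lets semantics-inconsistent regions drop out of the pulling-together term.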
Related papers
- Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks.
MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization.
MUCA further utilizes a Cross-Teacher-Student attention mechanism that guides the student network to construct more discriminative feature representations.
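A hedged sketch of what an uncertainty-weighted multi-scale consistency term could look like; the function name and weighting scheme are our assumptions, not MUCA's code:

```python
# Teacher/student prediction maps at several scales, with the teacher's
# per-pixel entropy down-weighting uncertain regions in the consistency loss.
import math
import torch

def uncertainty_consistency(student_maps, teacher_maps):
    """student_maps, teacher_maps: lists of [B, C, H, W] logits per scale."""
    loss = 0.0
    for s, t in zip(student_maps, teacher_maps):
        p_t = torch.softmax(t.detach(), dim=1)               # teacher probs
        ent = -(p_t * p_t.clamp_min(1e-8).log()).sum(dim=1, keepdim=True)
        weight = 1.0 - ent / math.log(t.size(1))             # 1 where confident
        loss = loss + (weight * (torch.softmax(s, dim=1) - p_t).pow(2)).mean()
    return loss / len(student_maps)
```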
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
- PP-SSL: Priority-Perception Self-Supervised Learning for Fine-Grained Recognition [28.863121559446665]
Self-supervised learning is emerging in fine-grained visual recognition with promising results.
Existing self-supervised learning methods are susceptible to irrelevant patterns in self-supervised tasks.
We propose a novel Priority-Perception Self-Supervised Learning framework, denoted as PP-SSL.
arXiv Detail & Related papers (2024-11-28T15:47:41Z)
- Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look [28.350278251132078]
We propose a unified framework to conduct data augmentation in the feature space, known as feature augmentation.
This strategy is domain-agnostic, which augments similar features to the original ones and thus improves the data diversity.
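As a rough illustration of feature-space augmentation, a sketch with two common transforms; the specific operators (Gaussian noise, batch interpolation) are generic choices, not necessarily the paper's:

```python
# Perturb or interpolate encoded features so the augmented features stay
# similar to the originals, increasing data diversity without image-space ops.
import torch

def augment_features(feats, noise_std=0.1, mix_alpha=0.9):
    """feats: [B, D] encoded features; returns two augmented variants."""
    noisy = feats + noise_std * torch.randn_like(feats)   # additive noise
    perm = feats[torch.randperm(feats.size(0))]           # shuffled batch
    mixed = mix_alpha * feats + (1 - mix_alpha) * perm    # interpolation
    return noisy, mixed
```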
arXiv Detail & Related papers (2024-10-16T09:25:11Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
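For intuition, a generative latent-variable formulation of SSL typically takes the following generic form (our rendering, not necessarily the paper's exact model):

```latex
% Views x are generated from latent semantic content z; the learned
% representation approximates the posterior over z given x.
p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, \mathrm{d}z,
\qquad
q_\phi(z \mid x) \approx p_\theta(z \mid x)
```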
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition [16.412306012741354]
We propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL)
FoCoViL significantly suppresses view-specific information in the representation space, where viewpoints are coarsely aligned.
It associates actions with common view-invariant properties and simultaneously separates the dissimilar ones.
arXiv Detail & Related papers (2023-04-03T10:12:30Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
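A hedged sketch of deriving a GradCAM-style rationale map from an SSL objective; the toy encoder and function names are illustrative assumptions only:

```python
# Backpropagate the similarity between two views' embeddings to a conv
# feature map and weight its channels by the spatially pooled gradients.
import torch
import torch.nn.functional as F

def ssl_gradcam(feat_map, embed1, embed2):
    """feat_map: [B, C, H, W] conv features used to produce embed1."""
    sim = F.cosine_similarity(embed1, embed2).sum()    # scalar SSL "score"
    grads, = torch.autograd.grad(sim, feat_map)
    weights = grads.mean(dim=(2, 3), keepdim=True)     # pooled channel weights
    cam = F.relu((weights * feat_map).sum(dim=1))      # [B, H, W] saliency
    return cam / cam.amax(dim=(1, 2), keepdim=True).clamp_min(1e-8)

conv = torch.nn.Conv2d(3, 64, 3, padding=1)            # toy encoder
head = torch.nn.Linear(64, 128)
x1, x2 = torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32)
fmap = conv(x1)
e1 = head(fmap.mean(dim=(2, 3)))                       # embedding of view 1
e2 = head(conv(x2).mean(dim=(2, 3)))                   # embedding of view 2
rationale = ssl_gradcam(fmap, e1, e2)                  # high where views agree
```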
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- Unsupervised Feature Clustering Improves Contrastive Representation Learning for Medical Image Segmentation [18.75543045234889]
Self-supervised instance discrimination is an effective contrastive pretext task to learn feature representations and address limited medical image annotations.
We propose a new self-supervised contrastive learning method that uses unsupervised feature clustering to better select positive and negative image samples.
Our method outperforms state-of-the-art self-supervised contrastive techniques on these tasks.
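A minimal sketch of clustering-based pair selection in this spirit: same-cluster samples serve as positives, cross-cluster samples as negatives. Plain k-means is a common stand-in here; the paper's clustering procedure may differ.

```python
import torch

def cluster_assignments(feats, k=10, iters=10):
    """Plain k-means on [N, D] features; returns [N] cluster ids."""
    centers = feats[torch.randperm(feats.size(0))[:k]].clone()
    for _ in range(iters):
        ids = torch.cdist(feats, centers).argmin(dim=1)  # nearest center
        for c in range(k):
            mask = ids == c
            if mask.any():
                centers[c] = feats[mask].mean(dim=0)     # update center
    return ids

feats = torch.randn(256, 128)                     # toy encoded features
ids = cluster_assignments(feats)
pos_mask = ids.unsqueeze(0) == ids.unsqueeze(1)   # [N, N] positive-pair mask
```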
arXiv Detail & Related papers (2022-11-15T22:54:29Z)
- Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs the sharpened distribution of pairwise similarities among different instances as the relation metric.
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
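A sketch of a relational loss in this style: the teacher's similarity distribution over a shared anchor set, sharpened with a lower temperature, supervises the student's distribution. The anchor bank and temperature values are common choices, not necessarily the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def relational_loss(student, teacher, anchors, t_s=0.1, t_t=0.04):
    """student, teacher: [B, D] embeddings; anchors: [K, D] memory bank."""
    a = F.normalize(anchors, dim=-1)
    s = F.normalize(student, dim=-1) @ a.t()       # [B, K] student relations
    t = F.normalize(teacher, dim=-1) @ a.t()       # [B, K] teacher relations
    log_p_s = F.log_softmax(s / t_s, dim=-1)
    p_t = F.softmax(t.detach() / t_t, dim=-1)      # sharpened teacher target
    return -(p_t * log_p_s).sum(dim=-1).mean()     # cross-entropy
```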
arXiv Detail & Related papers (2022-03-16T16:14:19Z)
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations [183.03278932562438]
This paper presents an effective approach that adds spatial information to the encoding stage to alleviate the learning inconsistency between the contrastive objective and strong data augmentation operations.
We show that our approach learns more efficient visual representations and thus delivers a key message to inspire future research on self-supervised visual representation learning.
arXiv Detail & Related papers (2020-11-19T16:26:25Z)