Self-Supervised Learning Disentangled Group Representation as Feature
- URL: http://arxiv.org/abs/2110.15255v2
- Date: Fri, 29 Oct 2021 11:14:37 GMT
- Title: Self-Supervised Learning Disentangled Group Representation as Feature
- Authors: Tan Wang, Zhongqi Yue, Jianqiang Huang, Qianru Sun, Hanwang Zhang
- Abstract summary: We show that existing Self-Supervised Learning (SSL) only disentangles simple augmentation features such as rotation and colorization.
We propose an iterative SSL algorithm: Iterative Partition-based Invariant Risk Minimization (IP-IRM).
We prove that IP-IRM converges to a fully disentangled representation and show its effectiveness on various benchmarks.
- Score: 82.07737719232972
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A good visual representation is an inference map from observations (images)
to features (vectors) that faithfully reflects the hidden modularized
generative factors (semantics). In this paper, we formulate the notion of
"good" representation from a group-theoretic view using Higgins' definition of
disentangled representation, and show that existing Self-Supervised Learning
(SSL) only disentangles simple augmentation features such as rotation and
colorization, thus unable to modularize the remaining semantics. To break the
limitation, we propose an iterative SSL algorithm: Iterative Partition-based
Invariant Risk Minimization (IP-IRM), which successfully grounds the abstract
semantics and the group acting on them into concrete contrastive learning. At
each iteration, IP-IRM first partitions the training samples into two subsets
that correspond to an entangled group element. Then, it minimizes a
subset-invariant contrastive loss, where the invariance guarantees to
disentangle the group element. We prove that IP-IRM converges to a fully
disentangled representation and show its effectiveness on various benchmarks.
Codes are available at https://github.com/Wangt-CN/IP-IRM.
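For readers who want a concrete picture of the per-iteration update, below is a minimal sketch of the subset-invariant contrastive objective, assuming an InfoNCE-style loss and the IRMv1 "dummy scale" gradient penalty. The function and variable names are illustrative, not taken from the official codebase linked above.

```python
# Minimal sketch of the IP-IRM update step (illustrative names; the official
# code is at https://github.com/Wangt-CN/IP-IRM). Assumes an InfoNCE-style
# contrastive loss and the IRMv1 "dummy scale" gradient penalty.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, scale, temperature=0.5):
    # Similarity of matched augmented views against all others in the subset.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = (z1 @ z2.t()) / temperature            # (N, N); positives on diag
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits * scale, labels)

def irm_penalty(z1, z2):
    # IRMv1: squared gradient of the loss w.r.t. a dummy scale of 1.0.
    scale = torch.ones(1, device=z1.device, requires_grad=True)
    grad, = torch.autograd.grad(info_nce(z1, z2, scale), scale,
                                create_graph=True)
    return (grad ** 2).sum()

def subset_invariant_loss(encoder, x1, x2, partition, lam=1.0):
    # partition: bool tensor (N,); True/False mark the two subsets assumed
    # to correspond to one entangled group element.
    z1, z2 = encoder(x1), encoder(x2)
    loss = z1.new_zeros(())
    for subset in (partition, ~partition):
        if subset.sum() < 2:                        # need at least 1 negative
            continue
        zs1, zs2 = z1[subset], z2[subset]
        ones = torch.ones(1, device=z1.device)
        loss = loss + info_nce(zs1, zs2, ones) + lam * irm_penalty(zs1, zs2)
    return loss
```

In the full algorithm the partition itself is found first, by maximizing the same invariance penalty over subset assignments with the encoder frozen, and the discovered partitions accumulate across iterations.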
Related papers
- Grouped Discrete Representation for Object-Centric Learning [18.44580501357929]
We propose Grouped Discrete Representation (GDR) for Object-Centric Learning.
GDR decomposes features into attributes via organized channel grouping, and composes these attributes into a discrete representation via indexes.
arXiv Detail & Related papers (2024-11-04T17:25:10Z)
- Deep Contrastive Multi-view Clustering under Semantic Feature Guidance [8.055452424643562]
We propose a multi-view clustering framework named Deep Contrastive Multi-view Clustering under Semantic feature guidance (DCMCS)
By minimizing an instance-level contrastive loss weighted by semantic similarity, DCMCS adaptively weakens contrastive learning between false negative pairs.
Experimental results on several public datasets demonstrate the proposed framework outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2024-03-09T02:33:38Z)
- Reflection Invariance Learning for Few-shot Semantic Segmentation [53.20466630330429]
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.
This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner.
Experiments on both PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z)
- Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation [102.25240608024063]
Referring image segmentation segments out the image region described by a natural language expression.
We develop an algorithm that shifts from being localization-centric to segmentation-centric.
Compared to its counterparts, our method is more versatile while remaining effective.
arXiv Detail & Related papers (2023-03-11T08:42:40Z)
- Semantic-aware Contrastive Learning for More Accurate Semantic Parsing [32.74456368167872]
We propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations.
Experiments on two standard datasets show that our approach achieves significant improvements over MLE baselines.
arXiv Detail & Related papers (2023-01-19T07:04:32Z)
- Unsupervised Visual Representation Learning by Synchronous Momentum Grouping [47.48803765951601]
We propose Synchronous Momentum Grouping (SMoG), a group-level contrastive visual representation learning method that surpasses vanilla supervised learning on ImageNet.
Exhaustive experiments show that SMoG surpasses current state-of-the-art unsupervised representation learning methods.
arXiv Detail & Related papers (2022-07-13T13:04:15Z)
- Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation [66.85202434812942]
We reformulate few-shot segmentation as a semantic reconstruction problem.
We convert base class features into a series of basis vectors which span a class-level semantic space for novel class reconstruction.
Our proposed approach, referred to as anti-aliasing semantic reconstruction (ASR), provides a systematic yet interpretable solution for few-shot learning problems.
arXiv Detail & Related papers (2021-06-01T02:17:36Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) method to address nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- Plannable Approximations to MDP Homomorphisms: Equivariance under Actions [72.30921397899684]
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
We prove that when our loss is zero, we have a homomorphism of a deterministic Markov Decision Process.
We show experimentally that for deterministic MDPs, the optimal policy in the abstract MDP can be successfully lifted to the original MDP.
arXiv Detail & Related papers (2020-02-27T08:29:10Z)
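The last entry above makes a useful contrast with IP-IRM: it enforces equivariance under actions rather than invariance across partitions. Below is a generic sketch of such a contrastive action-equivariance loss, in the spirit of that paper but not its exact formulation; the module names and the hinge form are assumptions.

```python
# Generic sketch of a contrastive action-equivariance loss in the spirit of
# "Plannable Approximations to MDP Homomorphisms" (not the authors' exact
# loss; module names and the hinge form are assumptions).
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    def __init__(self, obs_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, z_dim))

    def forward(self, s):
        return self.net(s)

class LatentTransition(nn.Module):
    # Acts on latent states: z_next_hat = T(z, a).
    def __init__(self, z_dim, n_actions):
        super().__init__()
        self.net = nn.Linear(z_dim + n_actions, z_dim)

    def forward(self, z, a_onehot):
        return self.net(torch.cat([z, a_onehot], dim=1))

def equivariance_loss(enc, trans, s, a_onehot, s_next, s_neg, hinge=1.0):
    # Positive term pulls T(phi(s), a) toward phi(s'); the hinge term keeps
    # embeddings of random negative states at least `hinge` away, preventing
    # the trivial collapsed solution.
    z, z_next, z_neg = enc(s), enc(s_next), enc(s_neg)
    positive = ((trans(z, a_onehot) - z_next) ** 2).sum(dim=1)
    negative = torch.clamp(hinge - ((z_neg - z_next) ** 2).sum(dim=1), min=0.0)
    return (positive + negative).mean()
```

When the positive term reaches zero on all transitions of a deterministic MDP, the encoder commutes with the latent transition model, which is, informally, the homomorphism property the entry refers to.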
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.