On Feature Decorrelation in Self-Supervised Learning
- URL: http://arxiv.org/abs/2105.00470v1
- Date: Sun, 2 May 2021 13:28:18 GMT
- Title: On Feature Decorrelation in Self-Supervised Learning
- Authors: Tianyu Hua, Wenxiao Wang, Zihui Xue, Yue Wang, Sucheng Ren, Hang Zhao
- Abstract summary: We study a framework containing the most common components from recent approaches.
We connect dimensional collapse with strong correlations between axes and consider such connection as a strong motivation for feature decorrelation.
- Score: 15.555208840500086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In self-supervised representation learning, a common idea behind most of the
state-of-the-art approaches is to enforce the robustness of the representations
to predefined augmentations. A potential issue of this idea is the existence of
completely collapsed solutions (i.e., constant features), which are typically
avoided implicitly by carefully chosen implementation details. In this work, we
study a relatively concise framework containing the most common components from
recent approaches. We verify the existence of complete collapse and discover
another reachable collapse pattern that is usually overlooked, namely
dimensional collapse. We connect dimensional collapse with strong correlations
between axes and consider such connection as a strong motivation for feature
decorrelation (i.e., standardizing the covariance matrix). The capability of
correlation as an unsupervised metric and the gains from feature decorrelation
are verified empirically to highlight the importance and the potential of this
insight.
Related papers
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning [26.34622544479565]
Causal dynamics learning is a promising approach to enhancing robustness in reinforcement learning.
We propose a novel model that infers fine-grained causal structures and employs them for prediction.
arXiv Detail & Related papers (2024-06-05T13:13:58Z) - Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse
Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z) - Entity or Relation Embeddings? An Analysis of Encoding Strategies for Relation Extraction [19.019881161010474]
Relation extraction is essentially a text classification problem, which can be tackled by fine-tuning a pre-trained language model (LM)
Existing approaches therefore solve the problem in an indirect way: they fine-tune an LM to learn embeddings of the head and tail entities, and then predict the relationship from these entity embeddings.
Our hypothesis in this paper is that relation extraction models can be improved by capturing relationships in a more direct way.
arXiv Detail & Related papers (2023-12-18T09:58:19Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores which we name Compositional Network generalization (CoRelNet)
We find that simple architectural choices can outperform existing models in out-of-distribution generalizations.
arXiv Detail & Related papers (2022-06-09T16:24:01Z) - Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning is with reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z) - Leveraging Unlabeled Data for Entity-Relation Extraction through
Probabilistic Constraint Satisfaction [54.06292969184476]
We study the problem of entity-relation extraction in the presence of symbolic domain knowledge.
Our approach employs semantic loss which captures the precise meaning of a logical sentence.
With a focus on low-data regimes, we show that semantic loss outperforms the baselines by a wide margin.
arXiv Detail & Related papers (2021-03-20T00:16:29Z) - Disentangling Observed Causal Effects from Latent Confounders using
Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z) - Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions.
One solution for achieving strong generalization is to incorporate causal structures in the models.
We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.