InfoNCE: Identifying the Gap Between Theory and Practice
- URL: http://arxiv.org/abs/2407.00143v1
- Date: Fri, 28 Jun 2024 16:08:26 GMT
- Title: InfoNCE: Identifying the Gap Between Theory and Practice
- Authors: Evgenia Rusak, Patrik Reizinger, Attila Juhos, Oliver Bringmann, Roland S. Zimmermann, Wieland Brendel
- Abstract summary: We introduce AnInfoNCE, a generalization of InfoNCE that can provably uncover the latent factors in an anisotropic setting.
We show that AnInfoNCE increases the recovery of previously collapsed information in CIFAR10 and ImageNet, albeit at the cost of downstream accuracy.
- Score: 15.744372232355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous theoretical work on contrastive learning (CL) with InfoNCE showed that, under certain assumptions, the learned representations uncover the ground-truth latent factors. We argue these theories overlook crucial aspects of how CL is deployed in practice. Specifically, they assume that within a positive pair, all latent factors either vary to a similar extent, or that some do not vary at all. However, in practice, positive pairs are often generated using augmentations such as strong cropping to just a few pixels. Hence, a more realistic assumption is that all latent factors change, with a continuum of variability across these factors. We introduce AnInfoNCE, a generalization of InfoNCE that can provably uncover the latent factors in this anisotropic setting, broadly generalizing previous identifiability results in CL. We validate our identifiability results in controlled experiments and show that AnInfoNCE increases the recovery of previously collapsed information in CIFAR10 and ImageNet, albeit at the cost of downstream accuracy. Additionally, we explore and discuss further mismatches between theoretical assumptions and practical implementations, including extensions to hard negative mining and loss ensembles.
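The contrast between the isotropic and anisotropic settings described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not spell out the loss, so the sketch assumes that similarity is a negative squared distance and that the anisotropic variant weights each latent dimension by a positive scalar. The function name `info_nce` and the `weights` parameter are hypothetical, and in practice such weights would presumably be learned rather than fixed by hand.

```python
import math


def info_nce(anchor, positive, negatives, weights=None):
    """Contrastive loss for a single anchor (illustrative sketch only).

    Similarity is the negative weighted squared distance between two
    embeddings. With weights=None every dimension counts equally
    (isotropic case). Passing per-dimension weights is an ASSUMED way to
    model dimensions that vary to different extents within a positive pair.
    """
    if weights is None:
        weights = [1.0] * len(anchor)

    def similarity(u, v):
        return -sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v))

    # The positive goes first, followed by all negatives.
    logits = [similarity(anchor, positive)]
    logits += [similarity(anchor, n) for n in negatives]

    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))

    # Negative log-probability of picking the positive among the candidates.
    return log_norm - logits[0]


anchor = [0.0, 0.0]
positive = [1.0, 0.0]   # differs from the anchor along dimension 0
negative = [0.0, 1.0]   # differs from the anchor along dimension 1

isotropic = info_nce(anchor, positive, [negative])
anisotropic = info_nce(anchor, positive, [negative], weights=[0.1, 1.0])
```

In this toy setup the isotropic loss cannot tell the positive from the negative, because both sit at the same distance from the anchor. Down-weighting dimension 0, which is the dimension that changes within the positive pair, lowers the loss. This mirrors the abstract's point that latent factors may vary to different degrees across a positive pair.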
Related papers
- Unifying Causal Representation Learning with the Invariance Principle [21.375611599649716]
Causal representation learning aims at recovering latent causal variables from high-dimensional observations.
Our main contribution is to show that many existing causal representation learning approaches methodologically align the representation to known data symmetries.
arXiv Detail & Related papers (2024-09-04T14:51:36Z) - Local Causal Structure Learning in the Presence of Latent Variables [16.88791886307876]
We present a principled method for determining whether a variable is a direct cause or effect of a target.
Experimental results on both synthetic and real-world data validate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2024-05-25T13:31:05Z) - Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z) - A Sparsity Principle for Partially Observable Causal Representation Learning [28.25303444099773]
Causal representation learning aims at identifying high-level causal variables from perceptual data.
We focus on learning from unpaired observations from a dataset with an instance-dependent partial observability pattern.
We propose two methods for estimating the underlying causal variables by enforcing sparsity in the inferred representation.
arXiv Detail & Related papers (2024-03-13T08:40:49Z) - Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z) - A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables [28.51579090194802]
We introduce a novel framework for causal discovery that accommodates the presence of causally-related hidden variables almost everywhere in the causal network.
We develop a Rank-based Latent Causal Discovery algorithm, RLCD, that can efficiently locate hidden variables, determine their cardinalities, and discover the entire causal structure over both measured and hidden ones.
Experimental results on both synthetic and real-world personality data sets demonstrate the efficacy of the proposed approach in finite-sample cases.
arXiv Detail & Related papers (2023-12-18T07:57:39Z) - Identifying Linearly-Mixed Causal Representations from Multi-Node Interventions [14.586959818386765]
We provide the first identifiability result for causal representation learning that allows for multiple variables to be targeted by an intervention within one environment.
Our approach hinges on a general assumption on the coverage and diversity of interventions across environments.
In addition to and inspired by our theoretical contributions, we present a practical algorithm to learn causal representations from multi-node interventional data.
arXiv Detail & Related papers (2023-11-05T16:05:00Z) - C-Disentanglement: Discovering Causally-Independent Generative Factors under an Inductive Bias of Confounder [35.09708249850816]
We introduce a framework entitled Confounded-Disentanglement (C-Disentanglement), the first framework that explicitly introduces the inductive bias of confounder.
We conduct extensive experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-10-26T11:44:42Z) - Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data.
One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z) - A Causal Framework for Decomposing Spurious Variations [68.12191782657437]
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models.
We prove the first results that allow a non-parametric decomposition of spurious effects.
The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
arXiv Detail & Related papers (2023-06-08T09:40:28Z) - Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - Theory on Forgetting and Generalization of Continual Learning [41.85538120246877]
Continual learning (CL) aims to learn a sequence of tasks.
There is a lack of understanding on what factors are important and how they affect "catastrophic forgetting" and generalization performance.
We show that our results not only explain some interesting empirical observations in recent studies, but also motivate better practical algorithm designs of CL.
arXiv Detail & Related papers (2023-02-12T02:14:14Z) - Identifying Weight-Variant Latent Causal Models [82.14087963690561]
We find that transitivity plays a key role in impeding the identifiability of latent causal representations.
Under some mild assumptions, we can show that the latent causal representations can be identified up to trivial permutation and scaling.
We propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns latent causal representations and causal relationships among them.
arXiv Detail & Related papers (2022-08-30T11:12:59Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - A Critical Look At The Identifiability of Causal Effects with Deep Latent Variable Models [2.326384409283334]
We use causal effect variational autoencoder (CEVAE) as a case study.
CEVAE seems to work reliably under some simple scenarios, but it does not identify the correct causal effect with a misspecified latent variable or a complex data distribution.
Our results show that the question of identifiability cannot be disregarded, and we argue that more attention should be paid to it in future work.
arXiv Detail & Related papers (2021-02-12T17:43:18Z) - Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z) - CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations.
We propose a new VAE based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones.
Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z) - Weakly-Supervised Disentanglement Without Compromises [53.55580957483103]
Intelligent agents should be able to learn useful representations by observing changes in their environment.
We model such observations as pairs of non-i.i.d. images sharing at least one of the underlying factors of variation.
We show that only knowing how many factors have changed, but not which ones, is sufficient to learn disentangled representations.
arXiv Detail & Related papers (2020-02-07T16:39:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.