Towards efficient representation identification in supervised learning
- URL: http://arxiv.org/abs/2204.04606v1
- Date: Sun, 10 Apr 2022 05:34:13 GMT
- Title: Towards efficient representation identification in supervised learning
- Authors: Kartik Ahuja, Divyat Mahajan, Vasilis Syrgkanis, Ioannis Mitliagkas
- Abstract summary: Humans have a remarkable ability to disentangle complex sensory inputs.
We show theoretically and experimentally that disentanglement is possible even when the auxiliary information dimension is much less than the dimension of the true latent representation.
- Score: 32.3875659102181
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans have a remarkable ability to disentangle complex sensory inputs (e.g.,
image, text) into simple factors of variation (e.g., shape, color) without much
supervision. This ability has inspired many works that attempt to solve the
following question: how do we invert the data generation process to extract
those factors with minimal or no supervision? Several works in the literature
on non-linear independent component analysis have established the following
negative result: without some knowledge of the data generation process or
appropriate inductive biases, it is impossible to perform this inversion. In recent years,
a lot of progress has been made on disentanglement under structural
assumptions, e.g., when we have access to auxiliary information that makes the
factors of variation conditionally independent. However, existing work requires
a lot of auxiliary information, e.g., in supervised classification, it
prescribes that the number of label classes should be at least equal to the
total dimension of all factors of variation. In this work, we depart from these
assumptions and ask: a) How can we get disentanglement when the auxiliary
information does not provide conditional independence over the factors of
variation? b) Can we reduce the amount of auxiliary information required for
disentanglement? For a class of models where auxiliary information does not
ensure conditional independence, we show theoretically and experimentally that
disentanglement (to a large extent) is possible even when the auxiliary
information dimension is much less than the dimension of the true latent
representation.
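To make the setup concrete, here is a minimal toy sketch of the kind of data generation process discussed above: latent factors of variation are mixed nonlinearly into observations, and the only auxiliary information is a label whose dimension is far below the latent dimension. The variable names, dimensions, and mixing function below are illustrative assumptions, not the paper's actual generative model or method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data generation process (illustrative assumption, not the paper's exact model):
# d_latent factors of variation z are mixed by a nonlinear map g into observations x,
# and the auxiliary information is a single binary label y, so dim(y) = 1 << d_latent.
d_latent, d_obs, n = 6, 20, 10_000

Z = rng.normal(size=(n, d_latent))      # true factors of variation z
A = rng.normal(size=(d_latent, d_obs))
X = np.tanh(Z @ A)                      # observations x = g(z), with g nonlinear

w = rng.normal(size=d_latent)
y = (Z @ w > 0).astype(int)             # 1-dimensional auxiliary label

# The identification question posed above: can an encoder f trained only on (X, y)
# recover estimates Z_hat that match Z up to permutation and scaling, even though
# y does not make the factors conditionally independent and dim(y) << d_latent?
```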
Related papers
- Continual Learning of Nonlinear Independent Representations [17.65617189829692]
We show that model identifiability progresses from a subspace level to a component-wise level as the number of distributions increases.
Our method achieves performance comparable to nonlinear ICA methods trained jointly on multiple offline distributions.
arXiv Detail & Related papers (2024-08-11T14:33:37Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real-world distribution shift benchmarks and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- DOT-VAE: Disentangling One Factor at a Time [1.6114012813668934]
We propose a novel framework which augments the latent space of a Variational Autoencoder with a disentangled space and is trained using a Wake-Sleep-inspired two-step algorithm for unsupervised disentanglement.
Our network learns to disentangle interpretable, independent factors from the data one at a time, and encodes them in different dimensions of the disentangled latent space, while making no prior assumptions about the number of factors or their joint distribution.
arXiv Detail & Related papers (2022-10-19T22:53:02Z)
- Interventional Causal Representation Learning [75.18055152115586]
Causal representation learning seeks to extract high-level latent factors from low-level sensory data.
Can interventional data facilitate causal representation learning?
We show that interventional data often carries geometric signatures of the latent factors' support.
arXiv Detail & Related papers (2022-09-24T04:59:03Z)
- Leveraging Relational Information for Learning Weakly Disentangled Representations [11.460692362624533]
Disentanglement is a difficult property to enforce in neural representations.
We present an alternative view over learning (weakly) disentangled representations.
arXiv Detail & Related papers (2022-05-20T09:58:51Z)
- Exploiting Independent Instruments: Identification and Distribution Generalization [3.701112941066256]
We exploit the independence for distribution generalization by taking into account higher moments.
We prove that the proposed estimator is invariant to distributional shifts on the instruments.
These results hold even in the under-identified case where the instruments are not sufficiently rich to identify the causal function.
arXiv Detail & Related papers (2022-02-03T21:49:04Z)
- Is Disentanglement enough? On Latent Representations for Controllable Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z)
- Where and What? Examining Interpretable Disentangled Representations [96.32813624341833]
Capturing interpretable variations has long been one of the goals in disentanglement learning.
Unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting.
In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to be interpreted and what to be interpreted.
arXiv Detail & Related papers (2021-04-07T11:22:02Z)
- A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation [63.042651834453544]
We show that the unsupervised learning of disentangled representations is impossible without inductive biases on both the models and the data.
We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision.
Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision.
arXiv Detail & Related papers (2020-10-27T10:17:15Z)
- Fairness Under Feature Exemptions: Counterfactual and Observational Measures [34.5472206536785]
We propose an information-theoretic decomposition of the total disparity into two components.
A non-exempt component quantifies the part that cannot be accounted for by the critical features, and an exempt component quantifies the remaining disparity.
We perform case studies to show how one can audit/train models while reducing non-exempt disparity.
arXiv Detail & Related papers (2020-06-14T19:14:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.