On Linear Identifiability of Learned Representations
- URL: http://arxiv.org/abs/2007.00810v3
- Date: Wed, 8 Jul 2020 03:51:28 GMT
- Title: On Linear Identifiability of Learned Representations
- Authors: Geoffrey Roeder, Luke Metz and Diederik P. Kingma
- Abstract summary: We study identifiability in the context of representation learning.
We show that a large family of discriminative models are identifiable in function space, up to a linear indeterminacy.
We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.
- Score: 26.311880922890843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifiability is a desirable property of a statistical model: it implies
that the true model parameters may be estimated to any desired precision, given
sufficient computational resources and data. We study identifiability in the
context of representation learning: discovering nonlinear data representations
that are optimal with respect to some downstream task. When parameterized as
deep neural networks, such representation functions typically lack
identifiability in parameter space, because they are overparameterized by
design. In this paper, building on recent advances in nonlinear ICA, we aim to
rehabilitate identifiability by showing that a large family of discriminative
models are in fact identifiable in function space, up to a linear
indeterminacy. Many models used for representation learning in a wide variety
of domains, including models of text, images, and audio that were
state-of-the-art at the time of publication, are identifiable in this sense.
We derive sufficient conditions for
linear identifiability and provide empirical support for the result on both
simulated and real-world data.
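The paper's central claim is that representations learned by two such models agree up to an invertible linear map. A minimal way to probe this empirically is to fit a linear regression from one model's representations to the other's and check the quality of the fit. The sketch below is a hypothetical illustration (not the authors' code): instead of training two networks, it simulates two representations related by an unknown linear map plus small noise, and verifies that ordinary least squares recovers the relationship.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate representations from two models trained on the same task, where
# f2(x) ~= A @ f1(x) for some invertible matrix A (linear identifiability).
n, d = 1000, 16
z1 = rng.normal(size=(n, d))                     # representations from model 1
A = rng.normal(size=(d, d))                      # unknown invertible linear map
z2 = z1 @ A.T + 0.01 * rng.normal(size=(n, d))   # model 2, with small noise

# Fit a linear map from z1 to z2 by least squares and measure the fit.
A_hat, *_ = np.linalg.lstsq(z1, z2, rcond=None)
residual = z2 - z1 @ A_hat
r2 = 1.0 - residual.var() / z2.var()
print(round(r2, 3))  # close to 1.0: the two representations match up to a linear map
```

With real networks, z1 and z2 would come from two training runs (different seeds or architectures) evaluated on the same inputs; a near-perfect linear fit is the signature of linear identifiability the paper describes.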
Related papers
- Identifiability of a statistical model with two latent vectors: Importance of the dimensionality relation and application to graph embedding [2.6651200086513107]
Identifiability of statistical models is a key notion in unsupervised representation learning.
This paper proposes a statistical model of two latent vectors with single auxiliary data generalizing nonlinear ICA.
Surprisingly, we prove that the indeterminacies of the proposed model are the same as those of linear ICA under certain conditions.
arXiv Detail & Related papers (2024-05-30T07:11:20Z) - Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z) - Beyond Convergence: Identifiability of Machine Learning and Deep
Learning Models [0.0]
We investigate the notion of model parameter identifiability through a case study focused on parameter estimation from motion sensor data.
We employ a deep neural network to estimate subject-wise parameters, including mass, stiffness, and equilibrium leg length.
The results show that while certain parameters can be identified from the observation data, others remain unidentifiable.
arXiv Detail & Related papers (2023-07-21T03:40:53Z) - Posterior Collapse and Latent Variable Non-identifiability [54.842098835445]
We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility.
Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
arXiv Detail & Related papers (2023-01-02T06:16:56Z) - Indeterminacy in Latent Variable Models: Characterization and Strong
Identifiability [3.959606869996233]
We construct a theoretical framework for analyzing the indeterminacies of latent variable models.
We then investigate how we might specify strongly identifiable latent variable models.
arXiv Detail & Related papers (2022-06-02T00:01:27Z) - Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z) - Data-SUITE: Data-centric identification of in-distribution incongruous
examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z) - It's FLAN time! Summing feature-wise latent representations for
interpretability [0.0]
We propose a novel class of structurally-constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks).
FLANs process each input feature separately, computing for each of them a representation in a common latent space.
These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction.
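The FLAN summary above fully specifies the forward pass: embed each feature separately into a shared latent space, sum, then predict. A minimal numpy sketch of this structure (hypothetical, with random affine maps standing in for the per-feature subnetworks; not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

n_features, latent_dim = 4, 8

# One small "network" per input feature: here just a random affine map
# followed by tanh, standing in for each feature-wise subnetwork.
weights = [rng.normal(size=(1, latent_dim)) for _ in range(n_features)]
biases = [rng.normal(size=latent_dim) for _ in range(n_features)]
readout = rng.normal(size=latent_dim)  # linear head on the aggregated latent

def flan_forward(x):
    """x: (batch, n_features) -> predictions of shape (batch,)."""
    # Each feature is embedded into the common latent space separately...
    latents = [np.tanh(x[:, [i]] @ weights[i] + biases[i])
               for i in range(n_features)]
    # ...then the feature-wise latents are simply summed, and the
    # aggregated representation is used for prediction.
    z = np.sum(latents, axis=0)          # (batch, latent_dim)
    return z @ readout                   # (batch,)

x = rng.normal(size=(5, n_features))
print(flan_forward(x).shape)  # (5,)
```

Because the prediction is additive over per-feature latents, each feature's contribution can be inspected in isolation, which is the source of the interpretability claimed in the title.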
arXiv Detail & Related papers (2021-06-18T12:19:33Z) - I Don't Need $\mathbf{u}$: Identifiable Non-Linear ICA Without Side
Information [13.936583337756883]
We introduce a new approach for identifiable non-linear ICA models.
In particular, we focus on generative models which perform clustering in their latent space.
arXiv Detail & Related papers (2021-06-09T17:22:08Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Structural Causal Models Are (Solvable by) Credal Networks [70.45873402967297]
Causal inferences can be obtained by standard algorithms for the updating of credal nets.
This contribution should be regarded as a systematic approach to represent structural causal models by credal networks.
Experiments show that approximate algorithms for credal networks can immediately be used to do causal inference in real-size problems.
arXiv Detail & Related papers (2020-08-02T11:19:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information presented) and is not responsible for any consequences arising from its use.