I Don't Need $\mathbf{u}$: Identifiable Non-Linear ICA Without Side
Information
- URL: http://arxiv.org/abs/2106.05238v1
- Date: Wed, 9 Jun 2021 17:22:08 GMT
- Title: I Don't Need $\mathbf{u}$: Identifiable Non-Linear ICA Without Side
Information
- Authors: Matthew Willetts, Brooks Paige
- Abstract summary: We introduce a new approach for identifiable non-linear ICA models.
In particular, we focus on generative models which perform clustering in their latent space.
- Score: 13.936583337756883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work we introduce a new approach for identifiable non-linear ICA
models. Recently there has been a renaissance in identifiability results in
deep generative models, not least for non-linear ICA. These prior works,
however, have assumed access to a sufficiently-informative auxiliary set of
observations, denoted $\mathbf{u}$. We show here how identifiability can be
obtained in the absence of this side-information, rendering possible
fully-unsupervised identifiable non-linear ICA. While previous theoretical
results have established the impossibility of identifiable non-linear ICA in
the presence of infinitely-flexible universal function approximators, here we
rely on the intrinsically-finite modelling capacity of any particular chosen
parameterisation of a deep generative model. In particular, we focus on
generative models which perform clustering in their latent space -- a model
structure which matches previous identifiable models, but with the learnt
clustering providing a synthetic form of auxiliary information. We evaluate our
proposals using VAEs, on synthetic and image datasets, and find that the
learned clusterings function effectively: deep generative models with latent
clusterings are empirically identifiable, to the same degree as models which
rely on side information.
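The abstract's core idea can be illustrated with a minimal generative sketch. This is an assumption-laden toy, not the paper's implementation: a discrete component `c` selects a Gaussian cluster, a latent `z` is drawn from that cluster, and a fixed non-linear map mixes `z` into observations `x`. The inferred cluster assignment plays the role that the auxiliary variable $\mathbf{u}$ played in earlier identifiability results.

```python
import numpy as np

# Toy sketch (illustrative, not the paper's code): a generative model whose
# latent space is clustered. The learnt clustering supplies a synthetic form
# of the side information u required by prior identifiable non-linear ICA.

rng = np.random.default_rng(0)

K, D = 3, 2                                   # clusters, latent dimension
means = rng.normal(scale=3.0, size=(K, D))    # cluster means (assumed values)
weights = np.full(K, 1.0 / K)                 # uniform mixture weights

def sample_latents(n):
    """Draw (c, z) from the mixture-of-Gaussians prior over the latent space."""
    c = rng.choice(K, size=n, p=weights)      # discrete cluster assignment
    z = means[c] + rng.normal(size=(n, D))    # Gaussian latent within cluster
    return c, z

def mixing(z):
    """A fixed, finite-capacity non-linear mixing f: z -> x (illustrative)."""
    return np.tanh(z @ np.array([[1.0, 0.5], [-0.5, 1.0]])) + 0.1 * z

c, z = sample_latents(1000)
x = mixing(z)
print(x.shape)  # (1000, 2)
```

In the paper's setting a VAE with such a clustered latent space is trained on `x` alone; here the point is only the model structure: `c` is never observed, yet it provides the conditioning signal that side-information-based identifiability proofs rely on.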
Related papers
- BINDy -- Bayesian identification of nonlinear dynamics with reversible-jump Markov-chain Monte-Carlo [0.0]
Model parsimony is an important cognitive bias in data-driven modelling that aids interpretability and helps to prevent over-fitting.
Sparse identification of nonlinear dynamics (SINDy) methods are able to learn sparse representations of complex dynamics directly from data.
A novel Bayesian treatment of dictionary learning system identification, as an alternative to SINDy, is envisaged.
arXiv Detail & Related papers (2024-08-15T10:03:30Z) - Towards a mathematical understanding of learning from few examples with
nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z) - Learning from few examples with nonlinear feature maps [68.8204255655161]
We explore the phenomenon and reveal key relationships between the dimensionality of an AI model's feature space, the non-degeneracy of data distributions, and the model's generalisation capabilities.
The main thrust of our present analysis is on the influence of nonlinear feature transformations mapping original data into higher- and possibly infinite-dimensional spaces on the resulting model's generalisation capabilities.
arXiv Detail & Related papers (2022-03-31T10:36:50Z) - Model-agnostic multi-objective approach for the evolutionary discovery
of mathematical models [55.41644538483948]
In modern data science, it is often more valuable to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z) - Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z) - Disentangling Identifiable Features from Noisy Data with Structured
Nonlinear ICA [4.340954888479091]
We introduce a new general identifiable framework for principled disentanglement referred to as Structured Independent Component Analysis (SNICA).
Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models.
We establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution.
arXiv Detail & Related papers (2021-06-17T15:56:57Z) - On Linear Identifiability of Learned Representations [26.311880922890843]
We study identifiability in the context of representation learning.
We show that a large family of discriminative models are identifiable in function space, up to a linear indeterminacy.
We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.
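"Identifiable up to a linear indeterminacy" has a simple empirical reading, sketched below under assumed data (not the paper's experiments): if two representations of the same inputs differ only by an invertible linear map, a least-squares fit of one on the other should reconstruct it almost exactly.

```python
import numpy as np

# Illustrative check of linear identifiability (assumed toy data): h1 and h2
# differ only by an unknown invertible linear map A, so regressing h2 on h1
# recovers h2 up to numerical error.

rng = np.random.default_rng(1)
h1 = rng.normal(size=(500, 4))      # representation from "model 1"
A = rng.normal(size=(4, 4))         # the unknown linear indeterminacy
h2 = h1 @ A                         # representation from "model 2"

W, *_ = np.linalg.lstsq(h1, h2, rcond=None)
residual = np.abs(h1 @ W - h2).max()
print(residual)
```

A near-zero residual is the signature of linear identifiability; genuinely different (non-linearly related) representations would leave a large residual.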
arXiv Detail & Related papers (2020-07-01T23:33:37Z) - Hidden Markov Nonlinear ICA: Unsupervised Learning from Nonstationary
Time Series [0.0]
We show how to combine nonlinear Independent Component Analysis with a Hidden Markov Model.
We prove identifiability of the proposed model for a general mixing nonlinearity, such as a neural network.
We achieve a new nonlinear ICA framework which is unsupervised, more efficient, and able to model underlying temporal dynamics.
arXiv Detail & Related papers (2020-06-22T10:01:15Z) - Bayesian Sparse Factor Analysis with Kernelized Observations [67.60224656603823]
Multi-view problems can be addressed with latent variable models.
High-dimensionality and non-linearity are traditionally handled by kernel methods.
We propose merging both approaches into a single model.
arXiv Detail & Related papers (2020-06-01T14:25:38Z) - ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on
Nonlinear ICA [11.919315372249802]
We consider the identifiability theory of probabilistic models.
We show that our model can be used for the estimation of the components in the framework of Independently Modulated Component Analysis.
arXiv Detail & Related papers (2020-02-26T14:43:30Z) - Learning Bijective Feature Maps for Linear ICA [73.85904548374575]
We show that existing probabilistic deep generative models (DGMs), which are tailor-made for image data, underperform on non-linear ICA tasks.
To address this, we propose a DGM which combines bijective feature maps with a linear ICA model to learn interpretable latent structures for high-dimensional data.
We create models that converge quickly, are easy to train, and achieve better unsupervised latent factor discovery than flow-based models, linear ICA, and Variational Autoencoders on images.
arXiv Detail & Related papers (2020-02-18T17:58:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.