A theory of independent mechanisms for extrapolation in generative
models
- URL: http://arxiv.org/abs/2004.00184v2
- Date: Fri, 31 Dec 2021 18:33:04 GMT
- Title: A theory of independent mechanisms for extrapolation in generative
models
- Authors: Michel Besserve, Rémy Sun, Dominik Janzing and Bernhard Schölkopf
- Abstract summary: Generative models can be trained to emulate complex empirical data, but are they useful to make predictions in the context of previously unobserved environments?
We develop a theoretical framework to address this challenging situation by defining a weaker form of identifiability, based on the principle of independence of mechanisms.
We demonstrate on toy examples that classical stochastic gradient descent can hinder the model's extrapolation capabilities, suggesting independence of mechanisms should be enforced explicitly during training.
- Score: 20.794692397859755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models can be trained to emulate complex empirical data, but are
they useful to make predictions in the context of previously unobserved
environments? An intuitive idea to promote such extrapolation capabilities is
to have the architecture of such a model reflect a causal graph of the true data
generating process, such that one can intervene on each node independently of
the others. However, the nodes of this graph are usually unobserved, leading to
overparameterization and lack of identifiability of the causal structure. We
develop a theoretical framework to address this challenging situation by
defining a weaker form of identifiability, based on the principle of
independence of mechanisms. We demonstrate on toy examples that classical
stochastic gradient descent can hinder the model's extrapolation capabilities,
suggesting independence of mechanisms should be enforced explicitly during
training. Experiments on deep generative models trained on real world data
support these insights and illustrate how the extrapolation capabilities of
such models can be leveraged.
Related papers
- Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z)
- Likelihood Based Inference in Fully and Partially Observed Exponential Family Graphical Models with Intractable Normalizing Constants [4.532043501030714]
Probabilistic graphical models that encode an underlying Markov random field are fundamental building blocks of generative modeling.
This paper demonstrates that full likelihood-based analysis of these models is feasible in a computationally efficient manner.
arXiv Detail & Related papers (2024-04-27T02:58:22Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- Bayesian Networks for the robust and unbiased prediction of depression and its symptoms utilizing speech and multimodal data [65.28160163774274]
We apply a Bayesian framework to capture the relationships between depression, depression symptoms, and features derived from speech, facial expression and cognitive game data collected at thymia.
arXiv Detail & Related papers (2022-11-09T14:48:13Z)
- Bias-inducing geometries: an exactly solvable data model with fairness implications [13.690313475721094]
We introduce an exactly solvable high-dimensional model of data imbalance.
We analytically unpack the typical properties of learning models trained in this synthetic framework.
We obtain exact predictions for the observables that are commonly employed for fairness assessment.
arXiv Detail & Related papers (2022-05-31T16:27:57Z)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning [79.4957965474334]
A key goal of unsupervised representation learning is "inverting" a data generating process to recover its latent properties.
This paper asks, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?"
We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms.
arXiv Detail & Related papers (2021-10-29T14:04:08Z)
- Instance-Based Neural Dependency Parsing [56.63500180843504]
We develop neural models that possess an interpretable inference process for dependency parsing.
Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set.
arXiv Detail & Related papers (2021-09-28T05:30:52Z)
- Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods that formalize this goal and provide estimation procedures for practical application.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)
- Learning Opinion Dynamics From Social Traces [25.161493874783584]
We propose an inference mechanism for fitting a generative, agent-like model of opinion dynamics to real-world social traces.
We showcase our proposal by translating a classical agent-based model of opinion dynamics into its generative counterpart.
We apply our model to real-world data from Reddit to explore the long-standing question about the impact of the backfire effect.
arXiv Detail & Related papers (2020-06-02T14:48:17Z)
- Structural Regularization [0.0]
We propose a novel method for modeling data by using structural models based on economic theory as regularizers for statistical models.
We show that our method can outperform both the (misspecified) structural model and un-structural-regularized statistical models.
arXiv Detail & Related papers (2020-04-27T06:47:07Z)