Supervised structure learning
- URL: http://arxiv.org/abs/2311.10300v1
- Date: Fri, 17 Nov 2023 03:18:55 GMT
- Title: Supervised structure learning
- Authors: Karl J. Friston, Lancelot Da Costa, Alexander Tschantz, Alex Kiefer,
Tommaso Salvatori, Victorita Neacsu, Magnus Koudahl, Conor Heins, Noor Sajid,
Dimitrije Markovic, Thomas Parr, Tim Verbelen, Christopher L Buckley
- Abstract summary: It focuses on Bayesian model selection and the assimilation of training data or content, with a special emphasis on the order in which data are ingested.
A key move - in the ensuing schemes - is to place priors on the selection of models, based upon expected free energy.
The resulting scheme is first used to perform image classification on the MNIST dataset to illustrate the basic idea, and then tested on a more challenging problem of discovering models with dynamics.
- Score: 41.35046208072566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper concerns structure learning or discovery of discrete generative
models. It focuses on Bayesian model selection and the assimilation of training
data or content, with a special emphasis on the order in which data are
ingested. A key move - in the ensuing schemes - is to place priors on the
selection of models, based upon expected free energy. In this setting, expected
free energy reduces to a constrained mutual information, where the constraints
inherit from priors over outcomes (i.e., preferred outcomes). The resulting
scheme is first used to perform image classification on the MNIST dataset to
illustrate the basic idea, and then tested on a more challenging problem of
discovering models with dynamics, using a simple sprite-based visual
disentanglement paradigm and the Tower of Hanoi (cf., blocks world) problem. In
these examples, generative models are constructed autodidactically to recover
(i.e., disentangle) the factorial structure of latent states - and their
characteristic paths or dynamics.
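In the discrete setting above, the expected free energy of a candidate model decomposes into an epistemic term (the mutual information between latent states and outcomes) plus a pragmatic term (predicted outcomes scored against prior preferences), which is the constrained mutual information the abstract refers to. Below is a minimal numpy sketch of that decomposition used to score candidate models; the function name, array shapes, and the preference vector log_C are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def neg_efe_score(A, s_prior, log_C):
    """Hypothetical model score: the standard decomposition of (negative)
    expected free energy into an epistemic (mutual information) term and
    a pragmatic (preference-weighted) term.

    A       : (num_outcomes, num_states) likelihood p(o|s); columns sum to 1
    s_prior : (num_states,) prior over latent states
    log_C   : (num_outcomes,) log prior preferences over outcomes
    """
    eps = 1e-12
    o_pred = A @ s_prior                 # predictive distribution q(o)
    joint = A * s_prior                  # joint q(o, s) = p(o|s) q(s)
    # Epistemic value: mutual information I(S; O)
    mi = np.sum(joint * (np.log(joint + eps)
                         - np.log(o_pred[:, None] + eps)
                         - np.log(s_prior[None, :] + eps)))
    pragmatic = o_pred @ log_C           # expected log preference over outcomes
    return mi + pragmatic                # higher = lower expected free energy

# Toy comparison of two candidate models over 3 outcomes and 2 states:
A_sharp = np.array([[0.9, 0.05], [0.05, 0.9], [0.05, 0.05]])
A_vague = np.full((3, 2), 1 / 3)
s_prior = np.array([0.5, 0.5])
log_C = np.log(np.array([0.45, 0.45, 0.10]))   # made-up preferred outcomes
print(neg_efe_score(A_sharp, s_prior, log_C))  # informative model scores higher
print(neg_efe_score(A_vague, s_prior, log_C))
```

Under this scoring, a model whose likelihood mapping is informative about latent states (high mutual information) and whose predicted outcomes match the preferences is selected over a vague one.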
Related papers
- Generative Flow Networks: Theory and Applications to Structure Learning [7.6872614776094]
This thesis studies the problem of structure learning from a Bayesian perspective.
It introduces Generative Flow Networks (GFlowNets), which treat generation as a sequential decision-making problem (a toy sampler is sketched below).
arXiv Detail & Related papers (2025-01-09T17:47:17Z)
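As a loose illustration of "generation as a sequential decision-making problem", the sketch below builds a small DAG one edge at a time with a stochastic policy. The uniform policy and stopping rule are stand-ins: a trained GFlowNet would learn these so that complete structures are sampled in proportion to a reward, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dag(n_nodes, stop_prob=0.3):
    """Sample a DAG by sequentially adding edges: a GFlowNet-style
    trajectory, here with a fixed, untrained policy."""
    adj = np.zeros((n_nodes, n_nodes), dtype=int)
    # Only allow i -> j with i < j, so every intermediate state stays acyclic.
    candidates = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    while candidates and rng.random() > stop_prob:
        k = rng.integers(len(candidates))      # action: pick one legal edge
        i, j = candidates.pop(k)
        adj[i, j] = 1
    return adj                                 # terminal state (a structure)

print(sample_dag(4))
```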
- A Fixed-Point Approach for Causal Generative Modeling [20.88890689294816]
We propose a novel formalism for describing Structural Causal Models (SCMs) as fixed-point problems on causally ordered variables.
We establish the weakest known conditions for their unique recovery given the topological ordering (TO); a toy fixed-point iteration follows below.
arXiv Detail & Related papers (2024-04-10T12:29:05Z)
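The fixed-point view can be illustrated with a triangular map over causally ordered variables: iterating x <- f(x, u) converges in at most d steps for d variables, because each coordinate depends only on its predecessors in the ordering. The particular map below is a made-up example, not the paper's formalism.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x, u):
    """Triangular (causally ordered) structural map: x_i depends only on x_<i."""
    return np.array([
        u[0],                          # root cause
        0.8 * x[0] + u[1],             # x1 <- x0
        np.tanh(x[0] + x[1]) + u[2],   # x2 <- x0, x1
    ])

u = rng.normal(size=3)                 # exogenous noise
x = np.zeros(3)
for _ in range(3):                     # d iterations suffice for d variables
    x = f(x, u)
print(x)                               # the unique fixed point given u
```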
- Heat Death of Generative Models in Closed-Loop Learning [63.83608300361159]
We study the learning dynamics of generative models that are fed their own generated content back in addition to their original training dataset.
We show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial sampling temperature leads the model to degenerate (see the simulation sketch below).
arXiv Detail & Related papers (2024-04-02T21:51:39Z)
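The degeneration is visible in a toy closed loop: fit a Gaussian, sample from it at a temperature below one, refit on a mix of generated and external data, and repeat. With no fresh data the fitted spread collapses geometrically; a fresh-data fraction anchors it. The Gaussian model, temperature, and mixing fraction are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(2)

def closed_loop(frac_external, temp=0.9, n=1000, rounds=50):
    """Fit a Gaussian, sample at temperature `temp`, refit on a mix of
    generated and external data; returns the final fitted std."""
    data = rng.normal(0.0, 1.0, size=n)                   # original data
    for _ in range(rounds):
        mu, sigma = data.mean(), data.std()
        synthetic = rng.normal(mu, temp * sigma, size=n)  # model's own samples
        k = int(frac_external * n)
        fresh = rng.normal(0.0, 1.0, size=k)              # external data
        data = np.concatenate([synthetic[: n - k], fresh])
    return data.std()

print(closed_loop(0.0))   # temp < 1 and no fresh data: std collapses toward 0
print(closed_loop(0.5))   # enough external data each round keeps std near 1
```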
- Initial Guessing Bias: How Untrained Networks Favor Some Classes [0.09103230894909536]
We show that the structure of a deep neural network (DNN) can condition the model to assign all predictions to the same class, even before the beginning of training.
We prove that, besides dataset properties, the presence of this phenomenon is influenced by model choices, including dataset preprocessing methods.
We highlight theoretical consequences, such as the breakdown of node-permutation symmetry and the violation of self-averaging; a quick demonstration follows below.
arXiv Detail & Related papers (2023-06-01T15:37:32Z)
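The phenomenon is easy to reproduce: an untrained network with random weights often routes most inputs to the same class. The deep tanh MLP below (raw numpy, made-up sizes and initialization) is just one configuration where the effect shows up; it is not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(3)

depth, width, n_classes, n_inputs = 10, 64, 10, 1000
x = rng.normal(size=(n_inputs, width))          # stand-in "dataset"

h = x
for _ in range(depth):                          # untrained hidden layers
    W = rng.normal(size=(width, width)) / np.sqrt(width)
    b = rng.normal(size=width)
    h = np.tanh(h @ W + b)                      # representations collapse
logits = h @ (rng.normal(size=(width, n_classes)) / np.sqrt(width))

preds = logits.argmax(axis=1)
counts = np.bincount(preds, minlength=n_classes)
print(counts / n_inputs)  # before any training, one class often dominates
```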
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model class, namely "Denoising Diffusion Probabilistic Models" (DDPMs), for chirographic data.
Our model, named "ChiroDiff", is non-autoregressive: it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates (the forward-noising step is sketched below).
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
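The non-autoregressive ingredient is standard DDPM math applied to a whole stroke at once: the closed-form forward process noises every timestep of the sequence jointly, and the learned reverse process (omitted here) denoises it jointly rather than point by point. The stroke and schedule below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)

# A synthetic "chirographic" sample: a 2D pen trajectory of 100 points.
t = np.linspace(0, 2 * np.pi, 100)
x0 = np.stack([np.cos(t), np.sin(2 * t)], axis=1)      # (100, 2) stroke

# Standard DDPM noise schedule.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, step):
    """Closed-form forward process: noise the whole sequence at once."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[step]) * x0 + np.sqrt(1 - alpha_bar[step]) * eps

x_noisy = q_sample(x0, step=500)   # every point of the stroke noised jointly
print(x_noisy.shape)               # (100, 2): still one holistic sequence
```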
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding (a minimal version of the rule is sketched below).
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
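The message-passing rule in question propagates prediction errors: each node predicts its neighbours' activities, and free activities are nudged down the gradient of the total squared prediction error until the network settles. The graph, linear predictions, and step size below are made-up, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(5)

# Directed edges of a small graph: parent -> child predictions.
edges = [(0, 2), (1, 2), (2, 3)]
W = {e: rng.normal() * 0.1 for e in edges}
v = np.zeros(4)
v[0], v[1] = 1.0, -0.5                 # observed (clamped) nodes

for _ in range(200):                   # inference: settle the latent nodes
    pred = np.zeros(4)
    for (i, j), w in W.items():
        pred[j] += w * v[i]            # message: parent's prediction of child
    err = v - pred                     # per-node prediction error
    # Gradient of E = 0.5 * sum(err^2) w.r.t. each free value:
    grad = err.copy()
    for (i, j), w in W.items():
        grad[i] -= w * err[j]          # v_i also shapes predictions at j
    grad[[0, 1]] = 0.0                 # keep observed nodes clamped
    v -= 0.1 * grad
print(v)                               # settled node activities
```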
- Robust Model Selection of Gaussian Graphical Models [16.933125281564163]
Noise-corrupted samples present significant challenges in graphical model selection.
We propose an algorithm which provably recovers the underlying graph up to the identified ambiguity; the effect of noise is illustrated below.
The recovered graph structure is useful in a range of real-world problems, including power grids, social networks, protein-protein interactions, and neural structures.
arXiv Detail & Related papers (2022-11-10T16:50:50Z)
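To see the challenge, one can compare sparse precision-matrix estimates on clean versus noise-corrupted samples; measurement noise blurs the conditional-independence pattern the graph encodes. The graphical lasso used here is a generic off-the-shelf estimator, a stand-in for (not an implementation of) the paper's algorithm.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(6)

# A chain-structured Gaussian: sparse, tridiagonal precision matrix (4 edges).
d = 5
prec = np.eye(d) + 0.4 * (np.eye(d, k=1) + np.eye(d, k=-1))
cov = np.linalg.inv(prec)
clean = rng.multivariate_normal(np.zeros(d), cov, size=2000)
noisy = clean + rng.normal(0.0, 0.7, size=clean.shape)  # measurement noise

for name, data in [("clean", clean), ("noisy", noisy)]:
    est = GraphicalLassoCV().fit(data).precision_
    support = (np.abs(est) > 0.05) & ~np.eye(d, dtype=bool)
    print(name, support.sum() // 2, "edges in the estimated graph")
```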
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes factorizing the data-generating process into a set of modules, one per variable.
We study the generalization and adaptation performance of such modular neural causal models (a toy modular model is sketched below).
Our analysis shows that the modular neural causal models outperform other models on both zero- and few-shot adaptation in low-data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
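Modularity here means one conditional module per variable, and the payoff for adaptation is that an intervention only requires replacing the intervened module while the rest transfer unchanged. A toy version with hand-written linear-Gaussian modules (made-up numbers, not learned networks) is sketched below.

```python
import numpy as np

rng = np.random.default_rng(7)

# One module per variable: x0 ~ N(0,1); x1 <- 2*x0; x2 <- x1 - x0.
modules = {
    0: lambda x: rng.normal(0.0, 1.0),
    1: lambda x: 2.0 * x[0] + rng.normal(0.0, 0.1),
    2: lambda x: x[1] - x[0] + rng.normal(0.0, 0.1),
}

def sample(modules):
    x = np.zeros(3)
    for i in range(3):                 # follow the causal order
        x[i] = modules[i](x)
    return x

print(sample(modules))                 # observational sample

# Intervention do(x1 := 5): swap exactly one module, keep the others.
intervened = dict(modules)
intervened[1] = lambda x: 5.0
print(sample(intervened))              # only the x1 module changed
```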
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We instead train a variational inference model to predict the causal structure directly from observational/interventional data (the input/output shape of such a model is sketched below).
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
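Amortization replaces per-dataset search with a single network that maps a whole dataset straight to edge beliefs. The forward pass below (raw numpy, random untrained weights, pairwise correlations as a summary feature) only illustrates that dataset-in, adjacency-probabilities-out interface; the real model and its variational training are in the paper, not here.

```python
import numpy as np

rng = np.random.default_rng(8)

def predict_edges(data, W1, W2):
    """Map a dataset (n_samples, d) to a (d, d) matrix of edge probabilities."""
    d = data.shape[1]
    corr = np.corrcoef(data, rowvar=False)   # order-invariant summary feature
    feats = corr.ravel()                     # one feature per ordered pair
    h = np.tanh(feats[:, None] * W1)         # (d*d, hidden)
    logits = h @ W2                          # (d*d,)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs.reshape(d, d)

d, hidden = 4, 8
W1 = rng.normal(size=hidden)                 # untrained illustrative weights
W2 = rng.normal(size=hidden)
data = rng.normal(size=(500, d))
print(predict_edges(data, W1, W2))           # a trained model would be fit on
                                             # simulated datasets with known graphs
```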
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value; a toy REINFORCE loop is sketched after the link below.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
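Directly optimizing an expected reward over discrete structures is classically done with the score-function (REINFORCE) estimator: sample structures from a parameterized policy and weight the log-probability gradient by the reward. The toy task below, rewarding length-3 digit strings whose digits sum to a target, is a stand-in for the paper's molecule and Python-expression tasks.

```python
import numpy as np

rng = np.random.default_rng(9)

vocab, length, target = 10, 3, 15
logits = np.zeros((length, vocab))         # independent per-position policy

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(2000):
    probs = np.array([softmax(l) for l in logits])
    seq = [rng.choice(vocab, p=probs[i]) for i in range(length)]
    reward = 1.0 if sum(seq) == target else 0.0
    # REINFORCE: reward-weighted grad of log pi; for a softmax policy the
    # per-position gradient is (one-hot of chosen token) - probs.
    for i, tok in enumerate(seq):
        grad = -probs[i]
        grad[tok] += 1.0
        logits[i] += 0.5 * reward * grad
probs = np.array([softmax(l) for l in logits])
seq = [int(np.argmax(probs[i])) for i in range(length)]
print(seq, sum(seq))   # the most likely sequence tends to hit the target sum
```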