On the Dynamics of Inference and Learning
- URL: http://arxiv.org/abs/2204.12939v1
- Date: Tue, 19 Apr 2022 18:04:36 GMT
- Title: On the Dynamics of Inference and Learning
- Authors: David S. Berman, Jonathan J. Heckman, Marc Klinger
- Abstract summary: We present a treatment of this Bayesian updating process as a continuous dynamical system.
We show that when the Cramér-Rao bound is saturated, the learning rate is governed by a simple $1/T$ power-law.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical Inference is the process of determining a probability
distribution over the space of parameters of a model given a data set. As more
data becomes available, this probability distribution is updated via the
application of Bayes' theorem. We present a treatment of this Bayesian updating
process as a continuous dynamical system. Statistical inference is then
governed by a first order differential equation describing a trajectory or flow
in the information geometry determined by a parametric family of models. We
solve this equation for some simple models and show that when the
Cramér-Rao bound is saturated, the learning rate is governed by a simple
$1/T$ power-law, with $T$ a time-like variable denoting the quantity of data.
The presence of hidden variables can be incorporated in this setting, leading
to an additional driving term in the resulting flow equation. We illustrate
this with both analytic and numerical examples based on Gaussians and Gaussian
random processes, and with inference of the coupling constant in the 1D Ising model.
Finally we compare the qualitative behaviour exhibited by Bayesian flows to the
training of various neural networks on benchmarked data sets such as MNIST and
CIFAR10, and show that for networks exhibiting small final losses the simple
power-law is also satisfied.
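
For intuition, the $1/T$ behaviour can be reproduced in a toy setting. The sketch below is not code from the paper; it assumes a conjugate Gaussian model (data drawn from $N(\mu, \sigma^2)$ with known $\sigma$ and a Gaussian prior on $\mu$, with arbitrarily chosen values) and updates the posterior one observation at a time. Because the posterior variance approaches $\sigma^2/T$, the size of each incremental update, i.e. the effective learning rate, decays like $1/T$, the Cramér-Rao-saturating regime described in the abstract.

```python
import numpy as np

# Minimal numerical sketch (not the paper's code), assuming a conjugate
# Gaussian model: x_t ~ N(mu, sigma^2) with known sigma and a Gaussian prior
# on mu. The posterior is updated one observation at a time; its variance,
# which sets the size of each incremental update, falls off like 1/T.

rng = np.random.default_rng(0)

true_mu, sigma = 1.5, 1.0        # data-generating parameters (assumed values)
post_mu, post_var = 0.0, 10.0    # broad conjugate prior N(0, 10)

T_max = 1000
data = rng.normal(true_mu, sigma, size=T_max)

post_vars = []
for x in data:
    # Conjugate Gaussian update: precisions add, means combine precision-weighted.
    precision = 1.0 / post_var + 1.0 / sigma**2
    post_mu = (post_mu / post_var + x / sigma**2) / precision
    post_var = 1.0 / precision
    post_vars.append(post_var)

# For large T the posterior variance approaches sigma^2 / T, so each new
# observation shifts the posterior by an amount that scales like 1/T.
for T in (10, 100, 1000):
    print(f"T = {T:4d}   posterior var = {post_vars[T - 1]:.5f}   sigma^2/T = {sigma**2 / T:.5f}")
```

Since the Gaussian-Gaussian model is conjugate, the posterior here is exact, so the example isolates the power-law scaling without any approximation error.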
Related papers
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Bayesian Flow Networks [4.585102332532472]
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference.
Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models.
BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.
arXiv Detail & Related papers (2023-08-14T09:56:35Z)
- A probabilistic, data-driven closure model for RANS simulations with aleatoric, model uncertainty [1.8416014644193066]
We propose a data-driven closure model for Reynolds-averaged Navier-Stokes (RANS) simulations that incorporates aleatoric, model uncertainty.
A fully Bayesian formulation is proposed, combined with a sparsity-inducing prior in order to identify regions in the problem domain where the parametric closure is insufficient.
arXiv Detail & Related papers (2023-07-05T16:53:31Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
- Amortised inference of fractional Brownian motion with linear computational complexity [0.0]
We introduce a simulation-based, amortised Bayesian inference scheme to infer the parameters of random walks.
Our approach learns the posterior distribution of the walks' parameters with a likelihood-free method.
We adapt this scheme to show that a finite decorrelation time in the environment can furthermore be inferred from individual trajectories.
arXiv Detail & Related papers (2022-03-15T14:43:16Z)
- Learning Summary Statistics for Bayesian Inference with Autoencoders [58.720142291102135]
We use the inner dimension of deep neural network based Autoencoders as summary statistics.
To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information that has been used to generate the training data.
arXiv Detail & Related papers (2022-01-28T12:00:31Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks (GANs), variational autoencoders (VAEs), and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows overcome a key limitation of such models by leveraging the change-of-variables formula for probability density functions (written out after this list).
The present work overcomes this by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z)
- Learned Factor Graphs for Inference from Stationary Time Sequences [107.63351413549992]
We propose a framework that combines model-based algorithms and data-driven ML tools for stationary time sequences.
Neural networks are developed to separately learn specific components of a factor graph describing the distribution of the time sequence.
We present an inference algorithm based on learned stationary factor graphs, which learns to implement the sum-product scheme from labeled data.
arXiv Detail & Related papers (2020-06-05T07:06:19Z)
- Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network [8.616180927172548]
We propose a "model-free" simulation of a dynamical system with unknown parameters without prior knowledge.
The deep learning model aims to jointly learn the nonlinear time marching operator and the effects of the unknown parameters from a time series dataset.
It is found that the proposed deep learning model is capable of correctly identifying the dimensions of the random parameters and learning a representation of complex time series data.
arXiv Detail & Related papers (2020-03-02T20:57:02Z)
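
As a side note for the Variational Mixture of Normalizing Flows entry above, the change-of-variables formula it relies on is the standard density-transformation identity: for an invertible, differentiable map $f$ sending data $x$ to a base variable $z = f(x)$ with base density $p_Z$,

```latex
p_X(x) = p_Z\bigl(f(x)\bigr)\,\left|\det \frac{\partial f(x)}{\partial x}\right| .
```

Composing several such maps and summing the corresponding log-determinant terms is what allows normalizing flows to evaluate exact log-likelihoods.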