Curvature-Sensitive Predictive Coding with Approximate Laplace Monte Carlo
- URL: http://arxiv.org/abs/2303.04976v1
- Date: Thu, 9 Mar 2023 01:29:58 GMT
- Title: Curvature-Sensitive Predictive Coding with Approximate Laplace Monte Carlo
- Authors: Umais Zahid, Qinghai Guo, Karl Friston, Zafeirios Fountas
- Abstract summary: Predictive coding (PC) accounts of perception now form one of the dominant computational theories of the brain.
Despite this, they have enjoyed little export to the broader field of machine learning.
This has been due to the poor performance of models trained with PC when evaluated by both sample quality and marginal likelihood.
- Score: 1.1470070927586016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive coding (PC) accounts of perception now form one of the dominant
computational theories of the brain, where they prescribe a general algorithm
for inference and learning over hierarchical latent probabilistic models.
Despite this, they have enjoyed little export to the broader field of machine
learning, where comparative generative modelling techniques have flourished. In
part, this has been due to the poor performance of models trained with PC when
evaluated by both sample quality and marginal likelihood. By adopting the
perspective of PC as a variational Bayes algorithm under the Laplace
approximation, we identify the source of these deficits to lie in the exclusion
of an associated Hessian term in the PC objective function, which would
otherwise regularise the sharpness of the probability landscape and prevent
over-certainty in the approximate posterior. To remedy this, we make three
primary contributions: we begin by suggesting a simple Monte Carlo estimated
evidence lower bound which relies on sampling from the Hessian-parameterised
variational posterior. We then derive a novel block diagonal approximation to
the full Hessian matrix that has lower memory requirements and favourable
mathematical properties. Lastly, we present an algorithm that combines our
method with standard PC to reduce memory complexity further. We evaluate models
trained with our approach against the standard PC framework on image benchmark
datasets. Our approach produces higher log-likelihoods and qualitatively better
samples that more closely capture the diversity of the data-generating
distribution.
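To make the proposed objective concrete, here is a minimal sketch (not the authors' implementation) of a Monte Carlo ELBO under a Laplace-style Gaussian posterior q(z) = N(mu, H^{-1}), where H is the Hessian of the negative log joint at the mode mu. The toy linear-Gaussian log_joint is an illustrative assumption, and the full Hessian is used for clarity rather than the paper's block diagonal approximation.

```python
# A hedged sketch: Monte Carlo ELBO with a Hessian-parameterised Gaussian
# posterior. The toy linear-Gaussian log_joint is an assumption for
# illustration; the paper targets hierarchical latent models.
import jax
import jax.numpy as jnp

def log_joint(z, x):
    # Toy model: standard Gaussian prior on z, likelihood x ~ N(W z, I).
    W = jnp.array([[1.0, 0.5], [0.0, 2.0]])
    return -0.5 * jnp.sum(z ** 2) - 0.5 * jnp.sum((x - W @ z) ** 2)

def mc_laplace_elbo(mu, x, key, num_samples=16):
    d = mu.shape[0]
    # Posterior covariance is the inverse Hessian of the negative log joint.
    H = -jax.hessian(log_joint)(mu, x)
    L = jnp.linalg.cholesky(jnp.linalg.inv(H))
    eps = jax.random.normal(key, (num_samples, d))
    zs = mu + eps @ L.T                              # z_s ~ N(mu, H^{-1})
    expected_joint = jnp.mean(jax.vmap(lambda z: log_joint(z, x))(zs))
    # The Gaussian entropy carries the -1/2 log det H term that standard PC
    # omits; it penalises over-sharp (over-certain) posteriors.
    entropy = 0.5 * d * (1.0 + jnp.log(2.0 * jnp.pi)) \
              - 0.5 * jnp.linalg.slogdet(H)[1]
    return expected_joint + entropy

x = jnp.array([1.0, -0.5])
print(mc_laplace_elbo(jnp.zeros(2), x, jax.random.PRNGKey(0)))
```

In this notation, the standard PC objective would be just log_joint(mu, x), with neither the sampling nor the entropy term.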
Related papers
- A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z)
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction (see the sketch after this entry).
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
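Under illustrative assumptions (an RBF kernel, heavy-ball momentum on random coordinate blocks; not the paper's exact stochastic dual descent), here is a minimal sketch of the dual view: minimising L(alpha) = 1/2 alpha^T (K + s2 I) alpha - alpha^T y recovers the representer weights of the GP posterior mean.

```python
# A hedged sketch of stochastic descent on the GP regression dual objective.
# Kernel, step size, and momentum constants are illustrative assumptions.
import jax
import jax.numpy as jnp

def rbf_kernel(X, Y, lengthscale=1.0):
    d2 = jnp.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-0.5 * d2 / lengthscale ** 2)

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (200, 3))
y = jnp.sin(X[:, 0])
n, s2 = 200, 0.1
K = rbf_kernel(X, X)

alpha = jnp.zeros(n)
velocity = jnp.zeros(n)
for step in range(2000):
    idx = jax.random.choice(jax.random.fold_in(key, step), n, (32,), replace=False)
    # Gradient of the dual objective, (K + s2 I) alpha - y, on a random block.
    g = K[idx] @ alpha + s2 * alpha[idx] - y[idx]
    velocity = velocity.at[idx].set(0.9 * velocity[idx] - 5e-3 * g)
    alpha = alpha.at[idx].add(velocity[idx])

print(jnp.mean((K @ alpha - y) ** 2))  # training fit of the posterior mean
```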
- Large-scale Bayesian Structure Learning for Gaussian Graphical Models using Marginal Pseudo-likelihood [0.26249027950824516]
We introduce two novel Markov chain Monte Carlo (MCMC) search algorithms with a significantly lower computational cost than leading Bayesian approaches.
These algorithms can deliver reliable results in mere minutes on standard computers, even for large-scale problems with one thousand variables.
We also illustrate the practical utility of our methods on medium and large-scale applications from human and mouse gene expression studies.
arXiv Detail & Related papers (2023-06-30T20:37:40Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification (see the sketch after this entry).
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
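As a hedged illustration of the general idea only (two components, distance-based soft assignments; not the paper's estimation procedure), the sketch below normalises each feature vector by responsibility-weighted mixture moments instead of a single batch mean and variance.

```python
# A hedged sketch of mixture-based feature normalisation. Component moments
# and soft responsibilities are illustrative assumptions.
import jax
import jax.numpy as jnp

def mixture_normalise(feats, means, variances, eps=1e-5):
    # feats: (N, D); means, variances: (K, D) per-component moments.
    d2 = jnp.sum((feats[:, None, :] - means[None, :, :]) ** 2, axis=-1)  # (N, K)
    resp = jax.nn.softmax(-d2, axis=1)   # soft assignment to components
    mu = resp @ means                    # responsibility-weighted mean, (N, D)
    var = resp @ variances               # responsibility-weighted variance
    return (feats - mu) / jnp.sqrt(var + eps)

feats = jax.random.normal(jax.random.PRNGKey(0), (8, 4))
means = jnp.stack([jnp.zeros(4), jnp.ones(4)])
variances = jnp.ones((2, 4))
print(mixture_normalise(feats, means, variances).shape)  # (8, 4)
```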
- GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
- Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits (see the sketch after this entry).
arXiv Detail & Related papers (2022-06-22T17:58:23Z)
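A minimal sketch of the idea on a linear contextual bandit (reward model, step sizes, and environment are illustrative assumptions): a few noisy gradient steps on the regularised squared loss yield an approximate posterior sample, and the arm maximising the sampled reward is pulled.

```python
# A hedged sketch of Langevin Monte Carlo Thompson sampling for a toy
# linear contextual bandit. All constants are illustrative assumptions.
import jax
import jax.numpy as jnp

def loss(theta, X, y, lam=1.0):
    # Negative log posterior up to constants: squared error + Gaussian prior.
    return 0.5 * jnp.sum((X @ theta - y) ** 2) + 0.5 * lam * jnp.sum(theta ** 2)

def lmc_sample(theta, X, y, key, step=1e-3, n_steps=50):
    g = jax.grad(loss)
    for i in range(n_steps):
        noise = jax.random.normal(jax.random.fold_in(key, i), theta.shape)
        theta = theta - step * g(theta, X, y) + jnp.sqrt(2.0 * step) * noise
    return theta

key = jax.random.PRNGKey(0)
theta_true = jnp.array([1.0, -2.0, 0.5])
theta = jnp.zeros(3)
X_hist, y_hist = jnp.zeros((0, 3)), jnp.zeros((0,))
for t in range(100):
    k_ctx, k_noise, k_lmc, key = jax.random.split(jax.random.fold_in(key, t), 4)
    arms = jax.random.normal(k_ctx, (5, 3))           # candidate contexts
    theta = lmc_sample(theta, X_hist, y_hist, k_lmc)  # approximate posterior draw
    a = jnp.argmax(arms @ theta)                      # Thompson choice
    r = arms[a] @ theta_true + 0.1 * jax.random.normal(k_noise)
    X_hist = jnp.vstack([X_hist, arms[a][None]])
    y_hist = jnp.append(y_hist, r)
```

Warm-starting each round's chain from the previous sample keeps the per-round cost to a handful of gradient steps.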
- Posterior and Computational Uncertainty in Gaussian Processes [52.26904059556759]
Gaussian processes scale prohibitively with the size of the dataset.
Many approximation methods have been developed, which inevitably introduce approximation error.
This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior.
We develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended.
arXiv Detail & Related papers (2022-05-30T22:16:25Z)
- Last Layer Marginal Likelihood for Invariance Learning [12.00078928875924]
We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
arXiv Detail & Related papers (2021-06-14T15:40:51Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value (see the sketch after this entry).
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
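The underlying recipe is a score-function (REINFORCE) gradient on a discrete sequence policy. The position-wise categorical policy and toy reward below are illustrative assumptions, not the paper's conditional generative model.

```python
# A hedged sketch: REINFORCE with a mean-reward baseline for goal-directed
# generation of discrete sequences. Policy and reward are toy assumptions.
import jax
import jax.numpy as jnp

VOCAB, LEN = 10, 5

def reward(seqs):
    # Toy target property: count occurrences of token 7.
    return jnp.sum(seqs == 7, axis=-1).astype(jnp.float32)

def surrogate(logits, seqs, r):
    # Log-probability of each sampled sequence under the policy.
    logp = jax.nn.log_softmax(logits)[jnp.arange(LEN), seqs].sum(axis=-1)
    baseline = jnp.mean(r)                   # variance-reduction baseline
    return -jnp.mean((r - baseline) * logp)  # negative REINFORCE objective

logits = jnp.zeros((LEN, VOCAB))             # position-wise categorical policy
for step in range(200):
    key = jax.random.PRNGKey(step)
    seqs = jax.random.categorical(key, logits, axis=-1, shape=(64, LEN))
    r = reward(seqs)
    logits = logits - 0.5 * jax.grad(surrogate)(logits, seqs, r)
```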
- An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated stochastic gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network (see the sketch after this entry).
arXiv Detail & Related papers (2020-10-03T16:22:15Z)
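A minimal sketch of the general idea under a loud assumption: an RMSprop-style running second-moment estimate stands in for the paper's Hessian approximation, preconditioning both the drift and the injected noise of a stochastic gradient Langevin step.

```python
# A hedged sketch of diagonally preconditioned SGLD. The running squared
# gradient is an illustrative proxy for a Hessian approximation, and the
# anisotropic Gaussian target is a toy assumption.
import jax
import jax.numpy as jnp

def neg_log_post(theta):
    return 0.5 * jnp.sum(theta ** 2 / jnp.array([1.0, 10.0]))

grad_fn = jax.grad(neg_log_post)
theta = jnp.array([3.0, 3.0])
v = grad_fn(theta) ** 2                 # running diagonal curvature proxy
step, beta, damp = 1e-2, 0.99, 1e-5
key = jax.random.PRNGKey(0)
samples = []
for t in range(5000):
    g = grad_fn(theta)
    v = beta * v + (1 - beta) * g ** 2  # adaptive second-moment estimate
    precond = 1.0 / (jnp.sqrt(v) + damp)
    noise = jax.random.normal(jax.random.fold_in(key, t), theta.shape)
    # Drift and noise share the same local metric.
    theta = theta - step * precond * g + jnp.sqrt(2.0 * step * precond) * noise
    samples.append(theta)
# Empirical variances; compare against the target's (1, 10).
print(jnp.var(jnp.stack(samples[1000:]), axis=0))
```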
- Bayesian System ID: Optimal management of parameter, model, and measurement uncertainty [0.0]
We evaluate the robustness of a probabilistic formulation of system identification (ID) to sparse, noisy, and indirect data.
We show that the log posterior has improved geometric properties compared with the objective function surfaces of traditional methods.
arXiv Detail & Related papers (2020-03-04T22:48:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.