Backpropagation at the Infinitesimal Inference Limit of Energy-Based
Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive
Hebbian Learning
- URL: http://arxiv.org/abs/2206.02629v1
- Date: Tue, 31 May 2022 20:48:52 GMT
- Title: Backpropagation at the Infinitesimal Inference Limit of Energy-Based
Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive
Hebbian Learning
- Authors: Beren Millidge, Yuhang Song, Tommaso Salvatori, Thomas Lukasiewicz,
Rafal Bogacz
- Abstract summary: How the brain performs credit assignment is a fundamental unsolved problem in neuroscience.
Many `biologically plausible' algorithms have been proposed, which compute gradients that approximate those computed by backpropagation (BP).
- Score: 41.58529335439799
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How the brain performs credit assignment is a fundamental unsolved problem in
neuroscience. Many `biologically plausible' algorithms have been proposed,
which compute gradients that approximate those computed by backpropagation
(BP), and which operate in ways that more closely satisfy the constraints
imposed by neural circuitry. Many such algorithms utilize the framework of
energy-based models (EBMs), in which all free variables in the model are
optimized to minimize a global energy function. However, in the literature,
these algorithms exist in isolation and no unified theory exists linking them
together. Here, we provide a comprehensive theory of the conditions under which
EBMs can approximate BP, which lets us unify many of the BP approximation
results in the literature (namely, predictive coding, equilibrium propagation,
and contrastive Hebbian learning) and demonstrate that their approximation to
BP arises from a simple and general mathematical property of EBMs at free-phase
equilibrium. This property can then be exploited in different ways with
different energy functions, and these specific choices yield a family of
BP-approximating algorithms, which both includes the known results in the
literature and can be used to derive new ones.
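The central property — that all internal prediction errors vanish at the free-phase equilibrium, so a weak clamping of the output yields weight updates proportional to BP gradients to first order — can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a two-layer *linear* network and an equilibrium-propagation-style nudge of strength `beta`; the layer sizes, the random weights, and the `pc_ep_grads` helper are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 5, 3

# Hypothetical two-layer linear network: the simplest setting in which the
# free-phase-equilibrium property can be verified numerically.
W1 = rng.normal(size=(n_hid, n_in)) / np.sqrt(n_in)
W2 = rng.normal(size=(n_out, n_hid)) / np.sqrt(n_hid)
x0 = rng.normal(size=(n_in, 1))      # clamped input
y = rng.normal(size=(n_out, 1))      # supervised target

y_hat = W2 @ W1 @ x0                 # feedforward (free-phase) prediction

# Backprop gradients of the loss L = 0.5 * ||y_hat - y||^2
delta2 = y_hat - y
g2_bp = delta2 @ (W1 @ x0).T         # dL/dW2
g1_bp = (W2.T @ delta2) @ x0.T       # dL/dW1

def pc_ep_grads(beta):
    """Minimize the weakly nudged energy
        F = 0.5||x1 - W1 x0||^2 + 0.5||x2 - W2 x1||^2 + beta * 0.5||x2 - y||^2
    over the free activities (x1, x2).  For a linear network the equilibrium
    conditions dF/dx = 0 form a linear system, so we solve them exactly
    instead of iterating inference.  Returns (1/beta) * dF/dW for each layer.
    """
    A = np.block([
        [np.eye(n_hid) + W2.T @ W2, -W2.T],
        [-W2, (1 + beta) * np.eye(n_out)],
    ])
    b = np.vstack([W1 @ x0, beta * y])
    z = np.linalg.solve(A, b)
    x1, x2 = z[:n_hid], z[n_hid:]
    e1 = x1 - W1 @ x0                # layer-wise prediction errors
    e2 = x2 - W2 @ x1
    # dF/dW_l = -e_l @ (presynaptic activity)^T; scale by 1/beta so the
    # infinitesimal limit beta -> 0 is well defined.
    return -(e2 @ x1.T) / beta, -(e1 @ x0.T) / beta

for beta in (1e-1, 1e-4):
    g2, g1 = pc_ep_grads(beta)
    rel = (np.linalg.norm(g2 - g2_bp) + np.linalg.norm(g1 - g1_bp)) \
        / (np.linalg.norm(g2_bp) + np.linalg.norm(g1_bp))
    print(f"beta={beta:g}  relative deviation from BP gradients: {rel:.2e}")
```

At the free phase (`beta = 0`) the equilibrium activities coincide with the feedforward pass and every error is zero; for small `beta` the scaled equilibrium updates deviate from the BP gradients only at O(beta), which is the first-order Taylor argument the abstract refers to. Note that this sketch uses weak nudging rather than hard output clamping; with hard clamping the equilibrium updates converge to a preconditioned (Gauss-Newton-like) direction instead.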
Related papers
- Connection between single-layer Quantum Approximate Optimization Algorithm interferometry and thermal distributions sampling [0.0]
We extend the theoretical derivation of the amplitudes of the eigenstates, and the Boltzmann distributions generated by single-layer QAOA.
We also review the implications that this behavior has from both a practical and fundamental perspective.
arXiv Detail & Related papers (2023-10-13T15:06:58Z)
- On Feature Diversity in Energy-based Models [98.78384185493624]
An energy-based model (EBM) is typically formed of inner-model(s) that learn a combination of the different features to generate an energy mapping for each input configuration.
We extend the probably approximately correct (PAC) theory of EBMs and analyze the effect of redundancy reduction on the performance of EBMs.
arXiv Detail & Related papers (2023-06-02T12:30:42Z)
- Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED).
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs).
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z)
- Sum-of-Squares Relaxations for Information Theory and Variational Inference [0.0]
We consider extensions of the Shannon relative entropy, referred to as $f$-divergences.
We derive a sequence of convex relaxations for computing these divergences.
We provide more efficient relaxations based on spectral information divergences from quantum information theory.
arXiv Detail & Related papers (2022-06-27T13:22:40Z)
- Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
- Hybridized Methods for Quantum Simulation in the Interaction Picture [69.02115180674885]
We provide a framework that allows different simulation methods to be hybridized and thereby improve performance for interaction picture simulations.
Physical applications of these hybridized methods yield a gate complexity scaling as $\log^2 \Lambda$ in the electric cutoff $\Lambda$.
For the general problem of Hamiltonian simulation subject to dynamical constraints, these methods yield a query complexity independent of the penalty parameter $\lambda$ used to impose an energy cost.
arXiv Detail & Related papers (2021-09-07T20:01:22Z)
- Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks [41.702175127106784]
Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation.
We propose a dual formulation of the EBM algorithm in which the particles are sometimes restarted at random samples drawn from the data set, and show that performing these restarts corresponds to a score matching step.
These results are illustrated in simple numerical experiments.
arXiv Detail & Related papers (2021-07-11T21:43:18Z)
- Predictive Coding Can Do Exact Backpropagation on Any Neural Network [40.51949948934705]
We generalize inference learning (IL) and Z-IL by directly defining them on computational graphs.
This is the first biologically plausible algorithm shown to be equivalent to BP in how it updates parameters on any neural network.
arXiv Detail & Related papers (2021-03-08T11:52:51Z)
- A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z)
- Generalization of the hierarchical equations of motion theory for efficient calculations with arbitrary correlation functions [0.0]
The hierarchical equations of motion (HEOM) theory is one of the standard methods to rigorously describe open quantum dynamics coupled to harmonic environments.
In this article, we present a new formulation of the HEOM theory including treatments of non-exponential correlation functions.
arXiv Detail & Related papers (2020-03-13T07:07:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.