Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks
- URL: http://arxiv.org/abs/2006.07123v2
- Date: Fri, 23 Oct 2020 17:00:59 GMT
- Title: Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks
- Authors: Roman Pogodin and Peter E. Latham
- Abstract summary: We present a family of learning rules that does not suffer from any of these problems.
The resulting rules have a 3-factor Hebbian structure.
They do not require precise labels; instead, they rely on the similarity between pairs of desired outputs.
- Score: 6.09170287691728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art machine learning approach to training deep neural
networks, backpropagation, is implausible for real neural networks: neurons
need to know their outgoing weights; training alternates between a bottom-up
forward pass (computation) and a top-down backward pass (learning); and the
algorithm often needs precise labels of many data points. Biologically
plausible approximations to backpropagation, such as feedback alignment, solve
the weight transport problem, but not the other two. Thus, fully biologically
plausible learning rules have so far remained elusive. Here we present a family
of learning rules that does not suffer from any of these problems. It is
motivated by the information bottleneck principle (extended with kernel
methods), in which networks learn to compress the input as much as possible
without sacrificing prediction of the output. The resulting rules have a
3-factor Hebbian structure: they require pre- and post-synaptic firing rates
and an error signal - the third factor - consisting of a global teaching signal
and a layer-specific term, both available without a top-down pass. They do not
require precise labels; instead, they rely on the similarity between pairs of
desired outputs. Moreover, to obtain good performance on hard problems and
retain biological plausibility, our rules need divisive normalization - a known
feature of biological networks. Finally, simulations show that our rules
perform nearly as well as backpropagation on image classification tasks.
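To make the structure described in the abstract concrete, below is a minimal NumPy sketch of a pairwise, 3-factor-style update driven by a label-similarity kernel, with a simple divisive normalization step. It is not the paper's exact rule: the tanh layer dynamics, the particular choice of layer-specific term, and all names (layer_update, label_kernel, divisive_normalization) are illustrative assumptions about how such a rule could be wired up.

import numpy as np

def label_kernel(y_i, y_j):
    # Similarity of desired outputs: +1 for matching labels, -1 otherwise (an assumed kernel).
    return 1.0 if y_i == y_j else -1.0

def divisive_normalization(r, eps=1e-6):
    # Scale the activity vector by its mean absolute activity (a simple stand-in
    # for the divisive normalization the abstract refers to).
    return r / (eps + np.mean(np.abs(r)))

def layer_update(W, r_pre_pair, y_pair, lr=1e-3):
    # One pairwise, 3-factor-style update for a single layer.
    # Factors: pre-synaptic rate, post-synaptic rate, and an error signal that
    # combines a global teaching signal (label similarity) with a layer-specific
    # term (here, assumed to be the similarity of the layer's own responses).
    (r_pre_a, r_pre_b), (y_a, y_b) = r_pre_pair, y_pair
    r_post_a = divisive_normalization(np.tanh(W @ r_pre_a))
    r_post_b = divisive_normalization(np.tanh(W @ r_pre_b))

    global_signal = label_kernel(y_a, y_b)       # shared by every layer
    layer_signal = float(r_post_a @ r_post_b)    # computed locally, no top-down pass
    third_factor = global_signal - layer_signal  # the error signal

    # Hebbian (pre x post) terms for the two inputs, gated by the shared third factor.
    dW = third_factor * (np.outer(r_post_a, r_pre_b) + np.outer(r_post_b, r_pre_a)) / 2
    return W + lr * dW

# Toy usage: one update on a random input pair with binary labels.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(20, 50))
x_a, x_b = rng.normal(size=50), rng.normal(size=50)
W = layer_update(W, (x_a, x_b), (0, 1))

The point the sketch tries to convey is that every quantity entering dW is either local to the layer (pre- and post-synaptic rates, the layer-specific term) or a single global scalar derived from the similarity of desired outputs, so no top-down backward pass is required.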
Related papers
- Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy [17.11946381948498]
Deep learning needs high-precision handling of forward signals, backpropagated errors, and weight updates.
It is challenging to implement deep learning in hardware systems that use noisy analog memristors as artificial synapses.
We propose a binary learning algorithm that modifies all elementary neural network operations.
arXiv Detail & Related papers (2023-04-25T14:38:36Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis [93.37576644429578]
This work establishes the first theoretical analysis for the known iterative self-training paradigm.
We prove the benefits of unlabeled data in both training convergence and generalization ability.
Experiments from shallow neural networks to deep neural networks are also provided to justify the correctness of our established theoretical insights on self-training.
arXiv Detail & Related papers (2022-01-21T02:16:52Z)
- Convergence and Alignment of Gradient Descent with Random Backpropagation Weights [6.338178373376447]
Gradient descent with backpropagation is the workhorse of artificial neural networks.
Lillicrap et al. propose a more biologically plausible "feedback alignment" algorithm that uses random and fixed backpropagation weights (see the sketch after this list).
arXiv Detail & Related papers (2021-06-10T20:58:05Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Biologically plausible single-layer networks for nonnegative independent component analysis [21.646490546361935]
We seek a biologically plausible single-layer neural network implementation of a blind source separation algorithm.
For biological plausibility, we require the network to satisfy three basic properties of neuronal circuits.
arXiv Detail & Related papers (2020-10-23T19:31:49Z)
- Local plasticity rules can learn deep representations using self-supervised contrastive predictions [3.6868085124383616]
Learning rules that respect biological constraints, yet yield deep hierarchical representations are still unknown.
We propose a learning rule that takes inspiration from neuroscience and recent advances in self-supervised deep learning.
We find that networks trained with this self-supervised and local rule build deep hierarchical representations of images, speech and video.
arXiv Detail & Related papers (2020-10-16T09:32:35Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks [2.398608007786179]
Long training times of deep neural networks are a bottleneck in machine learning research.
We provide a theoretical foundation for the choice of intra-layer topology.
We show that seemingly similar topologies can often have a large difference in attainable accuracy.
arXiv Detail & Related papers (2020-02-19T18:29:18Z)
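As referenced in the feedback-alignment entry above, the sketch below illustrates the core idea of that algorithm in NumPy: the backward pass uses a fixed random matrix B in place of the transpose of the forward weights, so neurons never need to know their outgoing weights. It is a minimal two-layer regression example under assumed layer sizes and names (fa_step, W1, W2, B), not the authors' implementation.

import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 30, 64, 10
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
B = rng.normal(scale=0.1, size=(n_hidden, n_out))  # fixed random feedback weights

def fa_step(x, y, W1, W2, lr=1e-2):
    # Forward pass.
    h = np.tanh(W1 @ x)
    y_hat = W2 @ h
    e = y_hat - y                        # output error
    # Backward pass: the fixed random matrix B replaces W2.T.
    delta_h = (B @ e) * (1 - h ** 2)
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(delta_h, x)
    return W1, W2

# Toy usage: one update on a random input/target pair.
x, y = rng.normal(size=n_in), rng.normal(size=n_out)
W1, W2 = fa_step(x, y, W1, W2)

Swapping B @ e for W2.T @ e recovers ordinary backpropagation; the fixed random feedback path is the only difference between the two updates.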
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.