Credit Assignment Through Broadcasting a Global Error Vector
- URL: http://arxiv.org/abs/2106.04089v2
- Date: Thu, 28 Oct 2021 20:31:37 GMT
- Title: Credit Assignment Through Broadcasting a Global Error Vector
- Authors: David G. Clark, L. F. Abbott, SueYeon Chung
- Abstract summary: Backpropagation (BP) uses detailed, unit-specific feedback to train deep neural networks (DNNs) with remarkable success.
Here, we explore the extent to which a globally broadcast learning signal, coupled with local weight updates, enables training of DNNs.
- Score: 4.683806391173103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backpropagation (BP) uses detailed, unit-specific feedback to train deep
neural networks (DNNs) with remarkable success. That biological neural circuits
appear to perform credit assignment, but cannot implement BP, implies the
existence of other powerful learning algorithms. Here, we explore the extent to
which a globally broadcast learning signal, coupled with local weight updates,
enables training of DNNs. We present both a learning rule, called global
error-vector broadcasting (GEVB), and a class of DNNs, called vectorized
nonnegative networks (VNNs), in which this learning rule operates. VNNs have
vector-valued units and nonnegative weights past the first layer. The GEVB
learning rule generalizes three-factor Hebbian learning, updating each weight
by an amount proportional to the inner product of the presynaptic activation
and a globally broadcast error vector when the postsynaptic unit is active. We
prove that these weight updates are matched in sign to the gradient, enabling
accurate credit assignment. Moreover, at initialization, these updates are
exactly proportional to the gradient in the limit of infinite network width.
GEVB matches the performance of BP in VNNs, and in some cases outperforms
direct feedback alignment (DFA) applied in conventional networks. Unlike DFA,
GEVB successfully trains convolutional layers. Altogether, our theoretical and
empirical results point to a surprisingly powerful role for a global learning
signal in training DNNs.
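The GEVB rule described above has a compact three-factor form: the change to a weight is proportional to the inner product of the globally broadcast error vector with the presynaptic unit's vector-valued activation, gated by whether the postsynaptic unit is active. The sketch below is a minimal NumPy illustration of that rule under assumed shapes and names (`gevb_update`, `pre_act`, `post_gate`, and the sign convention are illustrative, not taken from the paper's code); the nonnegativity constraint on VNN weights past the first layer is enforced here by a simple projection.
```python
import numpy as np

def gevb_update(W, pre_act, post_gate, error_vec, lr=1e-2):
    # One GEVB-style update for a single vectorized nonnegative layer.
    # Assumed (illustrative) shapes:
    #   W         : (n_post, n_pre)  nonnegative scalar weights
    #   pre_act   : (n_pre, k)       vector-valued presynaptic activations
    #   post_gate : (n_post,)        1.0 where the postsynaptic unit is active, else 0.0
    #   error_vec : (k,)             globally broadcast error vector
    # Factor 1: inner product of each presynaptic activation with the global error vector.
    pre_dot_err = pre_act @ error_vec         # shape (n_pre,)
    # Factor 2: gate by postsynaptic activity (three-factor Hebbian form).
    dW = np.outer(post_gate, pre_dot_err)     # shape (n_post, n_pre)
    # Descend (sign convention assumed); the paper proves the update is sign-matched to the gradient.
    W = W - lr * dW
    # Keep weights past the first layer nonnegative, as required in VNNs.
    return np.clip(W, 0.0, None)

# Example usage with random illustrative values.
rng = np.random.default_rng(0)
W = rng.random((5, 8))                        # nonnegative initialization
pre_act = rng.standard_normal((8, 10))
post_gate = (rng.random(5) > 0.5).astype(float)
error_vec = rng.standard_normal(10)
W = gevb_update(W, pre_act, post_gate, error_vec)
```
The key design point is that every layer receives the same error vector; no unit-specific feedback pathway is required, unlike BP or DFA.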
Related papers
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
- Forward Learning of Graph Neural Networks [17.79590285482424]
Backpropagation (BP) is the de facto standard for training deep neural networks (NNs).
BP imposes several constraints, which are not only biologically implausible, but also limit the scalability, parallelism, and flexibility in learning NNs.
We propose ForwardGNN, which avoids the constraints imposed by BP via an effective layer-wise local forward training.
arXiv Detail & Related papers (2024-03-16T19:40:35Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Learning Ability of Interpolating Deep Convolutional Neural Networks [28.437011792990347]
We study the learning ability of an important family of deep neural networks, deep convolutional neural networks (DCNNs).
We show that by adding well-defined layers to a non-interpolating DCNN, we can obtain some interpolating DCNNs that maintain the good learning rates of the non-interpolating DCNN.
Our work provides theoretical verification of how overfitted DCNNs generalize well.
arXiv Detail & Related papers (2022-10-25T17:22:31Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group [4.56877715768796]
We study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear.
We solve exactly the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space.
Our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks with modest depth.
arXiv Detail & Related papers (2020-12-07T20:08:31Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Extension of Direct Feedback Alignment to Convolutional and Recurrent Neural Network for Bio-plausible Deep Learning [0.0]
We focus on the improvement of the direct feedback alignment (DFA) algorithm.
We extend the usage of DFA to convolutional and recurrent neural networks (CNNs and RNNs).
We propose a new DFA algorithm for BP-level accurate CNN and RNN training (a minimal sketch of the standard DFA update appears after this list).
arXiv Detail & Related papers (2020-06-23T08:42:22Z)
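For contrast with the globally broadcast signal used by GEVB, the sketch below shows the standard direct feedback alignment (DFA) update for one fully connected layer: the output error is projected to each layer through a fixed random feedback matrix instead of the transposed forward weights. This is a minimal illustration of vanilla DFA, not the convolutional/recurrent extension proposed in the entry above; the names, shapes, and the ReLU assumption are illustrative.
```python
import numpy as np

def dfa_update(W, B, h_prev, a, error, lr=1e-2):
    # One standard DFA update for a single fully connected hidden layer.
    # Assumed (illustrative) shapes:
    #   W      : (n_out, n_in)  forward weights of this layer
    #   B      : (n_out, k)     fixed random feedback matrix (never trained)
    #   h_prev : (n_in,)        activations of the previous layer
    #   a      : (n_out,)       pre-activations of this layer (ReLU nonlinearity assumed)
    #   error  : (k,)           global output error, e.g. prediction minus target
    # Project the output error through the fixed random matrix instead of W^T (as BP would).
    delta = (B @ error) * (a > 0)          # ReLU derivative acts as the local gate
    dW = np.outer(delta, h_prev)
    return W - lr * dW
```
Unlike GEVB, DFA still requires a separate feedback matrix per layer; the abstract above notes that GEVB instead broadcasts a single error vector to every layer and, unlike DFA, also trains convolutional layers.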
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.