Credit Assignment Through Broadcasting a Global Error Vector
- URL: http://arxiv.org/abs/2106.04089v2
- Date: Thu, 28 Oct 2021 20:31:37 GMT
- Title: Credit Assignment Through Broadcasting a Global Error Vector
- Authors: David G. Clark, L. F. Abbott, SueYeon Chung
- Abstract summary: Backpropagation (BP) uses detailed, unit-specific feedback to train deep neural networks (DNNs) with remarkable success.
Here, we explore the extent to which a globally broadcast learning signal, coupled with local weight updates, enables training of DNNs.
- Score: 4.683806391173103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backpropagation (BP) uses detailed, unit-specific feedback to train deep
neural networks (DNNs) with remarkable success. That biological neural circuits
appear to perform credit assignment, but cannot implement BP, implies the
existence of other powerful learning algorithms. Here, we explore the extent to
which a globally broadcast learning signal, coupled with local weight updates,
enables training of DNNs. We present both a learning rule, called global
error-vector broadcasting (GEVB), and a class of DNNs, called vectorized
nonnegative networks (VNNs), in which this learning rule operates. VNNs have
vector-valued units and nonnegative weights past the first layer. The GEVB
learning rule generalizes three-factor Hebbian learning, updating each weight
by an amount proportional to the inner product of the presynaptic activation
and a globally broadcast error vector when the postsynaptic unit is active. We
prove that these weight updates are matched in sign to the gradient, enabling
accurate credit assignment. Moreover, at initialization, these updates are
exactly proportional to the gradient in the limit of infinite network width.
GEVB matches the performance of BP in VNNs, and in some cases outperforms
direct feedback alignment (DFA) applied in conventional networks. Unlike DFA,
GEVB successfully trains convolutional layers. Altogether, our theoretical and
empirical results point to a surprisingly powerful role for a global learning
signal in training DNNs.
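The GEVB rule described above has a compact three-factor form: the change to a weight is proportional to the inner product of the globally broadcast error vector with the presynaptic unit's vector-valued activation, gated by whether the postsynaptic unit is active. The sketch below is a minimal NumPy illustration of that rule under assumed shapes and names (`gevb_update`, `pre_act`, `post_gate`, and the sign convention are illustrative, not taken from the paper's code); the nonnegativity constraint on VNN weights past the first layer is enforced here by a simple projection.
```python
import numpy as np

def gevb_update(W, pre_act, post_gate, error_vec, lr=1e-2):
    # One GEVB-style update for a single vectorized nonnegative layer.
    # Assumed (illustrative) shapes:
    #   W         : (n_post, n_pre)  nonnegative scalar weights
    #   pre_act   : (n_pre, k)       vector-valued presynaptic activations
    #   post_gate : (n_post,)        1.0 where the postsynaptic unit is active, else 0.0
    #   error_vec : (k,)             globally broadcast error vector
    # Factor 1: inner product of each presynaptic activation with the global error vector.
    pre_dot_err = pre_act @ error_vec         # shape (n_pre,)
    # Factor 2: gate by postsynaptic activity (three-factor Hebbian form).
    dW = np.outer(post_gate, pre_dot_err)     # shape (n_post, n_pre)
    # Descend (sign convention assumed); the paper proves the update is sign-matched to the gradient.
    W = W - lr * dW
    # Keep weights past the first layer nonnegative, as required in VNNs.
    return np.clip(W, 0.0, None)

# Example usage with random illustrative values.
rng = np.random.default_rng(0)
W = rng.random((5, 8))                        # nonnegative initialization
pre_act = rng.standard_normal((8, 10))
post_gate = (rng.random(5) > 0.5).astype(float)
error_vec = rng.standard_normal(10)
W = gevb_update(W, pre_act, post_gate, error_vec)
```
The key design point is that every layer receives the same error vector; no unit-specific feedback pathway is required, unlike BP or DFA.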
Related papers
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
- Forward Learning of Graph Neural Networks [17.79590285482424]
Backpropagation (BP) is the de facto standard for training deep neural networks (NNs).
BP imposes several constraints, which are not only biologically implausible, but also limit the scalability, parallelism, and flexibility in learning NNs.
We propose ForwardGNN, which avoids the constraints imposed by BP via an effective layer-wise local forward training.
arXiv Detail & Related papers (2024-03-16T19:40:35Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Learning Ability of Interpolating Deep Convolutional Neural Networks [28.437011792990347]
We study the learning ability of an important family of deep neural networks, deep convolutional neural networks (DCNNs).
We show that by adding well-defined layers to a non-interpolating DCNN, we can obtain some interpolating DCNNs that maintain the good learning rates of the non-interpolating DCNN.
Our work provides theoretical verification of how overfitted DCNNs generalize well.
arXiv Detail & Related papers (2022-10-25T17:22:31Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group [4.56877715768796]
We study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear.
We solve exactly the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space.
Our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks with modest depth.
arXiv Detail & Related papers (2020-12-07T20:08:31Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Extension of Direct Feedback Alignment to Convolutional and Recurrent Neural Network for Bio-plausible Deep Learning [0.0]
We focus on the improvement of the direct feedback alignment (DFA) algorithm.
We extend the usage of DFA to convolutional and recurrent neural networks (CNNs and RNNs).
We propose a new DFA algorithm for BP-level accurate CNN and RNN training (a minimal sketch of the standard DFA update appears after this list).
arXiv Detail & Related papers (2020-06-23T08:42:22Z)
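For contrast with the globally broadcast signal used by GEVB, the sketch below shows the standard direct feedback alignment (DFA) update for one fully connected layer: the output error is projected to each layer through a fixed random feedback matrix instead of the transposed forward weights. This is a minimal illustration of vanilla DFA, not the convolutional/recurrent extension proposed in the entry above; the names, shapes, and the ReLU assumption are illustrative.
```python
import numpy as np

def dfa_update(W, B, h_prev, a, error, lr=1e-2):
    # One standard DFA update for a single fully connected hidden layer.
    # Assumed (illustrative) shapes:
    #   W      : (n_out, n_in)  forward weights of this layer
    #   B      : (n_out, k)     fixed random feedback matrix (never trained)
    #   h_prev : (n_in,)        activations of the previous layer
    #   a      : (n_out,)       pre-activations of this layer (ReLU nonlinearity assumed)
    #   error  : (k,)           global output error, e.g. prediction minus target
    # Project the output error through the fixed random matrix instead of W^T (as BP would).
    delta = (B @ error) * (a > 0)          # ReLU derivative acts as the local gate
    dW = np.outer(delta, h_prev)
    return W - lr * dW
```
Unlike GEVB, DFA still requires a separate feedback matrix per layer; the abstract above notes that GEVB instead broadcasts a single error vector to every layer and, unlike DFA, also trains convolutional layers.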
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.