Layer-wise Feedback Propagation
- URL: http://arxiv.org/abs/2308.12053v1
- Date: Wed, 23 Aug 2023 10:48:28 GMT
- Title: Layer-wise Feedback Propagation
- Authors: Leander Weber, Jim Berend, Alexander Binder, Thomas Wiegand, Wojciech
Samek, Sebastian Lapuschkin
- Abstract summary: We present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors.
LFP assigns rewards to individual connections based on their respective contributions to solving a given task.
We demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets.
- Score: 53.00944147633484
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we present Layer-wise Feedback Propagation (LFP), a novel
training approach for neural-network-like predictors that utilizes
explainability, specifically Layer-wise Relevance Propagation (LRP), to assign
rewards to individual connections based on their respective contributions to
solving a given task. This differs from traditional gradient descent, which
updates parameters towards an estimated loss minimum. LFP distributes a reward
signal throughout the model without the need for gradient computations. It then
strengthens structures that receive positive feedback while reducing the
influence of structures that receive negative feedback. We establish the
convergence of LFP theoretically and empirically, and demonstrate its
effectiveness in achieving comparable performance to gradient descent on
various models and datasets. Notably, LFP overcomes certain limitations
associated with gradient-based methods, such as reliance on meaningful
derivatives. We further investigate how the different LRP-rules can be extended
to LFP, what their effects are on training, as well as potential applications,
such as training models with no meaningful derivatives, e.g., step-function
activated Spiking Neural Networks (SNNs), or for transfer learning, to
efficiently utilize existing knowledge.
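The abstract describes LFP as redistributing a reward signal through the network via LRP-style relevance scores rather than gradients. As a rough illustration (not the authors' code), the following sketch shows the well-known epsilon-rule of LRP for a single dense layer, which computes the kind of per-connection contribution scores that LFP repurposes as reward; the layer sizes and values are arbitrary:

```python
import numpy as np

def lrp_epsilon(a, W, R_out, eps=1e-6):
    """Redistribute output relevance R_out to the inputs of a dense layer.

    a     : (n_in,)        input activations
    W     : (n_in, n_out)  weights
    R_out : (n_out,)       relevance of each output neuron
    """
    z = a @ W                           # pre-activations, shape (n_out,)
    s = R_out / (z + eps * np.sign(z))  # stabilized per-output relevance ratio
    return a * (W @ s)                  # relevance of each input, shape (n_in,)

a = np.array([1.0, 2.0, 0.5])
W = np.array([[0.2, -0.1],
              [0.4,  0.3],
              [-0.6, 0.1]])
R_out = np.array([1.0, 0.5])
R_in = lrp_epsilon(a, W, R_out)
# Relevance is (approximately) conserved: R_in.sum() is close to R_out.sum()
print(R_in, R_in.sum())
```

Applying this rule layer by layer propagates relevance from the output back to every connection without any gradient computation, which is what makes the approach applicable to models with non-differentiable components such as step-activated SNNs.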
Related papers
- On Divergence Measures for Training GFlowNets [3.7277730514654555]
Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distributions over composable objects.
Traditionally, the training procedure for GFlowNets seeks to minimize the expected log-squared difference between a proposal (forward policy) and a target (backward policy) distribution.
We review four divergence measures, namely Rényi-$\alpha$, Tsallis-$\alpha$, and the reverse and forward KL divergences, and design statistically efficient estimators for their gradients in the context of training GFlowNets.
arXiv Detail & Related papers (2024-10-12T03:46:52Z) - Learning Point Spread Function Invertibility Assessment for Image Deconvolution [14.062542012968313]
We propose a metric that employs a non-linear approach to learn the invertibility of an arbitrary PSF using a neural network.
A lower discrepancy between the mapped PSF and a unit impulse indicates a higher likelihood of successful inversion by a DL network.
arXiv Detail & Related papers (2024-05-25T20:00:27Z) - Random Linear Projections Loss for Hyperplane-Based Optimization in Neural Networks [22.348887008547653]
This work introduces Random Linear Projections (RLP) loss, a novel approach that enhances training efficiency by leveraging geometric relationships within the data.
Our empirical evaluations, conducted across benchmark datasets and synthetic examples, demonstrate that neural networks trained with RLP loss outperform those trained with traditional loss functions.
arXiv Detail & Related papers (2023-11-21T05:22:39Z) - Regression as Classification: Influence of Task Formulation on Neural
Network Features [16.239708754973865]
Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss.
However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross-entropy loss results in better performance.
By focusing on two-layer ReLU networks, we explore how the implicit bias induced by gradient-based optimization could partly explain the phenomenon.
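The regression-as-classification reformulation studied here can be made concrete with a small sketch: discretize the target range into bins, train on one-hot bin labels with cross-entropy, and decode a prediction as the expectation over bin centers. The bin count and target range below are arbitrary choices for illustration, not taken from the paper:

```python
import numpy as np

def to_class_target(y, n_bins=10, lo=0.0, hi=1.0):
    """Map continuous targets in [lo, hi] to one-hot bin labels."""
    idx = np.clip(((y - lo) / (hi - lo) * n_bins).astype(int), 0, n_bins - 1)
    one_hot = np.zeros((len(y), n_bins))
    one_hot[np.arange(len(y)), idx] = 1.0
    return one_hot

def from_class_probs(p, lo=0.0, hi=1.0):
    """Decode predicted bin probabilities as the expectation over bin centers."""
    n_bins = p.shape[1]
    centers = lo + (np.arange(n_bins) + 0.5) * (hi - lo) / n_bins
    return p @ centers

y = np.array([0.03, 0.42, 0.97])
labels = to_class_target(y)
print(from_class_probs(labels))  # prints [0.05 0.45 0.95]
```

The decoded values recover the targets up to bin-width resolution; the paper's contribution is analyzing why training through this detour can induce a more favorable implicit bias than the square loss.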
arXiv Detail & Related papers (2022-11-10T15:13:23Z) - Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals.
We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.
We show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z) - Towards Scaling Difference Target Propagation by Learning Backprop
Targets [64.90165892557776]
Difference Target Propagation is a biologically-plausible learning algorithm with close relation with Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2022-01-31T18:20:43Z) - Random Fourier Feature Based Deep Learning for Wireless Communications [18.534006003020828]
This paper analytically quantifies the viability of RFF-based deep learning.
A new distribution-dependent RFF is proposed to facilitate DL architectures with low training-complexity.
In all the presented simulations, it is observed that the proposed distribution-dependent RFFs significantly outperform standard RFFs.
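For context, the baseline this entry builds on is the standard Random Fourier Feature map $z(x) = \sqrt{2/D}\,\cos(Wx + b)$ with $W$ drawn from the Gaussian kernel's spectral distribution. The sketch below shows only this well-known baseline, not the paper's distribution-dependent variant; the dimensions and the kernel width are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, D=500, gamma=1.0):
    """Map X (n, d) to D random Fourier features approximating the
    RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))  # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=D)                  # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(5, 3))
Z = rff_features(X)
K_approx = Z @ Z.T  # inner products of features approximate the kernel
K_exact = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))
print(np.abs(K_approx - K_exact).max())  # small approximation error
```

The approximation error shrinks as $O(1/\sqrt{D})$; the paper's proposal replaces the fixed Gaussian sampling of $W$ with a distribution adapted to the data to reduce training complexity.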
arXiv Detail & Related papers (2021-01-13T18:39:36Z) - Implicit Under-Parameterization Inhibits Data-Efficient Deep
Reinforcement Learning [97.28695683236981]
More gradient updates decrease the expressivity of the current value network.
We demonstrate this phenomenon on Atari and Gym benchmarks, in both offline and online RL settings.
arXiv Detail & Related papers (2020-10-27T17:55:16Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret RNN-based DLKT model.
Experiment results show the feasibility of using the LRP method for interpreting the DLKT model's predictions.
arXiv Detail & Related papers (2020-05-13T04:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.