Related papers: Noise-based reward-modulated learning

URL: http://arxiv.org/abs/2503.23972v1
Date: Mon, 31 Mar 2025 11:35:23 GMT
Title: Noise-based reward-modulated learning
Authors: Jesús García Fernández, Nasir Ahmad, Marcel van Gerven
Abstract summary: We propose a noise-based, biologically inspired learning rule for low-power and real-time applications. Our approach combines directional derivative theory with Hebbian-like updates to enable efficient, gradient-free learning in reinforcement learning. Its formulation relies on local information alone, making it compatible with implementations in neuromorphic hardware.
Score: 2.3125457626961263
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in reinforcement learning (RL) have led to significant improvements in task performance. However, training neural networks in an RL regime is typically achieved in combination with backpropagation, limiting their applicability in resource-constrained environments or when using non-differentiable neural networks. While noise-based alternatives like reward-modulated Hebbian learning (RMHL) have been proposed, their performance has remained limited, especially in scenarios with delayed rewards, which require retrospective credit assignment over time. Here, we derive a novel noise-based learning rule that addresses these challenges. Our approach combines directional derivative theory with Hebbian-like updates to enable efficient, gradient-free learning in RL. It features stochastic noisy neurons which can approximate gradients, and produces local synaptic updates modulated by a global reward signal. Drawing on concepts from neuroscience, our method uses reward prediction error as its optimization target to generate increasingly advantageous behavior, and incorporates an eligibility trace to facilitate temporal credit assignment in environments with delayed rewards. Its formulation relies on local information alone, making it compatible with implementations in neuromorphic hardware. Experimental validation shows that our approach significantly outperforms RMHL and is competitive with BP-based baselines, highlighting the promise of noise-based, biologically inspired learning for low-power and real-time applications.
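Illustrative sketch (not the authors' rule): the abstract describes noisy neurons, local Hebbian-like updates gated by a global reward signal, and an eligibility trace for delayed rewards. The toy example below shows how those ingredients can fit together in a generic node-perturbation update on a linear layer. Everything in it is an assumption made for illustration: the regression-style task, the function name train, the hyperparameters, and the use of a noise-free pass as the reward prediction.

```python
import numpy as np


def train(episodes=3000, steps=3, n_in=3, n_out=2,
          sigma=0.1, lr=0.005, trace_decay=0.9, seed=0):
    """Toy node-perturbation learner with an eligibility trace.

    Hypothetical task: a linear layer must match a fixed target mapping,
    and the reward is only applied once, at the end of each episode.
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=(n_out, n_in))  # hypothetical target weights
    w = np.zeros((n_out, n_in))              # weights being learned
    initial_error = float(np.sum((w - w_true) ** 2))

    for _ in range(episodes):
        trace = np.zeros_like(w)  # eligibility trace, reset per episode
        r_noisy = 0.0             # reward obtained with noisy neurons
        r_clean = 0.0             # predicted reward from a noise-free pass

        for _ in range(steps):
            x = rng.normal(size=n_in)    # presynaptic input
            xi = rng.normal(size=n_out)  # noise injected into each neuron
            target = w_true @ x
            y_clean = w @ x
            y_noisy = y_clean + sigma * xi

            # Local quantity per synapse: postsynaptic noise times
            # presynaptic input, accumulated with exponential decay.
            trace = trace_decay * trace + np.outer(xi, x)

            r_noisy -= np.sum((y_noisy - target) ** 2)
            r_clean -= np.sum((y_clean - target) ** 2)

        # Global scalar signal: actual minus predicted reward.
        delta = r_noisy - r_clean

        # Reward-modulated update; no backpropagated gradient is used.
        w += lr * (delta / sigma) * trace

    final_error = float(np.sum((w - w_true) ** 2))
    return initial_error, final_error


if __name__ == "__main__":
    before, after = train()
    print(f"squared weight error: {before:.3f} -> {after:.3f}")
```

Each weight change here depends only on the presynaptic input, the noise injected into the postsynaptic neuron, and one scalar broadcast to all synapses, which is the locality property the abstract emphasizes. In expectation over the noise, the update follows the gradient of the episode reward, so the weight error should shrink over training.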

Related papers

Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling [1.0533738606966752]
Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. We introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments.
arXiv Detail & Related papers (2025-07-13T08:50:00Z)
Fast Training of Recurrent Neural Networks with Stationary State Feedbacks [48.22082789438538]
Recurrent neural networks (RNNs) have recently demonstrated strong performance and faster inference than Transformers. We propose a novel method that replaces BPTT with a fixed gradient feedback mechanism.
arXiv Detail & Related papers (2025-03-29T14:45:52Z)
Randomized Forward Mode Gradient for Spiking Neural Networks in Scientific Machine Learning [4.178826560825283]
Spiking neural networks (SNNs) represent a promising approach in machine learning, combining the hierarchical learning capabilities of deep neural networks with the energy efficiency of spike-based computations. Traditional end-to-end training of SNNs is often based on back-propagation, where weight updates are derived from gradients computed through the chain rule. This method encounters challenges due to its limited biological plausibility and inefficiencies on neuromorphic hardware. In this study, we introduce an alternative training approach for SNNs. Instead of using back-propagation, we leverage weight perturbation methods within a forward-mode
arXiv Detail & Related papers (2024-11-11T15:20:54Z)
Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications. We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time. Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
Gradient-Free Training of Recurrent Neural Networks using Random Perturbations [1.1742364055094265]
Recurrent neural networks (RNNs) hold immense potential for computations due to their Turing completeness and sequential processing capabilities. Backpropagation through time (BPTT), the prevailing method, extends the backpropagation algorithm by unrolling the RNN over time. BPTT suffers from significant drawbacks, including the need to interleave forward and backward phases and store exact gradient information. We present a new approach to perturbation-based learning in RNNs whose performance is competitive with BPTT.
arXiv Detail & Related papers (2024-05-14T21:15:29Z)
Real-Time Recurrent Reinforcement Learning [7.737685867200335]
We introduce a biologically plausible RL framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) A Meta-RL architecture, resembling the mammalian basal ganglia; (2) A biologically plausible reinforcement learning algorithm, exploiting temporal difference learning and eligibility traces to train the policy and the value-function; and (3) An online automatic differentiation algorithm for computing the gradients with respect to parameters of a shared recurrent network backbone.
arXiv Detail & Related papers (2023-11-08T16:56:16Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task. Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structures. This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain. We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals. We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. We show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z)
Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
Imitation with Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning. We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing MPA systems. We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network. Our results show that contrastive-based methods are able to match and exceed SoTA performance for MPA regression tasks.
arXiv Detail & Related papers (2021-08-03T19:24:25Z)
Analytically Tractable Bayesian Deep Q-Learning [0.0]
We adapt the temporal difference Q-learning framework to make it compatible with tractable approximate Gaussian inference (TAGI). We demonstrate that TAGI can reach a performance comparable to backpropagation-trained networks.
arXiv Detail & Related papers (2021-06-21T13:11:52Z)
Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation. This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.