Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling
- URL: http://arxiv.org/abs/2507.09540v1
- Date: Sun, 13 Jul 2025 08:50:00 GMT
- Title: Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling
- Authors: Ali Safa, Farida Mohsen, Ali Al-Zawqari
- Abstract summary: Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. We introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments.
- Score: 1.0533738606966752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. However, their training presents several challenges, particularly for reinforcement learning (RL) tasks, due to the non-differentiable nature of spike-based communication. In this work, we introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings (MH) sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments without relying on gradient-based methods. Our approach iteratively proposes and probabilistically accepts network parameter updates based on accumulated reward signals, effectively circumventing the limitations of backpropagation while enabling direct optimization on neuromorphic platforms. We evaluated this framework on two standard control benchmarks: AcroBot and CartPole. The results demonstrate that our MH-based approach outperforms conventional Deep Q-Learning (DQL) baselines and prior SNN-based RL approaches in terms of maximizing the accumulated reward while minimizing network resources and training episodes.
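The propose-and-accept loop described in the abstract can be sketched as a standard Metropolis-Hastings random walk over the SNN's weights, scored by accumulated episode reward. The sketch below is a minimal illustration, not the authors' implementation: `episode_return` is a toy stand-in for running the SNN policy in an RL environment, and the proposal scale, temperature, and iteration count are assumed hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def episode_return(params):
    """Placeholder for rolling out the SNN policy with weights `params`
    and returning the accumulated reward. A toy quadratic objective
    (maximized at params == 1.0) keeps the sketch runnable."""
    return -np.sum((params - 1.0) ** 2)

def mh_train(n_params=8, n_iters=500, step=0.1, temperature=0.1):
    theta = rng.normal(size=n_params)   # current SNN weights
    r = episode_return(theta)           # current accumulated reward
    for _ in range(n_iters):
        # Symmetric Gaussian proposal around the current weights.
        proposal = theta + step * rng.normal(size=n_params)
        r_new = episode_return(proposal)
        # Accept with probability min(1, exp((r_new - r) / T)):
        # higher-reward proposals are always kept, worse ones occasionally,
        # so the chain explores without needing gradients.
        if np.log(rng.random()) < (r_new - r) / temperature:
            theta, r = proposal, r_new
    return theta, r
```

Because acceptance depends only on scalar reward differences, the loop never backpropagates through spike trains, which is what lets this style of training sidestep the non-differentiability of spike-based communication.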
Related papers
- SpikeRL: A Scalable and Energy-efficient Framework for Deep Spiking Reinforcement Learning [1.6999370482438731]
SpikeRL is a scalable and energy-efficient framework for DeepRL-based SNNs for continuous control. Our new SpikeRL implementation is 4.26X faster and 2.25X more energy efficient than state-of-the-art DeepRL-SNN methods.
arXiv Detail & Related papers (2025-02-21T05:28:42Z)
- Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning [69.00997996453842]
We propose a deep Reinforcement Learning approach to learn a joint Admission Control and Resource Allocation policy for virtual network embedding.
We show that HRL-ACRA outperforms state-of-the-art baselines in terms of both the acceptance ratio and long-term average revenue.
arXiv Detail & Related papers (2024-06-25T07:42:30Z)
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structure.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
- POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems [18.369115196505657]
We present POLAR-Express, an efficient and precise formal reachability analysis tool for verifying the safety of neural-network controlled systems (NNCSs).
POLAR-Express uses Taylor model arithmetic to propagate Taylor models across a neural network layer-by-layer to compute an overapproximation of the neural-network function.
We also present a novel approach to propagate TMs more efficiently and precisely across ReLU activation functions.
arXiv Detail & Related papers (2023-03-31T06:51:36Z)
- An Unsupervised STDP-based Spiking Neural Network Inspired By Biologically Plausible Learning Rules and Connections [10.188771327458651]
Spike-timing-dependent plasticity (STDP) is a general learning rule in the brain, but spiking neural networks (SNNs) trained with STDP alone are inefficient and perform poorly.
We design an adaptive synaptic filter and introduce the adaptive spiking threshold to enrich the representation ability of SNNs.
Our model achieves the current state-of-the-art performance of unsupervised STDP-based SNNs in the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-07-06T14:53:32Z)
- Spiking Generative Adversarial Networks With a Neural Network Discriminator: Local Training, Bayesian Models, and Continual Meta-Learning [31.78005607111787]
Training neural networks to reproduce spiking patterns is a central problem in neuromorphic computing.
This work proposes to train SNNs so as to match distributions of spiking signals rather than individual spiking signals.
arXiv Detail & Related papers (2021-11-02T17:20:54Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
- Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle [12.137685936113384]
Building spiking neural networks (SNNs) based on biological synaptic plasticities holds a promising potential for accomplishing fast and energy-efficient computing.
In this paper, we introduce both indirect and direct end-to-end training methods of SNNs for a lane-keeping vehicle.
arXiv Detail & Related papers (2020-03-10T09:35:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.