Related papers: Decision SpikeFormer: Spike-Driven Transformer for Decision Making

URL: http://arxiv.org/abs/2504.03800v1
Date: Fri, 04 Apr 2025 07:42:36 GMT
Title: Decision SpikeFormer: Spike-Driven Transformer for Decision Making
Authors: Wei Huang, Qinying Gu, Nanyang Ye
Abstract summary: Offline reinforcement learning (RL) enables policy training solely on pre-collected data, avoiding direct environment interaction. We introduce DSFormer, the first spike-driven transformer model designed to tackle offline RL via sequence modeling. Comprehensive results in the D4RL benchmark show DSFormer's superiority over both SNN and ANN counterparts, achieving 78.4% energy savings.
Score: 11.652964678824382
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Offline reinforcement learning (RL) enables policy training solely on pre-collected data, avoiding direct environment interaction - a crucial benefit for energy-constrained embodied AI applications. Although Artificial Neural Networks (ANN)-based methods perform well in offline RL, their high computational and energy demands motivate exploration of more efficient alternatives. Spiking Neural Networks (SNNs) show promise for such tasks, given their low power consumption. In this work, we introduce DSFormer, the first spike-driven transformer model designed to tackle offline RL via sequence modeling. Unlike existing SNN transformers focused on spatial dimensions for vision tasks, we develop Temporal Spiking Self-Attention (TSSA) and Positional Spiking Self-Attention (PSSA) in DSFormer to capture the temporal and positional dependencies essential for sequence modeling in RL. Additionally, we propose Progressive Threshold-dependent Batch Normalization (PTBN), which combines the benefits of LayerNorm and BatchNorm to preserve temporal dependencies while maintaining the spiking nature of SNNs. Comprehensive results in the D4RL benchmark show DSFormer's superiority over both SNN and ANN counterparts, achieving 78.4% energy savings, highlighting DSFormer's advantages not only in energy efficiency but also in competitive performance. Code and models are public at https://wei-nijuan.github.io/DecisionSpikeFormer.

Related papers

Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling [1.0533738606966752]
Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. We introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments.
arXiv Detail & Related papers (2025-07-13T08:50:00Z)
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control [26.105497272647977]
Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision making through neuromorphic hardware. Recent studies overlook whether Reinforcement Learning (RL) algorithms are suitable for SNNs. We propose a novel proxy target framework to bridge the gap between discrete SNN and continuous control.
arXiv Detail & Related papers (2025-05-30T03:08:03Z)
SpikeRL: A Scalable and Energy-efficient Framework for Deep Spiking Reinforcement Learning [1.6999370482438731]
SpikeRL is a scalable and energy-efficient framework for DeepRL-based SNNs for continuous control. Our new SpikeRL implementation is 4.26X faster and 2.25X more energy efficient than state-of-the-art DeepRL-SNN methods.
arXiv Detail & Related papers (2025-02-21T05:28:42Z)
DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing. Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time. We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving [63.155562267383864]
Deep reinforcement learning (DRL) has shown remarkable success in complex autonomous driving scenarios. DRL models inevitably bring high memory consumption and computation, which hinders their wide deployment in resource-limited autonomous driving devices. We introduce a novel dynamic structured pruning approach that gradually removes a DRL model's unimportant neurons during the training stage.
arXiv Detail & Related papers (2024-02-07T09:00:30Z)
Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control. Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer. To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors. Our work is the first attempt to optimize BNNs from the bilinear perspective. We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware. It is a challenge to efficiently train SNNs due to their non-differentiability. We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
A Spiking Neural Network Structure Implementing Reinforcement Learning [0.0]
In the present paper, I describe an SNN structure which, seemingly, can be used in a wide range of reinforcement learning tasks. The SNN structure considered in the paper includes spiking neurons described by a generalization of the LIFAT (leaky integrate-and-fire neuron with adaptive threshold) model. My concept is based on very general assumptions about RL task characteristics and has no visible limitations on its applicability.
arXiv Detail & Related papers (2022-04-09T09:08:10Z)
Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption. Combining SNNs with deep reinforcement learning (RL) provides a promising energy-efficient approach to realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks [16.12495409295754]
Next Generation (NextG) networks are expected to support demanding internet tactile applications such as augmented reality and connected autonomous vehicles. Data-driven approaches can improve the ability of the network to adapt to the current operating conditions. Deep RL (DRL) has been shown to achieve good performance even in complex environments.
arXiv Detail & Related papers (2021-12-07T03:13:20Z)
Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control [0.0]
We propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL). We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control. Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL as an alternative to deep learning when both energy efficiency and robustness are important.
arXiv Detail & Related papers (2020-10-19T16:20:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.