Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer
- URL: http://arxiv.org/abs/2508.07710v1
- Date: Mon, 11 Aug 2025 07:38:32 GMT
- Title: Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer
- Authors: Jingya Wang, Xin Deng, Wenjie Wei, Dehao Zhang, Shuai Wang, Qian Sun, Jieyuan Zhang, Hanwen Liu, Ning Xie, Malu Zhang
- Abstract summary: Spiking Neural Networks (SNNs) offer a promising approach for constructing energy-efficient Transformer architectures. We propose a high-performance and training-free ANN-to-SNN conversion framework tailored for Transformer architectures. Our method achieves near-lossless conversion accuracy with significantly lower latency.
- Score: 25.25113597897599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging the event-driven paradigm, Spiking Neural Networks (SNNs) offer a promising approach for constructing energy-efficient Transformer architectures. Compared to directly trained Spiking Transformers, ANN-to-SNN conversion methods bypass the high training costs. However, existing methods still suffer from notable limitations, failing to effectively handle nonlinear operations in Transformer architectures and requiring additional fine-tuning processes for pre-trained ANNs. To address these issues, we propose a high-performance and training-free ANN-to-SNN conversion framework tailored for Transformer architectures. Specifically, we introduce a Multi-basis Exponential Decay (MBE) neuron, which employs an exponential decay strategy and multi-basis encoding method to efficiently approximate various nonlinear operations. It removes the requirement for weight modifications in pre-trained ANNs. Extensive experiments across diverse tasks (CV, NLU, NLG) and mainstream Transformer architectures (ViT, RoBERTa, GPT-2) demonstrate that our method achieves near-lossless conversion accuracy with significantly lower latency. This provides a promising pathway for the efficient and scalable deployment of Spiking Transformers in real-world applications.
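The abstract does not give implementation details for the Multi-basis Exponential Decay (MBE) neuron, but the general idea of encoding an analog activation as a spike train with exponentially decaying weights can be illustrated with a minimal, hypothetical single-basis sketch (all function names and parameters below are assumptions, not the paper's API):

```python
import numpy as np

def mbe_encode(x, num_steps=8, base=2.0, v_max=1.0):
    """Greedily encode a value in [0, v_max] as a binary spike train
    whose per-step weights decay exponentially with the chosen base:
        x ~= sum_t s_t * v_max * base**-(t+1)
    The paper's multi-basis scheme presumably combines several bases;
    this sketch uses a single base for clarity."""
    residual = np.asarray(np.clip(x, 0.0, v_max), dtype=float).copy()
    spikes = []
    weights = []
    for t in range(num_steps):
        w = v_max * base ** -(t + 1)          # exponentially decaying weight
        s = (residual >= w).astype(float)     # fire only if the residual covers it
        residual -= s * w                     # subtract the emitted contribution
        spikes.append(s)
        weights.append(w)
    return np.stack(spikes), np.array(weights)

def mbe_decode(spikes, weights):
    """Reconstruct the encoded value from the weighted spike train."""
    return np.tensordot(weights, spikes, axes=(0, 0))

# With base 2, the reconstruction error after T steps is below 2**-T:
spikes, weights = mbe_encode(0.7, num_steps=8)
x_hat = mbe_decode(spikes, weights)
print(x_hat)  # close to 0.7, within 2**-8
```

Such a decaying-weight code reaches a given precision in logarithmically many timesteps, which is consistent with the low-latency, weight-modification-free conversion the abstract claims; how the actual MBE neuron approximates Transformer nonlinearities (softmax, GELU, LayerNorm) is described only in the full paper.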
Related papers
- BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNNs). We propose BHViT, a binarization-friendly hybrid ViT architecture and its full binarization model, guided by three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
- Differential Coding for Training-Free ANN-to-SNN Conversion [45.70141988713627]
Spiking Neural Networks (SNNs) exhibit significant potential due to their low energy consumption. Converting Artificial Neural Networks (ANNs) to SNNs is an efficient way to achieve high-performance SNNs. This article introduces differential coding for ANN-to-SNN conversion, a novel coding scheme that reduces spike counts and energy consumption.
arXiv Detail & Related papers (2025-03-01T02:17:35Z)
- Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer [15.93436166506258]
Spiking Neural Networks have emerged as a promising energy-efficient alternative to traditional Artificial Neural Networks. This paper introduces Accurate Addition-Only Spiking Self-Attention (A$^2$OS$^2$A).
arXiv Detail & Related papers (2025-02-28T22:23:29Z)
- Towards High-performance Spiking Transformers from ANN to SNN Conversion [43.53538629484375]
Spiking neural networks (SNNs) show great potential due to their energy efficiency, fast processing capabilities, and robustness. Current conversion methods mainly focus on converting convolutional neural networks (CNNs) to SNNs. In this paper, we propose an Expectation Compensation Module to preserve the accuracy of the conversion.
arXiv Detail & Related papers (2025-02-28T16:12:37Z)
- Faster and Stronger: When ANN-SNN Conversion Meets Parallel Spiking Calculation [45.58270096818676]
The Spiking Neural Network (SNN), as a brain-inspired and energy-efficient network, faces the pivotal challenge of finding a suitable learning framework. We propose a novel parallel conversion learning framework, which establishes a mathematical mapping between the time-steps of the parallel spiking neurons. Experiments confirm the significant performance advantages of our method across various conversion cases under ultra-low time latency.
arXiv Detail & Related papers (2024-12-18T08:37:13Z)
- AutoST: Training-free Neural Architecture Search for Spiking Transformers [14.791412391584064]
Spiking Transformers achieve both the energy efficiency of Spiking Neural Networks (SNNs) and the high capacity of Transformers.
Existing Spiking Transformer architectures exhibit a notable architectural gap, resulting in suboptimal performance.
We introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.
arXiv Detail & Related papers (2023-07-01T10:19:52Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.