Binary Event-Driven Spiking Transformer
- URL: http://arxiv.org/abs/2501.05904v1
- Date: Fri, 10 Jan 2025 12:00:11 GMT
- Title: Binary Event-Driven Spiking Transformer
- Authors: Honglin Cao, Zijian Zhou, Wenjie Wei, Ammar Belatreche, Yu Liang, Dehao Zhang, Malu Zhang, Yang Yang, Haizhou Li
- Abstract summary: Transformer-based Spiking Neural Networks (SNNs) introduce a novel event-driven self-attention paradigm.
We propose the Binary Event-Driven Spiking Transformer, i.e., BESTformer, which represents weights and attention maps with a mere 1-bit.
Because binarization limits representation capability, BESTformer suffers a severe performance drop relative to its full-precision counterpart; a Coupled Information Enhancement (CIE) method is proposed to mitigate this degradation.
- Score: 36.815359983551986
- Abstract: Transformer-based Spiking Neural Networks (SNNs) introduce a novel event-driven self-attention paradigm that combines the high performance of Transformers with the energy efficiency of SNNs. However, the larger model size and increased computational demands of the Transformer structure limit their practicality in resource-constrained scenarios. In this paper, we integrate binarization techniques into Transformer-based SNNs and propose the Binary Event-Driven Spiking Transformer, i.e. BESTformer. The proposed BESTformer can significantly reduce storage and computational demands by representing weights and attention maps with a mere 1-bit. However, BESTformer suffers from a severe performance drop from its full-precision counterpart due to the limited representation capability of binarization. To address this issue, we propose a Coupled Information Enhancement (CIE) method, which consists of a reversible framework and information enhancement distillation. By maximizing the mutual information between the binary model and its full-precision counterpart, the CIE method effectively mitigates the performance degradation of the BESTformer. Extensive experiments on static and neuromorphic datasets demonstrate that our method achieves superior performance to other binary SNNs, showcasing its potential as a compact yet high-performance model for resource-limited edge devices.
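The paper ships no code here, but the core idea of representing weights with a single bit is easy to illustrate. Below is a minimal NumPy sketch (function names are ours, not the authors') of sign-based weight binarization with the closed-form per-tensor scale used by classic binary networks such as XNOR-Net; BESTformer's actual scheme, which also binarizes attention maps, may differ.

```python
import numpy as np

def binarize(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Approximate a real-valued tensor w as alpha * sign(w).

    alpha = mean(|w|) is the closed-form scale that minimizes
    ||w - alpha * b||^2 over b in {-1, +1}^n (as in XNOR-Net).
    """
    b = np.sign(w)
    b[b == 0] = 1.0           # map zeros to +1 so b is strictly binary
    alpha = np.abs(w).mean()  # optimal per-tensor scaling factor
    return b, alpha

# Example: binarize a small weight matrix and check the reconstruction.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
b, alpha = binarize(w)
w_hat = alpha * b             # 1-bit approximation used at inference
print("reconstruction error:", np.abs(w - w_hat).mean())
```

Storing the binary tensor (1 bit per weight) plus one scale in place of 32-bit floats is where the storage reduction comes from, and multiplications against ±1 reduce to additions and subtractions.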
Related papers
- Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks [44.145870290310356]
Spiking Neural Networks have attracted significant attention in recent years due to their distinctive low-power characteristics.
Transformer models, known for their powerful self-attention mechanisms and parallel processing capabilities, have demonstrated exceptional performance across various domains.
Despite the significant advantages of both SNNs and Transformers, directly combining the low-power benefits of SNNs with the high performance of Transformers remains challenging.
arXiv Detail & Related papers (2024-12-18T07:07:38Z)
- Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition [10.302458835329539]
We introduce a new method, namely Transformer Re-parameterization, to boost the performance of lightweight Transformer models.
Experimental results show that our proposed method consistently improves the performance of lightweight Transformers, even making them comparable to large models.
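The summary does not spell out the exact re-parameterization scheme. As a generic illustration of the idea, structural re-parameterization trains an over-parameterized topology and algebraically folds it into a smaller one for inference; the hypothetical sketch below merges two parallel linear branches into a single equivalent layer.

```python
import numpy as np

def merge_parallel_linear(w1, b1, w2, b2):
    """Fold y = (w1 @ x + b1) + (w2 @ x + b2) into one linear layer.

    Because both branches are linear, their sum is exactly
    (w1 + w2) @ x + (b1 + b2): merging loses no accuracy.
    """
    return w1 + w2, b1 + b2

rng = np.random.default_rng(1)
x = rng.normal(size=3)
w1, b1 = rng.normal(size=(2, 3)), rng.normal(size=2)
w2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)

w, b = merge_parallel_linear(w1, b1, w2, b2)
y_branches = (w1 @ x + b1) + (w2 @ x + b2)  # training-time topology
y_merged = w @ x + b                        # deployed single layer
assert np.allclose(y_branches, y_merged)
```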
arXiv Detail & Related papers (2024-11-14T10:36:19Z)
- ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks [2.9699290794642366]
ARTEMIS is a mixed analog-stochastic in-DRAM accelerator for transformer models.
Our analysis indicates that ARTEMIS exhibits at least 3.0x speedup, 1.8x lower energy, and 1.9x better energy efficiency compared to GPU, TPU, CPU, and state-of-the-art PIM transformer hardware accelerators.
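The accelerator internals are out of scope here, but the stochastic-computing arithmetic that mixed analog-stochastic designs rely on is simple to demonstrate: a value in [0, 1] becomes the probability of a 1 in a random bitstream, and multiplication reduces to a bitwise AND of independent streams. A small sketch (ours, not ARTEMIS code):

```python
import numpy as np

def to_bitstream(p: float, n: int, rng) -> np.ndarray:
    """Encode a value p in [0, 1] as a random bitstream of length n."""
    return (rng.random(n) < p).astype(np.uint8)

rng = np.random.default_rng(5)
a, b, n = 0.6, 0.5, 100_000
# Multiplication in stochastic computing is a bitwise AND of two
# independent streams: P(bit_a & bit_b) = P(bit_a) * P(bit_b).
product_stream = to_bitstream(a, n, rng) & to_bitstream(b, n, rng)
print("estimate:", product_stream.mean(), "exact:", a * b)
```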
arXiv Detail & Related papers (2024-07-17T15:08:14Z)
- AutoST: Training-free Neural Architecture Search for Spiking Transformers [14.791412391584064]
Spiking Transformers achieve both the energy efficiency of Spiking Neural Networks (SNNs) and the high capacity of Transformers.
Existing Spiking Transformer architectures exhibit a notable architectural gap, resulting in suboptimal performance.
We introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.
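The summary does not state which zero-cost proxy AutoST uses, so the sketch below only shows the general shape of training-free NAS: sample candidate configurations from a search space and rank them with a proxy computed without any training (here a deliberately crude parameter-count heuristic; the search-space fields are hypothetical).

```python
import random

# Hypothetical search space over Spiking Transformer configurations.
SEARCH_SPACE = {
    "depth": [2, 4, 6, 8],
    "embed_dim": [128, 256, 384],
    "num_heads": [2, 4, 8],
}

def proxy_score(cfg: dict) -> float:
    """Zero-cost proxy: score a config without any training.

    A stand-in heuristic (rough attention-block parameter count);
    real methods use proxies computed from a few forward passes.
    """
    return cfg["depth"] * cfg["embed_dim"] ** 2

def sample_config() -> dict:
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

random.seed(0)
candidates = [sample_config() for _ in range(100)]
best = max(candidates, key=proxy_score)  # no gradient steps required
print("selected architecture:", best)
```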
arXiv Detail & Related papers (2023-07-01T10:19:52Z)
- GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples [46.025105938192624]
Vision Transformer (ViT) has performed remarkably in various computer vision tasks.
ViT usually suffers from serious overfitting problems with a relatively limited number of training samples.
We propose a novel model binarization technique, called Group Superposition Binarization (GSB).
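The GSB mechanics are not given in this summary; as a generic illustration of superposing binary bases, the sketch below greedily fits k scaled binary matrices to a real-valued weight matrix, each basis binarizing the residual left by the previous ones. The names and fitting rule are ours, not necessarily GSB's.

```python
import numpy as np

def superposition_binarize(w: np.ndarray, k: int = 3):
    """Approximate w as sum_i alpha_i * b_i with binary bases b_i.

    Greedy residual fitting: each basis binarizes what the previous
    bases failed to capture, so the approximation tightens with k.
    """
    bases, scales, residual = [], [], w.copy()
    for _ in range(k):
        b = np.where(residual >= 0, 1.0, -1.0)
        alpha = np.abs(residual).mean()  # least-squares scale for b
        bases.append(b)
        scales.append(alpha)
        residual = residual - alpha * b
    return bases, scales

rng = np.random.default_rng(2)
w = rng.normal(size=(8, 8))
for k in (1, 2, 3):
    bases, scales = superposition_binarize(w, k)
    w_hat = sum(a * b for a, b in zip(scales, bases))
    print(f"k={k}: mean abs error {np.abs(w - w_hat).mean():.4f}")
```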
arXiv Detail & Related papers (2023-05-13T14:48:09Z)
- HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression [69.36555801766762]
We propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of possible decompositions.
We experimentally show that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss.
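HEAT searches a large space of tensor decompositions; as a minimal stand-in for what any such decomposition buys, the sketch below factorizes a weight matrix with a truncated SVD, trading a small approximation error for fewer parameters and cheaper matrix-vector products.

```python
import numpy as np

def low_rank_factorize(w: np.ndarray, rank: int):
    """Replace an (m x n) weight matrix with two rank-r factors.

    Storage drops from m*n to r*(m + n); a linear layer w @ x
    becomes two cheaper layers a @ (b @ x).
    """
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # (m x r), singular values folded in
    b = vt[:rank, :]            # (r x n)
    return a, b

rng = np.random.default_rng(3)
w = rng.normal(size=(64, 64))
a, b = low_rank_factorize(w, rank=8)
print("params: full", w.size, "vs factored", a.size + b.size)
print("approx error:", np.linalg.norm(w - a @ b) / np.linalg.norm(w))
```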
arXiv Detail & Related papers (2022-11-30T05:31:45Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
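The summary gives the bilinear framing without the algorithm, so the sketch below only illustrates that framing: minimizing ||w - alpha*b||^2 over a binary vector b and a scale alpha by alternating the two closed-form updates. RBONN's recurrent optimization is more involved than this toy loop.

```python
import numpy as np

def bilinear_binarize(w: np.ndarray, iters: int = 5):
    """Alternately optimize min over (alpha, b) of ||w - alpha * b||^2.

    b lives in {-1, +1}^n and alpha > 0 is a scale; the objective is
    bilinear in (alpha, b), hence the alternating updates. With a
    single per-tensor scale this converges after one round; richer
    parameterizations (e.g., per-channel scales) need the alternation.
    """
    alpha = 1.0
    for _ in range(iters):
        b = np.where(w >= 0, 1.0, -1.0)  # best b for fixed alpha > 0
        alpha = (w * b).sum() / b.size   # best alpha for fixed b
    return alpha, b

rng = np.random.default_rng(4)
w = rng.normal(size=256)
alpha, b = bilinear_binarize(w)
print("alpha:", alpha, "error:", np.abs(w - alpha * b).mean())
```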
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, existing methods neglect that Transformers need to incorporate contextual information to extract features dynamically.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
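CSformer's sampling and recovery are learned end-to-end, which a short snippet cannot reproduce; instead, the sketch below shows the classical compressive sensing baseline it builds on: take m << n random linear measurements of a sparse signal and recover it with orthogonal matching pursuit.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: recover a k-sparse x from y = A @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

rng = np.random.default_rng(6)
n, m, k = 64, 24, 3                       # ambient dim, measurements, sparsity
A = rng.normal(size=(m, n)) / np.sqrt(m)  # random sampling operator
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x_true                            # compressed measurements (m << n)
x_rec = omp(A, y, k)
print("recovery error:", np.linalg.norm(x_rec - x_true))
```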
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, achieving an 11.1x storage saving and a 5.4x speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all generated summaries) and is not responsible for any consequences of its use.