SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
- URL: http://arxiv.org/abs/2406.03287v1
- Date: Wed, 5 Jun 2024 13:59:03 GMT
- Title: SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
- Authors: Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li
- Abstract summary: Bio-inspired spiking neural networks (SNNs) have advantages of biological plausibility, event-driven sparsity, and binary activation.
Large-scale language models exhibit promising generalization capability, making more general spike-driven models a valuable direction to explore.
This work proposes the first fully spiking mechanism for general language tasks, including both discriminative and generative ones.
- Score: 30.825695629006628
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Towards energy-efficient artificial intelligence similar to the human brain, bio-inspired spiking neural networks (SNNs) offer biological plausibility, event-driven sparsity, and binary activation. Recently, large-scale language models have exhibited promising generalization capability, making it valuable to explore more general spike-driven models. However, the binary spikes in existing SNNs fail to encode adequate semantic information, posing a technological challenge for generalization. This work proposes the first fully spiking mechanism for general language tasks, including both discriminative and generative ones. Different from previous spikes with {0,1} levels, we propose a more general spike formulation with bi-directional, elastic amplitude, and elastic frequency encoding, while still maintaining the additive nature of SNNs. Within a single time step, the spike is enriched with direction and amplitude information; for spike frequency, a strategy to control the spike firing rate is designed. We plug this elastic bi-spiking mechanism into language modeling, naming the result SpikeLM. It is the first model to handle general language tasks with fully spike-driven computation, achieving much higher accuracy than previously possible. SpikeLM also greatly bridges the performance gap between SNNs and ANNs in language modeling. Our code is available at https://github.com/Xingrun-Xing/SpikeLM.
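The abstract's mechanism can be pictured concretely. Below is a minimal sketch, assuming a straight-through estimator, a mean-absolute-value amplitude scale, and a fixed firing threshold; the names, the rectangular gradient window, and the threshold-based firing-rate control are illustrative assumptions, not the paper's actual implementation.

```python
import torch

class ElasticBiSpike(torch.autograd.Function):
    """Bi-directional spike function: activations map to {-alpha, 0, +alpha},
    keeping inference addition-only while carrying direction and amplitude."""

    @staticmethod
    def forward(ctx, x, threshold):
        ctx.save_for_backward(x)
        spikes = torch.zeros_like(x)
        spikes[x > threshold] = 1.0
        spikes[x < -threshold] = -1.0
        # Elastic amplitude (assumption): one per-tensor scale keeps the
        # output a scaled ternary code.
        alpha = x.abs().mean()
        return alpha * spikes

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Straight-through estimator: gradients pass only inside a clipped window.
        return grad_output * (x.abs() <= 1.0).float(), None

def elastic_bi_spike(x: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    # Raising `threshold` lowers the firing rate (more zeros), a crude
    # stand-in for the paper's elastic frequency control.
    return ElasticBiSpike.apply(x, threshold)
```

Because the output is a scaled ternary code, downstream matrix products reduce to additions and sign flips, which is what preserves the energy advantage of SNNs.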
Related papers
- SPKLIP: Aligning Spike Video Streams with Natural Language [37.640682226789934]
We introduce SPKLIP, the first architecture designed specifically for Spike-VLA.
SPKLIP employs a hierarchical spike feature extractor that adaptively models multi-scale temporal dynamics in event streams.
Experiments show state-of-the-art performance on benchmark spike datasets and strong few-shot generalization on a newly contributed real-world dataset.
arXiv Detail & Related papers (2025-05-19T03:14:22Z)
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free [81.65559031466452]
We conduct experiments to investigate gating-augmented softmax attention variants.
We find that a simple modification, applying a head-specific sigmoid gate after the Scaled Dot-Product Attention (SDPA), consistently improves performance.
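The described modification is simple enough to sketch. The module below applies a learned, head-specific, elementwise sigmoid gate to the SDPA output; the gate's input (the attention output itself) and the parameter shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedSDPA(nn.Module):
    """Scaled dot-product attention followed by a head-specific sigmoid gate."""

    def __init__(self, num_heads: int, head_dim: int):
        super().__init__()
        # Separate gate weights per head make the gating head-specific.
        self.w_gate = nn.Parameter(torch.empty(num_heads, head_dim, head_dim))
        self.b_gate = nn.Parameter(torch.zeros(num_heads, head_dim))
        nn.init.xavier_uniform_(self.w_gate)

    def forward(self, q, k, v):
        # q, k, v: (batch, num_heads, seq_len, head_dim)
        out = F.scaled_dot_product_attention(q, k, v)
        # Gate computed from the attention output (an assumption; it could
        # also be driven by the pre-attention hidden state).
        gate = torch.sigmoid(
            torch.einsum("bhsd,hde->bhse", out, self.w_gate)
            + self.b_gate[None, :, None, :]
        )
        return gate * out
```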
arXiv Detail & Related papers (2025-05-10T17:15:49Z)
- SkipSNN: Efficiently Classifying Spike Trains with Event-attention [29.639889737632842]
Spike train classification has recently become an important topic in the machine learning community.
A promising model for it should follow the design principle of performing intensive computation only when signals of interest appear.
This paper introduces an event-attention mechanism that enables SNNs to dynamically highlight useful signals of the original spike trains.
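The summary gives no implementation details; as a rough illustration of "compute only when signals of interest appear", the toy function below builds a keep-mask from local spike activity. SkipSNN itself learns its event-attention, so treat this purely as an assumed stand-in.

```python
import torch
import torch.nn.functional as F

def event_attention_mask(spike_train: torch.Tensor,
                         window: int = 5,
                         min_activity: int = 2) -> torch.Tensor:
    """Toy event-attention: mark timesteps as interesting when local spike
    activity exceeds a threshold, so downstream layers can skip quiet regions.

    spike_train: (time, features) binary tensor.
    Returns a (time,) boolean keep-mask.
    """
    # Local spike count via 1-D average pooling over the time axis.
    activity = F.avg_pool1d(
        spike_train.sum(dim=1)[None, None, :].float(),
        kernel_size=window, stride=1, padding=window // 2,
    ).squeeze() * window
    return activity >= min_activity
```

Downstream layers would then run only on timesteps where the mask is True, realizing the "intensive computation only on signals of interest" principle.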
arXiv Detail & Related papers (2024-10-29T03:19:25Z)
- Canonic Signed Spike Coding for Efficient Spiking Neural Networks [7.524721345903027]
Spiking Neural Networks (SNNs) seek to mimic the spiking behavior of biological neurons and are expected to play a key role in the advancement of neural computing and artificial intelligence.
The conversion of Artificial Neural Networks (ANNs) to SNNs is the most widely used training method, which ensures that the resulting SNNs perform comparably to ANNs on large-scale datasets.
Current schemes typically use spike count or timing for encoding, which is linearly related to ANN activations and increases the required number of time steps.
We propose a novel Canonic Signed Spike (CSS) coding scheme.
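Canonic signed-digit (CSD) coding, the classical representation behind the scheme's name, minimizes nonzero digits and hence spikes. A toy encoder showing the underlying number representation only (the CSS scheme itself is more involved):

```python
def to_csd(n: int) -> list[int]:
    """Canonic signed-digit form of a non-negative integer, LSB first.
    Digits are in {-1, 0, +1} and no two adjacent digits are nonzero,
    which minimizes the number of nonzero digits (i.e., spikes)."""
    digits = []
    while n != 0:
        if n % 2 == 1:
            d = 2 - (n % 4)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

# 7 = 8 - 1 needs two signed spikes instead of three binary ones:
assert to_csd(7) == [-1, 0, 0, 1]
```

With signed spikes, large activations no longer require spike counts that grow linearly with the value, which is how such coding cuts the required number of time steps.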
arXiv Detail & Related papers (2024-08-30T12:39:25Z)
- SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network [21.487450282438125]
Spiking Neural Network (SNN) has demonstrated its effectiveness and efficiency in vision, natural language, and speech understanding tasks.
We design SpikeVoice, which performs high-quality text-to-speech (TTS) via SNN, to explore the potential of SNNs to "speak".
arXiv Detail & Related papers (2024-07-17T15:22:52Z)
- SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking [43.275370104552344]
Recent large language models (LLMs) with billions of parameters have boosted their performance across various real-world applications.
Human brains exhibit significantly greater energy efficiency compared to LLMs with a similar number of parameters.
We propose SpikeLLM, the first spiking large language model at the scale of recent LLMs.
arXiv Detail & Related papers (2024-07-05T08:37:17Z)
- SpeechAlign: Aligning Speech Generation to Human Preferences [51.684183257809075]
We introduce SpeechAlign, an iterative self-improvement strategy that aligns speech language models to human preferences.
We show that SpeechAlign can bridge the distribution gap and facilitate continuous self-improvement of the speech language model.
arXiv Detail & Related papers (2024-04-08T15:21:17Z)
- SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding [50.337896542603524]
We introduce SpikeMba: a multi-modal spiking saliency mamba for temporal video grounding.
Our approach integrates Spiking Neural Networks (SNNs) with state space models (SSMs) to leverage their unique advantages.
Our experiments demonstrate the effectiveness of SpikeMba, which consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-04-01T15:26:44Z)
- Language Modeling on a SpiNNaker 2 Neuromorphic Chip [2.760675104404914]
Event-based networks on neuromorphic devices offer a potential way to reduce energy consumption for inference significantly.
We demonstrate the first-ever implementation of a language model on a neuromorphic device.
arXiv Detail & Related papers (2023-12-14T16:16:35Z)
- SpikeCLIP: A Contrastive Language-Image Pretrained Spiking Neural Network [39.54624592783459]
Spiking Neural Networks (SNNs) have emerged as a promising alternative to conventional Artificial Neural Networks (ANNs).
This paper presents SpikeCLIP, a novel framework designed to bridge the modality gap in spike-based computation.
arXiv Detail & Related papers (2023-10-10T09:57:17Z)
- SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks [21.616328837090396]
Spiking Neural Networks (SNNs) leverage sparse and event-driven activations to reduce the computational overhead associated with model inference.
We implement a generative language model with binary, event-driven spiking activation units.
SpikeGPT is the largest backpropagation-trained SNN model to date, rendering it suitable for both the generation and comprehension of natural language.
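Training binary spiking activations with backpropagation requires a gradient through the non-differentiable spike. A standard surrogate-gradient sketch (not SpikeGPT's exact neuron model; the window width is an assumption):

```python
import torch

class BinarySpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, surrogate gradient backward."""

    @staticmethod
    def forward(ctx, membrane):
        ctx.save_for_backward(membrane)
        return (membrane > 0).float()  # emit an event when the threshold is crossed

    @staticmethod
    def backward(ctx, grad_output):
        (membrane,) = ctx.saved_tensors
        # Rectangular surrogate: gradient flows only near the threshold.
        return grad_output * (membrane.abs() < 0.5).float()
```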
arXiv Detail & Related papers (2023-02-27T16:43:04Z)
- E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation [95.49128988683191]
Sequence-to-sequence (seq2seq) learning is a popular approach for large-scale language model pretraining.
We propose an encoding-enhanced seq2seq pretraining strategy, namely E2S2.
E2S2 improves seq2seq models by integrating richer self-supervised signals into the encoders.
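The summary only says that self-supervised signals are added on the encoder side; the sketch below shows one generic form such an encoder-side objective could take, a masked-token denoising loss. All names and the masking rate are assumptions, not E2S2's exact losses.

```python
import torch
import torch.nn.functional as F

def encoder_denoising_loss(encoder, embed, lm_head,
                           input_ids: torch.Tensor,
                           mask_id: int, p: float = 0.15) -> torch.Tensor:
    """Generic encoder-side self-supervised objective: corrupt tokens,
    then predict the originals from encoder states. This would be added
    to the usual seq2seq decoder loss."""
    corrupted = input_ids.clone()
    noise = torch.rand_like(input_ids, dtype=torch.float) < p
    corrupted[noise] = mask_id
    hidden = encoder(embed(corrupted))   # (batch, seq, dim)
    logits = lm_head(hidden)             # (batch, seq, vocab)
    # Score only the corrupted positions.
    return F.cross_entropy(logits[noise], input_ids[noise])
```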
arXiv Detail & Related papers (2022-05-30T08:25:36Z)
- Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble [71.97020373520922]
Sign language is commonly used by deaf or mute people to communicate.
We propose a novel multi-modal framework with a Global Ensemble Model (GEM) for isolated Sign Language Recognition (SLR).
Our proposed SAM-SLR-v2 framework is exceedingly effective and achieves state-of-the-art performance by significant margins.
arXiv Detail & Related papers (2021-10-12T16:57:18Z)
- Adversarial Training for Large Neural Language Models [107.84290922621163]
We show that adversarial pre-training can improve both generalization and robustness.
ALUM regularizes the training objective by applying perturbations in the embedding space that maximize the adversarial loss.
ALUM can be further combined with task-specific fine-tuning to attain additional gains.
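The described objective is concrete enough to sketch: perturb the embedding space to maximize the divergence from the clean prediction, then penalize that divergence. A one-ascent-step sketch with illustrative names (the actual algorithm's ascent schedule and norm constraints may differ):

```python
import torch
import torch.nn.functional as F

def alum_regularizer(model, embeddings: torch.Tensor,
                     eps: float = 1e-3, step: float = 1e-2) -> torch.Tensor:
    """One-step sketch of ALUM-style virtual adversarial regularization."""
    with torch.no_grad():
        clean_logits = model(embeddings)
    # Start from small random noise and take one ascent step on the KL.
    delta = (torch.randn_like(embeddings) * eps).requires_grad_(True)
    adv_kl = F.kl_div(
        F.log_softmax(model(embeddings + delta), dim=-1),
        F.softmax(clean_logits, dim=-1),
        reduction="batchmean",
    )
    (grad,) = torch.autograd.grad(adv_kl, delta)
    delta = step * grad / (grad.norm() + 1e-8)  # move toward the worst case
    # Regularizer: penalize sensitivity to the adversarial perturbation.
    return F.kl_div(
        F.log_softmax(model(embeddings + delta.detach()), dim=-1),
        F.softmax(clean_logits, dim=-1),
        reduction="batchmean",
    )
```

This term is added to the task loss, so the model is trained to produce stable predictions under worst-case embedding noise.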
arXiv Detail & Related papers (2020-04-20T00:07:18Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
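The stack coupling can be made differentiable by blending stack states with continuous push/pop weights, so the whole system trains with backpropagation. A minimal sketch, with conventions assumed (top of stack at index 0):

```python
import torch

def soft_stack_step(stack: torch.Tensor, value: torch.Tensor,
                    push: torch.Tensor, pop: torch.Tensor) -> torch.Tensor:
    """One differentiable stack update.

    stack: (depth, dim) current stack contents, top at index 0.
    value: (dim,) vector to push.
    push, pop: scalars in [0, 1] (e.g., from a softmax over actions).
    The new stack is a convex blend of pushed, popped, and unchanged
    states, so gradients flow through the controller's action weights.
    """
    keep = 1.0 - push - pop
    pushed = torch.cat([value[None, :], stack[:-1]], dim=0)              # shift down
    popped = torch.cat([stack[1:], torch.zeros_like(stack[:1])], dim=0)  # shift up
    return push * pushed + pop * popped + keep * stack
```

An RNN controller would emit the (push, pop) weights at each step, for example via a softmax over actions, and read the top cell of the stack back as additional input.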
arXiv Detail & Related papers (2020-04-04T14:19:15Z)