Fugu-MT Paper Translation (Abstract): SpikingBrain Technical Report: Spiking Brain-inspired Large Models

Paper overview: SpikingBrain Technical Report: Spiking Brain-inspired Large Models

arxiv url: http://arxiv.org/abs/2509.05276v1
Date: Fri, 05 Sep 2025 17:34:00 GMT
Status: Translation complete
Last updated in system: 2025-09-08 14:27:25.671527
Title: SpikingBrain Technical Report: Spiking Brain-inspired Large Models
Title (reference translation): SpikingBrain Technical Report: Brain-Inspired Large Models
Authors: Yuqi Pan, Yupeng Feng, Jinghao Zhuang, Siyu Ding, Zehao Liu, Bohan Sun, Yuhong Chou, Han Xu, Xuerui Qiu, Anlin Deng, Anjie Hu, Peng Zhou, Man Yao, Jibin Wu, Jian Yang, Guoliang Sun, Bo Xu, Guoqi Li
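Abstract summary: SpikingBrain is a family of brain-inspired models. Two models were developed: SpikingBrain-7B, a linear LLM, and SpikingBrain-76B, a hybrid-linear MoE LLM. The models significantly improve long-sequence training efficiency and deliver inference with (partially) constant memory and event-driven spiking behavior.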
Reference score (independently computed attention score): 42.41339012839023
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mainstream Transformer-based large language models face major efficiency bottlenecks: training computation scales quadratically with sequence length, and inference memory grows linearly, limiting long-context processing. Building large models on non-NVIDIA platforms also poses challenges for stable and efficient training. To address this, we introduce SpikingBrain, a family of brain-inspired models designed for efficient long-context training and inference. SpikingBrain leverages the MetaX GPU cluster and focuses on three aspects: (1) Model Architecture: linear and hybrid-linear attention architectures with adaptive spiking neurons; (2) Algorithmic Optimizations: an efficient, conversion-based training pipeline and a dedicated spike coding framework; (3) System Engineering: customized training frameworks, operator libraries, and parallelism strategies tailored to MetaX hardware. Using these techniques, we develop two models: SpikingBrain-7B, a linear LLM, and SpikingBrain-76B, a hybrid-linear MoE LLM. These models demonstrate the feasibility of large-scale LLM development on non-NVIDIA platforms. SpikingBrain achieves performance comparable to open-source Transformer baselines while using only about 150B tokens for continual pre-training. Our models significantly improve long-sequence training efficiency and deliver inference with (partially) constant memory and event-driven spiking behavior. For example, SpikingBrain-7B attains over 100x speedup in Time to First Token for 4M-token sequences. Training remains stable for weeks on hundreds of MetaX C550 GPUs, with the 7B model reaching a Model FLOPs Utilization of 23.4 percent. The proposed spiking scheme achieves 69.15 percent sparsity, enabling low-power operation. Overall, this work demonstrates the potential of brain-inspired mechanisms to drive the next generation of efficient and scalable large model design.
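Abstract (reference translation): Mainstream Transformer-based large language models face major efficiency bottlenecks: training computation scales quadratically with sequence length and inference memory grows linearly, which limits long-context processing. Building large models on non-NVIDIA platforms also poses challenges for stable and efficient training. To address this, we introduce SpikingBrain, a family of brain-inspired models. The work covers three aspects: (1) Model Architecture: linear and hybrid-linear attention architectures with adaptive spiking neurons; (2) Algorithmic Optimizations: an efficient, conversion-based training pipeline and a dedicated spike coding framework; (3) System Engineering: customized training frameworks, operator libraries, and parallelism strategies tailored to MetaX hardware. Using these techniques, we develop two models: SpikingBrain-7B, a linear LLM, and SpikingBrain-76B, a hybrid-linear MoE LLM. These models demonstrate the feasibility of large-scale LLM development on non-NVIDIA platforms. SpikingBrain achieves performance comparable to open-source Transformer baselines while using only about 150B tokens for continual pre-training. Our models significantly improve long-sequence training efficiency and deliver inference with (partially) constant memory and event-driven spiking behavior. For example, SpikingBrain-7B attains over 100x speedup in Time to First Token for 4M-token sequences. Training remains stable for weeks on hundreds of MetaX C550 GPUs, with the 7B model reaching a Model FLOPs Utilization of 23.4 percent. The proposed spiking scheme achieves 69.15 percent sparsity, enabling low-power operation. Overall, this work demonstrates the potential of brain-inspired mechanisms to drive the next generation of efficient and scalable large model design.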
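
The abstract contrasts inference memory that grows linearly with sequence length against inference with (partially) constant memory. A minimal Python sketch of why a linear-attention layer can keep its memory fixed is given below. It is an illustration only: it assumes unnormalized causal linear attention with an identity feature map, and the function names (`linear_attention_step`, `kv_cache_step`, `run_demo`) and the toy inputs are hypothetical. It is not SpikingBrain's actual architecture or spike coding scheme.

```python
# Illustrative sketch only (assumption: unnormalized causal linear attention
# with an identity feature map). Not the SpikingBrain implementation.

D_K = 2  # key/query dimension (toy value)
D_V = 3  # value dimension (toy value)


def linear_attention_step(state, q, k, v):
    """Add the outer product k v^T to the D_K x D_V state, then read with q.

    The state has a fixed size, so memory does not depend on how many
    tokens have been processed.
    """
    for i in range(D_K):
        for j in range(D_V):
            state[i][j] += k[i] * v[j]
    # output_j = sum_i q_i * state[i][j]
    return [sum(q[i] * state[i][j] for i in range(D_K)) for j in range(D_V)]


def kv_cache_step(cache, k, v):
    """A softmax-attention style KV cache stores every past (k, v) pair."""
    cache.append((k, v))
    return len(cache)


def state_size(state):
    """Number of scalars held in the recurrent state."""
    return sum(len(row) for row in state)


def run_demo(num_tokens=8):
    state = [[0.0] * D_V for _ in range(D_K)]
    cache = []
    sizes = []  # (recurrent state size, KV cache length) after each token
    out = None
    for t in range(num_tokens):
        k = [1.0, float(t)]
        v = [float(t), 1.0, 0.5]
        q = [1.0, 0.0]
        out = linear_attention_step(state, q, k, v)
        sizes.append((state_size(state), kv_cache_step(cache, k, v)))
    return out, sizes


if __name__ == "__main__":
    final_out, sizes = run_demo()
    print("final output:", final_out)
    for step, (s, c) in enumerate(sizes, start=1):
        print(f"token {step}: state scalars = {s}, cached pairs = {c}")
```

In this toy run the recurrent state always holds `D_K * D_V` scalars, while the cache gains one key/value pair per token. That difference is the memory behavior the abstract refers to, shown here under the stated assumptions.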

Related papers