Fugu-MT Paper Translation (Summary): ActTail: Global Activation Sparsity in Large Language Models

Paper summary: ActTail: Global Activation Sparsity in Large Language Models

arxiv url: http://arxiv.org/abs/2603.12272v1
Date: Wed, 18 Feb 2026 14:46:03 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:42.195731
Title: ActTail: Global Activation Sparsity in Large Language Models
Title (reference translation): ActTail: Global Activation Sparsity in Large Language Models
Authors: Wenwen Hou, Xinyuan Song, Shiwei Liu
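Abstract summary: This paper proposes ActTail, a TopK magnitude-based activation sparsity method with global activation sparsity allocation grounded in Heavy-Tailed Self-Regularization (HT-SR) theory. The proposed method improves both perplexity and downstream task performance at high sparsity compared to uniform allocation.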
Reference score (independently computed attention score): 14.120733678728191
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Activation sparsity is a promising approach for accelerating large language model (LLM) inference by reducing computation and memory movement. However, existing activation sparsity methods typically apply uniform sparsity across projections, ignoring the heterogeneous statistical properties of Transformer weights and thereby amplifying performance degradation. In this paper, we propose ActTail, a TopK magnitude-based activation sparsity method with global activation sparsity allocation grounded in Heavy-Tailed Self-Regularization (HT-SR) theory. Specifically, we capture this heterogeneity via the heavy-tail exponent computed from each projection's empirical spectral density (ESD), which is used as a quantitative indicator to assign projection-specific sparsity budgets. Importantly, we provide a theoretical analysis that establishes an explicit relationship between the activation sparsity ratio and the heavy-tail exponent under the HT-SR regime, offering principled guidance for sparsity allocation beyond heuristic design. Experiments on LLaMA and Mistral models show that our method improves both perplexity and downstream task performance at high sparsity compared to uniform allocation. At 80% sparsity, perplexity is reduced by 21.8% on LLaMA-2-7B, 40.1% on LLaMA-2-13B, and 9.4% on Mistral-7B.
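Abstract (reference translation): Activation sparsity is a promising approach for accelerating large language model (LLM) inference by reducing computation and memory movement. However, existing activation sparsity methods typically apply uniform sparsity across projections, ignoring the heterogeneous statistical properties of Transformer weights and amplifying performance degradation. This paper proposes ActTail, a TopK magnitude-based activation sparsity method with global activation sparsity allocation grounded in Heavy-Tailed Self-Regularization (HT-SR) theory. Specifically, the heterogeneity is captured by the heavy-tail exponent computed from each projection's empirical spectral density (ESD). Importantly, a theoretical analysis establishes an explicit relationship between the activation sparsity ratio and the heavy-tail exponent under the HT-SR regime, offering principled guidance for sparsity allocation beyond heuristic design. Experiments on LLaMA and Mistral models show that the method improves both perplexity and downstream task performance at high sparsity compared to uniform allocation. At 80% sparsity, perplexity is reduced by 21.8% on LLaMA-2-7B, 40.1% on LLaMA-2-13B, and 9.4% on Mistral-7B.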
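
Illustrative sketch (not the authors' code): the abstract names three ingredients, namely TopK magnitude-based activation sparsity, a heavy-tail exponent computed from each projection's ESD, and projection-specific sparsity budgets. The NumPy sketch below shows one way these pieces could fit together. Everything beyond those three ideas is an assumption made for illustration: the function names, the use of a Hill-type estimator for the exponent, the `tail_fraction` and `spread` parameters, and in particular the linear mapping from exponent to sparsity and its direction (larger exponent gets more sparsity). The abstract does not give the paper's actual allocation rule.

```python
import numpy as np

def topk_sparsify(x, sparsity):
    """Keep the largest-magnitude entries in each row of x, zero the rest."""
    d = x.shape[-1]
    k = max(1, int(round(d * (1.0 - sparsity))))  # entries kept per row
    # argpartition places the k largest |x| values in the last k slots
    idx = np.argpartition(np.abs(x), d - k, axis=-1)[..., d - k:]
    mask = np.zeros(x.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, x, 0.0)

def hill_alpha(weight, tail_fraction=0.1):
    """Hill-type estimate of the heavy-tail exponent of the ESD of W^T W.

    Assumption: the paper may fit the exponent differently.
    """
    # Eigenvalues of W^T W are the squared singular values of W.
    eigs = np.sort(np.linalg.svd(weight, compute_uv=False) ** 2)[::-1]
    k = min(len(eigs), max(2, int(len(eigs) * tail_fraction)))
    tail = eigs[:k]  # k largest eigenvalues, descending
    return 1.0 + k / np.sum(np.log(tail / tail[-1]))

def allocate_sparsity(alphas, target, spread=0.1):
    """Map exponents to per-projection sparsity around a global target.

    Assumption: a linear map in which a larger exponent receives more
    sparsity; the mean equals `target` as long as no value is clipped.
    """
    alphas = np.asarray(alphas, dtype=float)
    span = alphas.max() - alphas.min()
    if span == 0.0:
        return np.full_like(alphas, target)
    centered = (alphas - alphas.mean()) / span
    return np.clip(target + spread * centered, 0.0, 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for three projection weight matrices.
    weights = [rng.standard_normal((64, 64)) for _ in range(3)]
    alphas = [hill_alpha(w) for w in weights]
    budgets = allocate_sparsity(alphas, target=0.8)
    x = rng.standard_normal((2, 64))  # stand-in for activations
    for s in budgets:
        kept = np.count_nonzero(topk_sparsify(x, s), axis=-1)
        print(f"sparsity={s:.3f} kept per row={kept}")
```

With equally sized projections, the unclipped budgets average to the global target, so the non-uniform assignment only redistributes sparsity across projections rather than changing the overall level.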


Related papers