Fugu-MT Paper Translation (Abstract): Towards Explainability of SLMs by investigating Token Level Activation

Paper overview: Towards Explainability of SLMs by investigating Token Level Activation

arxiv url: http://arxiv.org/abs/2605.22377v1
Date: Thu, 21 May 2026 12:09:12 GMT
Status: Translation complete
Last updated in system: 2026-05-22 16:35:42.245829
Title: Towards Explainability of SLMs by investigating Token Level Activation
Title (reference translation): Toward Improving the Explainability of SLMs through Token-Level Activation
Authors: Sayantani Ghosh, Rajashik Datta, Amit Kumar Das, Amlan Chakrabarti
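Abstract summary: This work proposes a framework that quantifies token-level representational importance using hidden-state activation strengths at Layer 8 of BERT. The proposed Activation Flow Network (AFN) framework computes Token Activation Strength using the L2 norm of Layer-8 hidden representations. It further introduces a threshold-based activation bucket formulation that partitions tokens into high-activation and low-activation groups.
Reference score (independently computed attention score): 2.7590644934977764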
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transformer-based language models such as BERT, with 110M+ parameters, have revolutionized natural language understanding, yet their internal mechanisms remain largely opaque to researchers and practitioners. Traditional attention-based interpretability methods often emphasize structurally important but semantically weak tokens such as punctuation marks rather than meaningful semantic relationships. This work introduces a lightweight and model-agnostic framework for quantifying token-level representational importance using hidden-state activation strengths at Layer 8 of BERT. The proposed Activation Flow Network (AFN) framework computes Token Activation Strength using the L2 norm of Layer-8 hidden representations, enabling direct ranking of semantically salient tokens. The study further introduces a threshold-based activation bucket formulation that partitions tokens into HIGH-activation and LOW-activation groups using an empirical upper-quartile activation boundary. Experimental observations demonstrate that semantically meaningful content words consistently occupy the HIGH-activation bucket and dominate representational activation shifts, while structurally supportive tokens contribute comparatively less. The results suggest that Layer 8 acts as a critical semantic consolidation zone balancing structural and semantic information processing. By revealing how activation magnitudes concentrate around semantically informative tokens, this work provides an interpretable and computationally efficient alternative to attention-centric analysis, contributing toward transforming BERT from a "black box" into a more transparent "glass box" model for natural language understanding.
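Abstract (reference translation): Transformer-based language models such as BERT, with more than 110M parameters, have revolutionized natural language understanding, but their internal mechanisms remain largely opaque to researchers and practitioners. Traditional attention-based interpretability methods often emphasize tokens that are structurally important but semantically weak, such as punctuation marks, rather than meaningful semantic relationships. This work proposes a lightweight, model-agnostic framework that quantifies token-level representational importance using hidden-state activation strengths at Layer 8 of BERT. The proposed Activation Flow Network (AFN) framework computes Token Activation Strength using the L2 norm of Layer-8 hidden representations, enabling direct ranking of semantically salient tokens. It further introduces a threshold-based activation bucket formulation that partitions tokens into high-activation and low-activation groups using an empirical upper-quartile activation boundary. Experiments show that semantically meaningful content words consistently occupy the high-activation bucket and dominate representational activation shifts, while structurally supportive tokens contribute comparatively less. The results suggest that Layer 8 acts as a critical semantic consolidation zone that balances structural and semantic information processing. By revealing how activation magnitudes concentrate around semantically informative tokens, this work provides an interpretable and computationally efficient alternative to attention-centric analysis, contributing toward transforming BERT from a "black box" into a more transparent "glass box" model for natural language understanding.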
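
The two steps named in the abstract, scoring each token by the L2 norm of its hidden vector and splitting tokens at an upper-quartile boundary, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy 3-dimensional vectors, the per-sentence estimate of the quartile, and the choice to put ties in the HIGH bucket are all hypothetical, since the abstract does not specify them.

```python
import math
from statistics import quantiles


def token_activation_strength(hidden_states):
    """Return the L2 norm of each token's hidden vector."""
    return [math.sqrt(sum(x * x for x in vec)) for vec in hidden_states]


def bucket_tokens(tokens, strengths):
    """Split tokens into HIGH and LOW buckets at the upper quartile."""
    # Assumption: the boundary is the 75th percentile of the strengths
    # within this one input; the abstract does not say how it is estimated.
    threshold = quantiles(strengths, n=4, method="inclusive")[2]
    buckets = {"HIGH": [], "LOW": []}
    for token, strength in zip(tokens, strengths):
        # Assumption: a token exactly on the boundary counts as HIGH.
        buckets["HIGH" if strength >= threshold else "LOW"].append(token)
    return threshold, buckets


# Made-up vectors standing in for Layer-8 hidden states (not real BERT output).
tokens = ["the", "cat", "sat", "."]
hidden = [[1.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]

strengths = token_activation_strength(hidden)
threshold, buckets = bucket_tokens(tokens, strengths)
print(strengths, threshold, buckets)
```

With real models, the per-token vectors could be taken from a Hugging Face Transformers BERT model called with output_hidden_states=True, where index 0 of the returned hidden_states tuple is the embedding output and index 8 is the output of the eighth encoder layer. Whether the paper indexes layers this way is not stated in the abstract.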

Related papers