Fugu-MT Paper Translation (Abstract): Context Channel Capacity: An Information-Theoretic Framework for Understanding Catastrophic Forgetting

Paper abstract: Context Channel Capacity: An Information-Theoretic Framework for Understanding Catastrophic Forgetting

arxiv url: http://arxiv.org/abs/2603.07415v1
Date: Sun, 08 Mar 2026 02:03:34 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:14.496217
Title: Context Channel Capacity: An Information-Theoretic Framework for Understanding Catastrophic Forgetting
Authors: Ran Cheng
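Abstract summary: Zero forgetting requires $C_\mathrm{ctx} \geq H(T)$, where $H(T)$ is the task identity entropy. Validating this framework across 8 CL methods on Split-MNIST (1,130+ experiments over 86 days, 4 seeds each) shows that $C_\mathrm{ctx}$ perfectly predicts forgetting behavior.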
Reference score (attention score computed by this site): 8.66871075467032
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Catastrophic forgetting remains a central challenge in continual learning (CL), yet lacks a unified information-theoretic explanation for why some architectures forget catastrophically while others do not. We introduce \emph{Context Channel Capacity} ($C_\mathrm{ctx}$), the mutual information between a CL architecture's context signal and its generated parameters, and prove that zero forgetting requires $C_\mathrm{ctx} \geq H(T)$, where $H(T)$ is the task identity entropy. We establish an \emph{Impossibility Triangle} -- zero forgetting, online learning, and finite parameters cannot be simultaneously satisfied by sequential state-based learners -- and show that conditional regeneration architectures (HyperNetworks) bypass this triangle by redefining parameters as function values rather than states. We validate this framework across 8 CL methods on Split-MNIST (1,130+ experiments over 86 days, 4 seeds each), showing that $C_\mathrm{ctx}$ perfectly predicts forgetting behavior: methods with $C_\mathrm{ctx} = 0$ (NaiveSGD, EWC, SI, LwF, CFlow) exhibit catastrophic forgetting (6--97\%), while methods with $C_\mathrm{ctx} \approx 1$ (HyperNetwork) achieve zero forgetting (98.8\% ACC). We further propose \emph{Wrong-Context Probing} (P5), a practical diagnostic protocol for measuring $C_\mathrm{ctx}$, and extend the framework to CIFAR-10 via a novel \emph{Gradient Context Encoder} that closes the oracle gap from 23.3pp to 0.7pp. A systematic taxonomy of 15+ closed research directions -- including the Hebbian null result (frozen random features outperform learned features), CFlow's $θ_0$-memorizer phenomenon, and the $S_N$ symmetry barrier to column specialization -- provides the community with precisely diagnosed negative results. Our central design principle: \emph{architecture over algorithm} -- the context pathway must be structurally unbypassable.
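The necessary condition $C_\mathrm{ctx} \geq H(T)$ can be illustrated with a toy calculation. The sketch below is not the paper's measurement protocol. It assumes a uniform distribution over 5 tasks (a common Split-MNIST setup, not stated in the abstract), treats generated parameters as discrete labels, and uses the plain mutual information $I(\text{context}; \text{parameters})$ in bits as a stand-in for $C_\mathrm{ctx}$. The names `ignore_ctx` and `use_ctx` are hypothetical learners invented for illustration. The values are not meant to match the reported $C_\mathrm{ctx} \approx 1$, whose units and normalization the abstract does not specify.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(pairs):
    """I(X; Y) in bits, treating each (x, y) pair in the list as equally likely."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        mi += pxy * math.log2(pxy / ((px[x] / n) * (py[y] / n)))
    return mi


NUM_TASKS = 5  # assumption: 5 equally likely tasks
tasks = list(range(NUM_TASKS))
h_t = entropy([1 / NUM_TASKS] * NUM_TASKS)  # task identity entropy H(T)

# Hypothetical learner that ignores the context: one shared parameter set.
ignore_ctx = [(t, "theta_shared") for t in tasks]
# Hypothetical learner that regenerates distinct parameters per context.
use_ctx = [(t, f"theta_{t}") for t in tasks]

mi_ignore = mutual_information(ignore_ctx)
mi_use = mutual_information(use_ctx)

print(f"H(T)                      = {h_t:.3f} bits")
print(f"I(ctx; params), ignored   = {mi_ignore:.3f} bits")
print(f"I(ctx; params), used      = {mi_use:.3f} bits")
```

In this toy setting the context-ignoring learner carries zero bits about the task, so it falls below $H(T)$ and cannot satisfy the condition, while the context-conditioned learner reaches $H(T)$ exactly. Meeting the bound is only necessary, not sufficient, for zero forgetting.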
