Fugu-MT Paper Translation (Summary): Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)

Paper summary: Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)

arxiv url: http://arxiv.org/abs/2510.00475v1
Date: Wed, 01 Oct 2025 03:52:40 GMT
Status: Translation complete
Last updated in system: 2025-10-03 16:59:20.359439
Title: Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)
Title (reference translation): Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)
Authors: Kai Gu, Weishi Shi
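Abstract summary: Shortcut features undermine robustness and reduce reliability under distribution shifts. In continual learning (CL), the consequences of shortcut exploitation can persist and intensify. In CL, shortcut-induced rigidity throttles the acquisition of new skills.
Reference score (independently computed attention score): 7.587193411022608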
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks frequently exploit shortcut features, defined as incidental correlations between inputs and labels without causal meaning. Shortcut features undermine robustness and reduce reliability under distribution shifts. In continual learning (CL), the consequences of shortcut exploitation can persist and intensify: weights inherited from earlier tasks bias representation reuse toward whatever features most easily satisfied prior labels, mirroring the cognitive Einstellung effect, a phenomenon where past habits block optimal solutions. Whereas catastrophic forgetting erodes past skills, shortcut-induced rigidity throttles the acquisition of new ones. We introduce the Einstellung Rigidity Index (ERI), a compact diagnostic that disentangles genuine transfer from cue-inflated performance using three interpretable facets: (i) Adaptation Delay (AD), (ii) Performance Deficit (PD), and (iii) Relative Suboptimal Feature Reliance (SFR_rel). On a two-phase CIFAR-100 CL benchmark with a deliberately spurious magenta patch in Phase 2, we evaluate Naive fine-tuning (SGD), online Elastic Weight Consolidation (EWC_on), Dark Experience Replay (DER++), Gradient Projection Memory (GPM), and Deep Generative Replay (DGR). Across these continual learning methods, we observe that CL methods reach accuracy thresholds earlier than a Scratch-T2 baseline (negative AD) but achieve slightly lower final accuracy on patched shortcut classes (positive PD). Masking the patch improves accuracy for CL methods while slightly reducing Scratch-T2, yielding negative SFR_rel. This pattern indicates the patch acted as a distractor for CL models in this setting rather than a helpful shortcut.
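Abstract (reference translation): Deep neural networks often exploit shortcut features, defined as incidental correlations between inputs and labels that carry no causal meaning. Shortcut features undermine robustness and reduce reliability under distribution shifts. In continual learning (CL), the consequences of shortcut exploitation can persist and intensify: weights inherited from earlier tasks bias representation reuse toward whichever features most easily satisfied the earlier labels, mirroring the cognitive Einstellung effect, a phenomenon in which past habits block optimal solutions. In contrast to catastrophic forgetting, which erodes past skills, shortcut-induced rigidity hinders the acquisition of new ones. The Einstellung Rigidity Index (ERI) is a compact diagnostic that separates genuine transfer from cue-inflated performance using three interpretable facets: (i) Adaptation Delay (AD), (ii) Performance Deficit (PD), and (iii) Relative Suboptimal Feature Reliance (SFR_rel). On a two-phase CIFAR-100 CL benchmark with an intentionally spurious magenta patch in Phase 2, the authors evaluate Naive fine-tuning (SGD), online Elastic Weight Consolidation (EWC_on), Dark Experience Replay (DER++), Gradient Projection Memory (GPM), and Deep Generative Replay (DGR). Across these continual learning methods, CL methods reach accuracy thresholds earlier than the Scratch-T2 baseline (negative AD) but achieve slightly lower final accuracy on the patched shortcut classes (positive PD). Masking the patch improves accuracy for CL methods while slightly reducing it for Scratch-T2, yielding negative SFR_rel. This pattern indicates that, in this setting, the patch acted as a distractor for CL models rather than as a helpful shortcut.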
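The abstract names the three ERI facets but does not give their formulas. The sketch below is one possible reading, chosen only so that the signs agree with the reported pattern (negative AD when the CL method reaches the threshold earlier, positive PD when its final accuracy is lower, negative SFR_rel when masking the patch helps the CL method more than Scratch-T2). The function names, the threshold parameter `tau`, and the toy accuracy values are all assumptions made for illustration; they are not the authors' definitions or results.

```python
from typing import Optional, Sequence


def epochs_to_threshold(acc_curve: Sequence[float], tau: float) -> Optional[int]:
    """Return the first 1-based epoch whose accuracy is >= tau, or None."""
    for epoch, acc in enumerate(acc_curve, start=1):
        if acc >= tau:
            return epoch
    return None


def adaptation_delay(cl_curve: Sequence[float],
                     scratch_curve: Sequence[float],
                     tau: float) -> Optional[int]:
    """Assumed AD: epochs-to-threshold of the CL method minus that of Scratch-T2.

    Negative values mean the CL method crossed the threshold earlier.
    Returns None if either curve never reaches tau.
    """
    e_cl = epochs_to_threshold(cl_curve, tau)
    e_scratch = epochs_to_threshold(scratch_curve, tau)
    if e_cl is None or e_scratch is None:
        return None
    return e_cl - e_scratch


def performance_deficit(cl_final: float, scratch_final: float) -> float:
    """Assumed PD: Scratch-T2 final accuracy minus CL final accuracy."""
    return scratch_final - cl_final


def sfr_rel(cl_patched: float, cl_masked: float,
            scratch_patched: float, scratch_masked: float) -> float:
    """Assumed SFR_rel: patch reliance of the CL method relative to Scratch-T2.

    Reliance is taken as (accuracy with patch) - (accuracy with patch masked).
    """
    return (cl_patched - cl_masked) - (scratch_patched - scratch_masked)


# Toy accuracy curves on the patched shortcut classes (synthetic, not from the paper).
cl_curve = [0.30, 0.55, 0.62, 0.64]
scratch_curve = [0.10, 0.35, 0.58, 0.66]

ad = adaptation_delay(cl_curve, scratch_curve, tau=0.5)
pd = performance_deficit(cl_curve[-1], scratch_curve[-1])
sfr = sfr_rel(cl_patched=0.64, cl_masked=0.67,
              scratch_patched=0.66, scratch_masked=0.65)
print(f"AD={ad}, PD={pd:.2f}, SFR_rel={sfr:.2f}")
```

With these toy values the CL curve crosses the threshold one epoch earlier, ends slightly lower, and gains accuracy when the patch is masked, which reproduces the sign pattern described in the abstract without saying anything about the magnitudes the authors measured.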


Related papers