Fugu-MT Paper Translation (Abstract): Vision Transformers that Never Stop Learning

Paper abstract: Vision Transformers that Never Stop Learning

arxiv url: http://arxiv.org/abs/2603.07787v1
Date: Sun, 08 Mar 2026 20:07:43 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:15.215976
Title: Vision Transformers that Never Stop Learning
Title (reference translation): Vision Transformers That Do Not Stop Learning
Authors: Caihao Sun, Mingqi Yuan, Shiyuan Wang, Jiayu Chen
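Abstract summary: This paper presents a systematic study of loss of plasticity in Vision Transformers (ViTs). The analysis shows that stacked attention modules exhibit instability that exacerbates plasticity loss, while feed-forward network modules degrade even more markedly. The paper proposes ARROW, a geometry-aware optimizer that preserves plasticity by adaptively reshaping gradient directions using an online curvature estimate for the attention module.
Reference score (attention measure computed independently by this site): 13.804234595369058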
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Loss of plasticity refers to the progressive inability of a model to adapt to new tasks and poses a fundamental challenge for continual learning. While this phenomenon has been extensively studied in homogeneous neural architectures, such as multilayer perceptrons, its mechanisms in structurally heterogeneous, attention-based models such as Vision Transformers (ViTs) remain underexplored. In this work, we present a systematic investigation of loss of plasticity in ViTs, including a fine-grained diagnosis using local metrics that capture parameter diversity and utilization. Our analysis reveals that stacked attention modules exhibit increasing instability that exacerbates plasticity loss, while feed-forward network modules suffer even more pronounced degradation. Furthermore, we evaluate several approaches for mitigating plasticity loss. The results indicate that methods based on parameter re-initialization fail to recover plasticity in ViTs, whereas approaches that explicitly regulate the update process are more effective. Motivated by this insight, we propose ARROW, a geometry-aware optimizer that preserves plasticity by adaptively reshaping gradient directions using an online curvature estimate for the attention module. Extensive experiments show that ARROW effectively improves plasticity and maintains better performance on newly encountered tasks.
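Abstract (reference translation): Loss of plasticity refers to a model's progressive inability to adapt to new tasks, and it poses a fundamental challenge for continual learning. The phenomenon has been studied extensively in homogeneous neural architectures such as multilayer perceptrons, but its mechanisms in structurally heterogeneous, attention-based models such as Vision Transformers (ViTs) remain unexplained. This work presents a systematic study of plasticity loss in ViTs, including a fine-grained diagnosis that uses local metrics measuring parameter diversity and utilization. The analysis shows that stacked attention modules exhibit growing instability that worsens plasticity loss, while feed-forward network modules degrade even more markedly. Several methods for mitigating plasticity loss are also evaluated. The results suggest that methods based on parameter re-initialization fail to restore plasticity in ViTs, whereas approaches that explicitly regulate the update process are more effective. Based on this finding, the authors propose ARROW, a geometry-aware optimizer that preserves plasticity by adaptively reshaping gradient directions using an online curvature estimate for the attention module. Extensive experiments show that ARROW effectively improves plasticity and maintains better performance on newly encountered tasks.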
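
The abstract does not say how ARROW estimates curvature or reshapes gradients, so the following is only a generic sketch of the broader idea of regulating the update with an online curvature proxy. It is not ARROW. The function names (`precondition_step`, `quadratic_grad`), the hyperparameters, and the choice of an exponential moving average of squared gradients as the curvature proxy are all assumptions made for illustration.

```python
import math


def precondition_step(params, grads, curvature, lr=0.1, beta=0.9, eps=1e-8):
    """One update that rescales each gradient coordinate by a running curvature proxy.

    Hypothetical illustration only: the squared-gradient moving average stands
    in for an "online curvature estimate"; it is NOT the estimator used by ARROW.
    """
    new_params, new_curvature = [], []
    for p, g, c in zip(params, grads, curvature):
        # Online (exponential moving average) curvature proxy per coordinate.
        c = beta * c + (1.0 - beta) * g * g
        # Reshaped direction: steep coordinates are damped, flat ones boosted.
        step = g / (math.sqrt(c) + eps)
        new_params.append(p - lr * step)
        new_curvature.append(c)
    return new_params, new_curvature


def quadratic_grad(params, scales):
    """Gradient of the toy loss 0.5 * sum(s_i * x_i ** 2)."""
    return [s * x for s, x in zip(scales, params)]


# Usage: a badly conditioned toy problem (assumed values, not from the paper).
scales = [100.0, 1.0]
params = [1.0, 1.0]
curvature = [0.0, 0.0]
for _ in range(50):
    grads = quadratic_grad(params, scales)
    params, curvature = precondition_step(params, grads, curvature)
```

On the first step of this toy problem the raw gradient is dominated by the steep coordinate, while the rescaled step moves both coordinates by nearly the same amount. That is the sense in which a curvature estimate can change the update direction and not only its length.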

Related papers