Fugu-MT Paper Translation (Summary): DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing

Paper summary: DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing

arxiv url: http://arxiv.org/abs/2604.07965v1
Date: Thu, 09 Apr 2026 08:25:54 GMT
Status: Translation complete
Last updated in system: 2026-04-10 18:34:05.802058
Title: DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing
Title (reference translation): DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing
Authors: Gyanendra Das, Sai Satyam Jena
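Abstract summary: Lifelong editing is a challenging task that tends to disrupt previously learned concepts. Current methods control edits algorithmically through optimization rather than structurally separating knowledge. This paper proposes Dynamic Subspace Concept Alignment (DSCA), which mitigates this limitation. The method achieves 98% single-edit success, stays above 95% after 1000 sequential edits, lowers hallucination by 3 to 5 percent, and obtains the best backward transfer (BWT) scores on continual instruction tuning benchmarks.
Reference score (independently computed attention metric): 1.6830191160943109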
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model editing aims to update knowledge to add new concepts and change relevant information without retraining. Lifelong editing is a challenging task, prone to disrupting previously learned concepts, especially for Vision Language Models (VLMs), because sequential edits can lead to degraded reasoning and cross-modal misalignment. Existing VLM knowledge editing methods based on gated adapters, activation edits, and parameter merging techniques address the catastrophic forgetting seen in full fine-tuning; however, they still operate in the shared representation space of the VLM, where concepts are entangled, so edits interfere with other, unrelated concepts. We hypothesize that this instability persists because current methods algorithmically control edits via optimization rather than structurally separating knowledge. We introduce Dynamic Subspace Concept Alignment (DSCA), which by design mitigates this limitation by decomposing the representation space into a set of orthogonal semantic subspaces and proposing edits only in those transformed spaces. These subspaces are obtained through incremental clustering and PCA on joint vision language representations. This process structurally isolates concepts, enabling precise, non-interfering edits by turning isolation from a soft training objective into an architectural property. The surgical edits are guided by a multi-term loss function for maintaining task fidelity, edit locality, and cross-modal alignment. With the base model frozen, our method achieves 98 percent single-edit success, remains over 95 percent after 1000 sequential edits, lowers hallucination by 3 to 5 percent, and achieves the best backward transfer (BWT) scores on continual instruction tuning benchmarks. Extensive experiments demonstrate DSCA's state-of-the-art stability and knowledge retention capability in continual lifelong editing across various datasets and benchmarks.
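The structural-isolation claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract says the subspaces come from incremental clustering and PCA, whereas here two mutually orthogonal subspaces are simply built by Gram-Schmidt from hand-picked vectors (an assumption made for brevity). All names (`gram_schmidt`, `project`, `edit_in_subspace`, `subspace_a`, `subspace_b`) are hypothetical. The sketch only shows that an update confined to one subspace leaves the coordinates in an orthogonal subspace unchanged.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis spanning the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def project(h, basis):
    """Coordinates of h in the subspace spanned by an orthonormal basis."""
    return [dot(h, b) for b in basis]


def edit_in_subspace(h, basis, delta):
    """Return h plus an update given in subspace coordinates (h is not mutated)."""
    out = list(h)
    for d, b in zip(delta, basis):
        out = [o + d * bi for o, bi in zip(out, b)]
    return out


# Assumption: stand-in directions; a real system would derive them from data.
raw = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
full = gram_schmidt(raw)
subspace_a, subspace_b = full[:2], full[2:]  # two orthogonal "concept" subspaces

h = [0.5, -1.0, 2.0, 0.25]  # illustrative hidden representation
h_edited = edit_in_subspace(h, subspace_a, [0.3, -0.7])

before_b = project(h, subspace_b)
after_b = project(h_edited, subspace_b)
shift_a = [x - y for x, y in zip(project(h_edited, subspace_a), project(h, subspace_a))]

print("change seen by concept B:", [x - y for x, y in zip(after_b, before_b)])
print("change seen by concept A:", shift_a)
```

Because the two bases are orthogonal, the edit shifts the concept A coordinates by the requested amount while the concept B coordinates change only by floating-point noise. This is the sense in which isolation becomes a property of the decomposition instead of a penalty term.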
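Abstract (reference translation): Model editing aims to update knowledge, adding new concepts and changing relevant information without retraining. Lifelong editing is a challenging task that tends to disrupt previously learned concepts, especially for vision language models (VLMs), because sequential edits can lead to degraded reasoning and cross-modal misalignment. Existing VLM knowledge editing methods based on gated adapters, activation edits, and parameter merging address the catastrophic forgetting seen in full fine-tuning, but in the VLM's shared representation space, where concepts are entangled, edits interfere with other, unrelated concepts. We hypothesize that this instability persists because current methods control edits algorithmically through optimization rather than structurally separating knowledge. We introduce Dynamic Subspace Concept Alignment (DSCA), which mitigates this limitation by decomposing the representation space into a set of orthogonal semantic subspaces and proposing edits only in those transformed spaces. These subspaces are obtained through incremental clustering and PCA on joint vision language representations. This process structurally isolates concepts and enables precise, non-interfering edits by turning isolation from a soft training objective into an architectural property. The surgical edits are guided by a multi-term loss function that maintains task fidelity, edit locality, and cross-modal alignment. With the base model frozen, the method achieves 98% single-edit success, stays above 95% after 1000 sequential edits, lowers the hallucination rate by 3 to 5 percent, and obtains the best backward transfer (BWT) scores on continual instruction tuning benchmarks. Extensive experiments demonstrate DSCA's stability and knowledge retention in continual lifelong editing across various datasets and benchmarks.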
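The abstract reports backward transfer (BWT) without defining it. A minimal sketch follows, assuming the definition commonly used in the continual-learning literature: the average, over all tasks except the last, of the final accuracy on a task minus the accuracy measured right after that task was learned. Whether the paper uses exactly this formula is an assumption. The function name `backward_transfer` and the accuracy matrix are hypothetical, and the numbers are made up for illustration, not results from the paper.

```python
def backward_transfer(acc):
    """BWT from a matrix where acc[i][j] is the accuracy on task j
    measured after training through task i (entries with j > i are unused)."""
    t = len(acc)
    if t < 2:
        raise ValueError("BWT needs at least two tasks")
    # Compare final accuracy on each earlier task with its accuracy when first learned.
    return sum(acc[t - 1][j] - acc[j][j] for j in range(t - 1)) / (t - 1)


# Illustrative values only (not taken from the paper).
acc = [
    [0.90, 0.00, 0.00],
    [0.85, 0.80, 0.00],
    [0.80, 0.75, 0.70],
]

bwt = backward_transfer(acc)
print(f"BWT = {bwt:.3f}")  # a negative value indicates forgetting
```

Under this definition a BWT near zero means earlier tasks were retained, and a positive value means later training improved them.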

Related papers