Fugu-MT Paper Translation (Abstract): Modality-Decoupled Online Recursive Editing

Paper summary: Modality-Decoupled Online Recursive Editing

arxiv url: http://arxiv.org/abs/2605.20273v1
Date: Tue, 19 May 2026 03:11:54 GMT
Status: Translation complete
System update date: 2026-05-21 19:19:56.257986
Title: Modality-Decoupled Online Recursive Editing
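Title (reference translation): Modality-decoupled online recursive editing
Authors: Siyuan Li, Youyuan Zhang, Fangming Liu, Jing Li
Abstract summary: We propose M-ORE, a modality-decoupled online editor for lifelong MLLM adaptation. M-ORE is derived from a unified proximal-projection formulation and admits a closed-form update with a Sherman-Morrison recursion. It maintains module-wise locality statistics for the text stack and the visual projector to avoid visually dominated update shaping, and it performs continual updates in a fixed low-rank edit subspace.
Reference score (independently computed attention score): 21.86720581525853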
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Online model editing for multimodal large language models (MLLMs) requires assimilating a stream of corrections under tight compute and memory budgets. Yet editors developed for text-only LLMs often degrade on MLLMs: visually dominant activations skew the statistics that shape updates, causing cross-modal conflict, while sequential writes become entangled in a shared edit space and amplify long-horizon interference, causing inter-edit interference. To address these, we propose M-ORE, a modality-decoupled online recursive editor for lifelong MLLM adaptation. M-ORE is derived from a unified proximal-projection formulation and admits a closed-form update with a Sherman-Morrison recursion, yielding constant per-edit overhead. It maintains module-wise locality statistics for the text stack and the visual projector to avoid visually dominated update shaping and performs continual updates in a fixed orthogonal low-rank edit subspace via a Sherman-Morrison recursion to mitigate long-horizon interference. Experiments on multiple MLLM backbones and online editing benchmarks show that our M-ORE method consistently improves reliability, generality, and locality over strong baselines, while achieving favorable quality-efficiency scaling. Our code is publicly available at https://github.com/lab-klc/M-ORE.
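Abstract (reference translation): Online model editing for multimodal large language models (MLLMs) requires assimilating a stream of corrections under tight compute and memory budgets. Visually dominant activations skew the statistics that shape updates, causing cross-modal conflict, while sequential writes become entangled in a shared edit space and amplify long-horizon interference between edits. We therefore propose M-ORE, a modality-decoupled online recursive editor for lifelong MLLM adaptation. M-ORE is derived from a unified proximal-projection formulation and admits a closed-form update with a Sherman-Morrison recursion, so the per-edit overhead is constant. It maintains module-wise locality statistics for the text stack and the visual projector to avoid visually dominated update shaping, and it performs continual updates in a fixed orthogonal low-rank edit subspace via the Sherman-Morrison recursion to mitigate long-horizon interference. Experiments on multiple MLLM backbones and online editing benchmarks show that M-ORE consistently improves reliability, generality, and locality while achieving favorable quality-efficiency scaling. The code is publicly available at https://github.com/lab-klc/M-ORE.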
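
Illustration (not from the paper): the abstract attributes the constant per-edit overhead to a Sherman-Morrison recursion. The sketch below shows only the generic Sherman-Morrison identity, (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u), applied recursively to a stream of rank-one updates in a small r-dimensional space. The function names, the dimension `r`, the ridge term `lam`, and the random vectors standing in for per-edit statistics are all assumptions made for illustration; M-ORE's actual update rule is not given in the abstract and may differ.

```python
import numpy as np


def sherman_morrison_update(P, u, v):
    """Given P = A^{-1}, return (A + u v^T)^{-1} without re-inverting A.

    Uses only matrix-vector products and one outer product, so the cost
    is O(r^2) for an r x r matrix, independent of how many updates came before.
    """
    Pu = P @ u                      # A^{-1} u
    vP = v @ P                      # v^T A^{-1}
    denom = 1.0 + v @ Pu            # 1 + v^T A^{-1} u
    if abs(denom) < 1e-12:
        raise ValueError("update makes the matrix numerically singular")
    return P - np.outer(Pu, vP) / denom


def demo(r=8, steps=50, lam=1.0, seed=0):
    """Stream `steps` rank-one updates and compare against a direct inverse.

    Hypothetical setup: A starts as lam * I and each step adds k k^T,
    where k is a random vector standing in for a per-edit statistic.
    Returns the largest absolute entry-wise error of the recursive inverse.
    """
    rng = np.random.default_rng(seed)
    A = lam * np.eye(r)
    P = np.eye(r) / lam             # inverse of the initial A
    for _ in range(steps):
        k = rng.standard_normal(r)
        A += np.outer(k, k)         # explicit matrix, kept only for checking
        P = sherman_morrison_update(P, k, k)
    return float(np.max(np.abs(P - np.linalg.inv(A))))


if __name__ == "__main__":
    print("max abs error vs. direct inverse:", demo())
```

Because each step touches only an r x r matrix, the work per update depends on r and not on the number of earlier updates, which is the sense in which a recursion of this kind can give constant per-edit overhead when r is fixed.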
