Fugu-MT paper translation (abstract): Unlocking the Potential of Continual Model Merging: An ODE Perspective

Paper overview: Unlocking the Potential of Continual Model Merging: An ODE Perspective

arxiv url: http://arxiv.org/abs/2605.19409v1
Date: Tue, 19 May 2026 06:03:20 GMT
Status: translation complete
Last updated in system: 2026-05-20 15:03:09.155556
Title: Unlocking the Potential of Continual Model Merging: An ODE Perspective
Authors: Lihong Lin, Haidong Kang
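Abstract summary: Continual Model Merging (CMM) enables rapid customization of foundation models across sequentially arriving tasks. Existing merging rules lack explicit controllability over the allocation of learning capacity between previously learned capabilities and newly merged models. This paper proposes a novel ODE-driven Merging (ODE-M) method, tailored for CMM, that integrates a time-dependent velocity field.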
Reference score (independently computed attention score): 0.42970700836450487
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Continual Model Merging (CMM) enables rapid customization of foundation models across sequentially arriving tasks, offering a scalable alternative to repeated retraining. However, existing merging rules lack explicit controllability over the allocation of learning capacity between previously learned capabilities and newly merged models. Consequently, as tasks are merged sequentially, this deficiency accumulates into severe forgetting, particularly in scenarios with heterogeneous task importance, where performance allocation becomes highly inconsistent. The key reason can be attributed to the fact that previous methods treat each task model as an isolated parameter point and apply fixed algebraic combinations, rather than explicitly constructing a transition that respects how independently trained models can be connected in parameter space. Motivated by mode connectivity, we assume that desirable merged models lie on low loss connecting paths, and that continual merging should follow such paths without crossing loss barriers that induce forgetting. Grounded in these insights, we propose a novel ODE-driven Merging (ODE-M) tailored for CMM that traces such a path by integrating a time-dependent velocity field and enforcing barrier constraints to prevent loss-increasing steps. Extensive experiments demonstrate that ODE-M achieves state-of-the-art performance compared to its competitors across mainstream CMM benchmarks.
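The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general idea rather than the authors' ODE-M. It assumes a toy two-parameter problem, a velocity field that would trace the straight line between the two models if left unconstrained, and a barrier rule that removes the loss-increasing component of the velocity before backtracking on the step size. The names `ode_merge`, `old_loss`, `old_grad`, `steps`, and `tol` are hypothetical and do not come from the paper.

```python
import numpy as np


def ode_merge(theta_old, theta_new, old_loss, old_grad, steps=100, tol=1e-6):
    """Toy path-following merge (illustrative only, not the paper's ODE-M).

    Euler-integrates d(theta)/dt = (theta_new - theta) / (1 - t) on [0, 1].
    Left unconstrained, this velocity field traces the straight line from
    theta_old to theta_new. The barrier rule (an assumption made here) removes
    the velocity component that climbs old_loss, then halves the step until
    old_loss rises by at most tol.
    """
    theta = np.asarray(theta_old, dtype=float).copy()
    target = np.asarray(theta_new, dtype=float)
    dt = 1.0 / steps
    for k in range(steps):
        time_left = (steps - k) * dt          # equals 1 - t at t = k * dt
        v = (target - theta) / time_left      # time-dependent velocity field
        g = old_grad(theta)
        slope = float(v @ g)                  # first-order change in old_loss
        if slope > 0.0:
            v = v - (slope / float(g @ g)) * g  # project out the uphill part
        step = dt * v
        base = old_loss(theta)
        for _ in range(30):                   # backtracking barrier check
            if old_loss(theta + step) <= base + tol:
                theta = theta + step
                break
            step = 0.5 * step                 # step is skipped if never accepted
    return theta


# Hypothetical toy problem: the old task only constrains coordinate 0.
def old_loss(th):
    return float(th[0] ** 2)


def old_grad(th):
    return np.array([2.0 * th[0], 0.0])


theta_old = np.array([0.0, 0.0])   # minimiser of the old-task loss
theta_new = np.array([1.0, 1.0])   # parameters tuned for the new task
merged = ode_merge(theta_old, theta_new, old_loss, old_grad)
```

In this toy setting, `merged` moves to the new model's value along coordinate 1, which the old task does not constrain, while coordinate 0 stays near the old optimum. Jumping straight to `theta_new` would instead raise the old-task loss to 1. A fixed algebraic rule such as averaging the two parameter vectors has no comparable mechanism for refusing loss-increasing moves.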
