Fugu-MT Paper Translation (Abstract): DC-Merge: Improving Model Merging with Directional Consistency

Paper overview: DC-Merge: Improving Model Merging with Directional Consistency

arxiv url: http://arxiv.org/abs/2603.06242v1
Date: Fri, 06 Mar 2026 13:04:25 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:45.776219
Title: DC-Merge: Improving Model Merging with Directional Consistency
Title (reference translation): DC-Merge: Improving Model Merging with Directional Consistency
Authors: Han-Chen Zhang, Zi-Hao Zhou, Mao-Lin Luo, Shimin Di, Min-Ling Zhang, Tong Wei
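Abstract summary: DC-Merge is a method for directional-consistent model merging. It balances the energy distribution of each task vector by smoothing its singular values. In experiments on vision and vision-language benchmarks, DC-Merge consistently achieves state-of-the-art performance.
Reference score (independently computed attention score): 62.02490833158024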
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model merging aims to integrate multiple task-adapted models into a unified model that preserves the knowledge of each task. In this paper, we identify that the key to this knowledge retention lies in maintaining the directional consistency of singular spaces between merged multi-task vector and individual task vectors. However, this consistency is frequently compromised by two issues: i) an imbalanced energy distribution within task vectors, where a small fraction of singular values dominate the total energy, leading to the neglect of semantically important but weaker components upon merging, and ii) the geometric inconsistency of task vectors in parameter space, which causes direct merging to distort their underlying directional geometry. To address these challenges, we propose DC-Merge, a method for directional-consistent model merging. It first balances the energy distribution of each task vector by smoothing its singular values, ensuring all knowledge components are adequately represented. These energy-balanced vectors are then projected onto a shared orthogonal subspace to align their directional geometries with minimal reconstruction error. Finally, the aligned vectors are aggregated in the shared orthogonal subspace and projected back to the original parameter space. Extensive experiments on vision and vision-language benchmarks show that DC-Merge consistently achieves state-of-the-art performance in both full fine-tuning and LoRA settings. The implementation code is available at https://github.com/Tobeginwith/DC-Merge.
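Abstract (reference translation): Model merging aims to integrate multiple task-adapted models into a single unified model that preserves the knowledge of each task. This paper identifies that the key to this knowledge retention is maintaining the directional consistency of singular spaces between the merged multi-task vector and the individual task vectors. However, this consistency is often compromised by two issues: (i) an imbalanced energy distribution within task vectors, in which a small fraction of the singular values dominates the total energy, so that semantically important but weaker components are neglected during merging; and (ii) the geometric inconsistency of task vectors in parameter space, which causes direct merging to distort their underlying directional geometry. To address these challenges, the authors propose DC-Merge, a method for directional-consistent model merging. It first balances the energy distribution of each task vector by smoothing its singular values, ensuring that all knowledge components are adequately represented. These energy-balanced vectors are then projected onto a shared orthogonal subspace, aligning their directional geometries with minimal reconstruction error. Finally, the aligned vectors are aggregated in the shared orthogonal subspace and projected back to the original parameter space. Extensive experiments on vision and vision-language benchmarks show that DC-Merge consistently achieves state-of-the-art performance in both full fine-tuning and LoRA settings. The implementation code is available at https://github.com/Tobeginwith/DC-Merge.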
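The three steps named in the abstract (smooth singular values, project onto a shared orthogonal subspace, aggregate and project back) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not say how the smoothing is done or how the shared subspace is built, so the blend-toward-the-mean rule in `smooth_singular_values`, the shared bases taken from an SVD of the stacked singular vectors, the plain sum used for aggregation, and the names `merge_sketch`, `rank`, `alpha`, and `shared_rank` are all assumptions made for illustration. The official code is at the repository linked in the abstract.

```python
import numpy as np


def smooth_singular_values(s, alpha):
    # Hypothetical smoothing rule (assumption, not from the paper):
    # blend each singular value with the mean of all kept values.
    # alpha=0 leaves s unchanged; alpha=1 makes all values equal.
    return (1.0 - alpha) * s + alpha * s.mean()


def merge_sketch(task_vectors, rank, alpha=0.5, shared_rank=None):
    """Merge 2-D task vectors (fine-tuned minus pretrained weights)."""
    balanced, lefts, rights = [], [], []
    for tv in task_vectors:
        # Step 1: truncated SVD, then rebalance the energy across components.
        U, s, Vt = np.linalg.svd(tv, full_matrices=False)
        U, s, Vt = U[:, :rank], s[:rank], Vt[:rank, :]
        balanced.append((U * smooth_singular_values(s, alpha)) @ Vt)
        lefts.append(U)
        rights.append(Vt.T)

    # Step 2: shared orthonormal bases (assumption): leading left singular
    # vectors of the stacked per-task bases. shared_rank=None keeps them all.
    U_sh = np.linalg.svd(np.hstack(lefts), full_matrices=False)[0][:, :shared_rank]
    V_sh = np.linalg.svd(np.hstack(rights), full_matrices=False)[0][:, :shared_rank]

    # Step 3: aggregate in the shared subspace, then map back to
    # the original parameter space.
    core = sum(U_sh.T @ b @ V_sh for b in balanced)
    return U_sh @ core @ V_sh.T
```

With a single task vector, `alpha=0.0`, and full `rank`, the sketch returns the input unchanged, which is a convenient sanity check. A merged model would then be formed by adding the result to the pretrained weights, typically with a scaling coefficient that this sketch leaves out.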


Related papers