Fugu-MT Paper Translation (Summary): K-Merge: Online Continual Merging of Adapters for On-device Large Language Models

Paper summary: K-Merge: Online Continual Merging of Adapters for On-device Large Language Models

arxiv url: http://arxiv.org/abs/2510.13537v1
Date: Wed, 15 Oct 2025 13:32:25 GMT
Status: Translation complete
Last updated in system: 2025-10-16 20:13:28.679343
Title: K-Merge: Online Continual Merging of Adapters for On-device Large Language Models
Title (reference translation): K-Merge: Online Continual Merging of Adapters for On-device Large Language Models
Authors: Donald Shenaj, Ondrej Bohdal, Taha Ceritli, Mete Ozay, Pietro Zanuttigh, Umberto Michieli
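Abstract summary: On-device deployment of Large Language Models (LLMs) uses Low-Rank Adapters (LoRAs) to support diverse downstream tasks under tight resource constraints. Recent works have explored model merging techniques that fuse multiple LoRAs into a single one. This paper proposes a data-free and computationally efficient method for selecting and merging LoRAs.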
Reference score (independently computed attention score): 42.53168201980569
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: On-device deployment of Large Language Models (LLMs) frequently leverages Low-Rank Adapters (LoRAs) to support diverse downstream tasks under tight resource constraints. To address the limited storage capacity of mobile devices, recent works have explored model merging techniques to fuse multiple LoRAs into a single one. In practice, however, LoRAs are often delivered incrementally, as users request support for new tasks (e.g., novel problem types or languages). This scenario introduces a new challenge: on-device online continual merging, where the objective is to incorporate new LoRAs while preserving the performance on previously supported tasks. In this paper, we propose a data-free and computationally efficient strategy for selecting and merging LoRAs when a new one becomes available, assuming the device can store only a limited number of adapters. Extensive experiments across real-world tasks demonstrate the superiority of our approach compared to alternative strategies while adhering to the storage budget and compute limitations of on-device settings.
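Abstract (reference translation): On-device deployment of Large Language Models (LLMs) frequently uses Low-Rank Adapters (LoRAs) to support diverse downstream tasks under tight resource constraints. To address the limited storage capacity of mobile devices, recent works have explored model merging techniques that fuse multiple LoRAs into a single one. In practice, however, LoRAs are often delivered incrementally, as users request support for new tasks (such as new problem types or languages). This scenario introduces a new challenge: on-device online continual merging. This paper proposes a data-free and computationally efficient method for selecting and merging LoRAs. Extensive experiments across real-world tasks demonstrate the superiority of the approach over alternative strategies while adhering to the storage budget and compute limitations of on-device settings.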
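
The setting described in the abstract (adapters arrive one at a time, and the device can hold only a limited number of them) can be illustrated with a minimal sketch. This is not the selection and merging rule from the paper, which the abstract does not specify. It assumes, purely for illustration, that each adapter is a flat list of floats, that the merge target is the stored adapter with the highest cosine similarity, and that merging is a running average. The names `AdapterStore`, `Slot`, `budget`, and `cosine` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Slot:
    weights: list  # flattened adapter update (hypothetical representation)
    count: int     # number of adapters merged into this slot so far
    tasks: list    # task names this slot is expected to serve


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class AdapterStore:
    """Keeps at most `budget` adapters; needs no task data, only the weights."""

    def __init__(self, budget):
        if budget < 1:
            raise ValueError("budget must be at least 1")
        self.budget = budget
        self.slots = []

    def add(self, task, weights):
        """Store a new adapter, or merge it into the most similar slot when full.

        Returns the index of the slot that now serves `task`.
        """
        if len(self.slots) < self.budget:
            self.slots.append(Slot(list(weights), 1, [task]))
            return len(self.slots) - 1
        # Assumption: pick the stored adapter with the highest cosine similarity.
        idx = max(range(len(self.slots)),
                  key=lambda i: cosine(self.slots[i].weights, weights))
        slot = self.slots[idx]
        n = slot.count
        # Assumption: running average, so every merged adapter has equal weight.
        slot.weights = [(n * w + x) / (n + 1) for w, x in zip(slot.weights, weights)]
        slot.count = n + 1
        slot.tasks.append(task)
        return idx


store = AdapterStore(budget=2)
store.add("math", [1.0, 0.0])
store.add("translation", [0.0, 1.0])
merged_into = store.add("algebra", [0.8, 0.2])  # store is full, so this merges
```

The running average keeps the per-arrival cost linear in the adapter size and requires no training data, which matches the data-free, low-compute constraints stated in the abstract. Other choices of similarity measure or merge operator would fit the same interface.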

Related papers