Fugu-MT Paper Translation (Abstract): Black-box Model Merging for Language-Model-as-a-Service with Massive Model Repositories

Paper summary: Black-box Model Merging for Language-Model-as-a-Service with Massive Model Repositories

arxiv url: http://arxiv.org/abs/2509.12951v1
Date: Tue, 16 Sep 2025 10:55:50 GMT
Status: Translation complete
System update date: 2025-09-17 17:50:53.045298
Title: Black-box Model Merging for Language-Model-as-a-Service with Massive Model Repositories
Title (reference translation): Black-box Model Merging for Language-Model-as-a-Service Using Large-Scale Model Repositories
Authors: Shilian Chen, Jie Zhou, Tianyu Huai, Yujiang Lu, Junsong Li, Bihao Zhan, Qianjun Pan, Yutao Yang, Xin Li, Qin Chen, Hang Yan, Liang He
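Abstract summary: The paper proposes a derivative-free optimization framework based on an evolutionary algorithm (Evo-Merging). The method consists of two key components: (1) sparsity-based denoising, which identifies and filters out irrelevant or redundant information across models, and (2) sign-aware scaling, which dynamically computes optimal combination weights for the relevant models. The method achieves state-of-the-art results on a range of tasks, significantly outperforming existing strong baselines.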
Reference score (independently computed attention score): 21.899117703417517
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Model merging refers to the process of integrating multiple distinct models into a unified model that preserves and combines the strengths and capabilities of the individual models. Most existing approaches rely on task vectors to combine models, typically under the assumption that model parameters are accessible. However, for extremely large language models (LLMs) such as GPT-4, which are often provided solely as black-box services through API interfaces (Language-Model-as-a-Service), model weights are not available to end users. This presents a significant challenge, which we refer to as black-box model merging (BMM) with massive LLMs. To address this challenge, we propose a derivative-free optimization framework based on the evolutionary algorithm (Evo-Merging) that enables effective model merging using only inference-time API queries. Our method consists of two key components: (1) sparsity-based denoising, designed to identify and filter out irrelevant or redundant information across models, and (2) sign-aware scaling, which dynamically computes optimal combination weights for the relevant models based on their performance. We also provide a formal justification, along with a theoretical analysis, for our asymmetric sparsification. Extensive experimental evaluations demonstrate that our approach achieves state-of-the-art results on a range of tasks, significantly outperforming existing strong baselines.
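Abstract (reference translation): Model merging refers to the process of integrating multiple distinct models into a unified model that preserves and combines the strengths and capabilities of the individual models. Most existing approaches rely on task vectors to combine models, under the assumption that model parameters are accessible. However, for very large language models (LLMs) such as GPT-4, which are provided only as black-box services through APIs (Language-Model-as-a-Service), model weights are not available to end users. This poses a significant challenge, referred to as black-box model merging (BMM) with massive LLMs. To address this challenge, the authors propose a derivative-free optimization framework based on an evolutionary algorithm (Evo-Merging) that enables effective model merging using only inference-time API queries. The method consists of two key components: (1) sparsity-based denoising, which identifies and filters out irrelevant or redundant information across models, and (2) sign-aware scaling, which dynamically computes optimal combination weights for the relevant models. The authors also provide a formal justification and a theoretical analysis for their asymmetric sparsification. Extensive experiments show that the method achieves state-of-the-art results on a range of tasks, significantly outperforming existing strong baselines.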
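The abstract describes a derivative-free search that uses only black-box queries, but gives no algorithmic details. The following is a minimal sketch of that general idea, not the authors' Evo-Merging implementation. It assumes a hypothetical `score` callable that stands in for the inference-time API queries (it takes per-model combination weights and returns a validation score), and it uses a simple top-k magnitude filter as a stand-in for sparsity-based denoising. Sign-aware scaling is not modeled beyond allowing negative weights. All names and parameters here are illustrative assumptions.

```python
import random


def sparsify(weights, keep):
    """Zero all but the `keep` largest-magnitude weights.

    Assumption: a stand-in for denoising, not the paper's operator.
    """
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def evolve_weights(score, n_models, keep, generations=100, pop_size=20, sigma=0.2, seed=0):
    """(1+lambda) evolution strategy that only calls `score` (no gradients, no model weights)."""
    rng = random.Random(seed)
    parent = sparsify([1.0 / n_models] * n_models, keep)
    parent_score = score(parent)
    for _ in range(generations):
        # Mutate the parent with Gaussian noise, then re-apply the sparsity filter.
        children = [
            sparsify([w + rng.gauss(0.0, sigma) for w in parent], keep)
            for _ in range(pop_size)
        ]
        top_score, top = max(((score(c), c) for c in children), key=lambda sc: sc[0])
        # Elitist selection: replace the parent only if a child scores strictly higher.
        if top_score > parent_score:
            parent, parent_score = top, top_score
    return parent, parent_score


def toy_score(weights):
    """Hypothetical black box: higher is better, peak at a hidden target."""
    target = [0.7, 0.0, -0.3, 0.0]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best, best_score = evolve_weights(toy_score, n_models=4, keep=2)
```

In a real Language-Model-as-a-Service setting, `toy_score` would be replaced by a function that evaluates a candidate weighting through API calls on held-out data; the search loop itself never needs access to model parameters.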

Related papers