Fugu-MT paper translation (abstract): Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization

Paper overview: Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization

arxiv url: http://arxiv.org/abs/2603.12933v1
Date: Fri, 13 Mar 2026 12:26:05 GMT
Status: translation complete
Last updated in system: 2026-03-16 17:38:12.078929
Title: Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization
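Title (reference translation): Efficient Multi-Agent LLM Routing via Ant Colony Optimization
Authors: Xudong Wang, Chaoning Zhang, Jiaquan Zhang, Chenghao Li, Qigan Sun, Sung-Ho Bae, Peng Wang, Ning Xie, Jie Zou, Yang Yang, Hengtao Shen
Abstract summary: We propose AMRO-S, an efficient and interpretable routing framework for multi-agent systems (MAS). AMRO-S models MAS routing as a semantic-conditioned path selection problem and improves routing performance through three key mechanisms. Extensive experiments on five public benchmarks and high-concurrency stress tests show that AMRO-S consistently improves the quality-cost trade-off over strong routing baselines.
Reference score (attention score computed independently by this site): 58.59491516762626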
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Model (LLM)-driven Multi-Agent Systems (MAS) have demonstrated strong capability in complex reasoning and tool use, and heterogeneous agent pools further broaden the quality-cost trade-off space. Despite these advances, real-world deployment is often constrained by high inference cost, latency, and limited transparency, which hinders scalable and efficient routing. Existing routing strategies typically rely on expensive LLM-based selectors or static policies, and offer limited controllability for semantic-aware routing under dynamic loads and mixed intents, often resulting in unstable performance and inefficient resource utilization. To address these limitations, we propose AMRO-S, an efficient and interpretable routing framework for Multi-Agent Systems (MAS). AMRO-S models MAS routing as a semantic-conditioned path selection problem, enhancing routing performance through three key mechanisms: First, it leverages a supervised fine-tuned (SFT) small language model for intent inference, providing a low-overhead semantic interface for each query; second, it decomposes routing memory into task-specific pheromone specialists, reducing cross-task interference and optimizing path selection under mixed workloads; finally, it employs a quality-gated asynchronous update mechanism to decouple inference from learning, optimizing routing without increasing latency. Extensive experiments on five public benchmarks and high-concurrency stress tests demonstrate that AMRO-S consistently improves the quality-cost trade-off over strong routing baselines, while providing traceable routing evidence through structured pheromone patterns.
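Abstract (reference translation): Multi-agent systems (MAS) driven by large language models (LLMs) show strong capability in complex reasoning and tool use, and heterogeneous agent pools further widen the quality-cost trade-off space. Despite these advances, real-world deployment is often constrained by high inference cost, latency, and limited transparency, which hinders scalable and efficient routing. Existing routing strategies usually depend on expensive LLM-based selectors or static policies and offer limited controllability for semantic-aware routing under dynamic loads and mixed intents, which often leads to unstable performance and inefficient resource utilization. To address these limitations, we propose AMRO-S, an efficient and interpretable routing framework for multi-agent systems (MAS). AMRO-S models MAS routing as a semantic-conditioned path selection problem and strengthens routing performance through three key mechanisms: first, it uses a supervised fine-tuned (SFT) small language model for intent inference, providing a low-overhead semantic interface for each query; second, it decomposes routing memory into task-specific pheromone specialists, reducing cross-task interference and optimizing path selection under mixed workloads; finally, it uses a quality-gated asynchronous update mechanism to decouple inference from learning, optimizing routing without increasing latency. Extensive experiments on five public benchmarks and high-concurrency stress tests show that AMRO-S consistently improves the quality-cost trade-off over strong routing baselines while providing traceable routing evidence through structured pheromone patterns.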
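
The abstract names the mechanisms but gives no formulas, so the following is only a minimal sketch of how task-specific pheromone tables and a quality-gated, deferred update could fit together. It is not the authors' AMRO-S implementation. Everything in it is an assumption made for illustration: the class name `PheromoneRouter`, the parameters `evaporation` and `quality_gate`, the uniform initial pheromone of 1.0, and the classic ant colony update (multiplicative evaporation followed by an additive deposit). The SFT intent model is replaced by a task label passed in by the caller.

```python
import random
from collections import defaultdict


class PheromoneRouter:
    """Toy router with one pheromone table per task type (hypothetical design)."""

    def __init__(self, agents, evaporation=0.1, quality_gate=0.5, seed=0):
        self.agents = list(agents)
        self.evaporation = evaporation    # assumed evaporation rate
        self.quality_gate = quality_gate  # assumed minimum quality for reinforcement
        self.rng = random.Random(seed)
        # task -> agent -> pheromone level, created uniformly at 1.0 on first use
        self.pheromone = defaultdict(lambda: {a: 1.0 for a in self.agents})
        self.pending = []                 # feedback queued for a later update

    def route(self, task):
        """Sample an agent with probability proportional to its pheromone."""
        table = self.pheromone[task]
        weights = [table[a] for a in self.agents]
        return self.rng.choices(self.agents, weights=weights, k=1)[0]

    def report(self, task, agent, quality):
        """Queue feedback only, so the request path does not wait for learning."""
        self.pending.append((task, agent, quality))

    def flush(self):
        """Apply queued feedback and return how many samples passed the gate."""
        applied = 0
        for task, agent, quality in self.pending:
            if quality < self.quality_gate:
                continue                  # gate: low-quality outcomes are dropped
            table = self.pheromone[task]
            for a in table:
                table[a] *= 1.0 - self.evaporation  # evaporate this task's table
            table[agent] += quality                 # deposit on the chosen agent
            applied += 1
        self.pending.clear()
        return applied


router = PheromoneRouter(["small", "large"])
router.report("math", "large", 0.9)
router.report("math", "small", 0.2)  # below the gate, so it is ignored
router.flush()
chosen = router.route("math")        # "large" is now the more likely choice
```

Keeping a separate table per task label means feedback from one workload never changes the routing weights of another, which is one plausible reading of "reducing cross-task interference". In a real deployment `flush` would run on a background worker rather than being called inline.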


Related papers