Fugu-MT Paper Translation (Abstract): Towards Direct Latent-Space Synthesis for Parallel Branches in LLM-Agent Workflows

Paper overview: Towards Direct Latent-Space Synthesis for Parallel Branches in LLM-Agent Workflows

arXiv URL: http://arxiv.org/abs/2606.14672v1
Date: Fri, 12 Jun 2026 17:39:29 GMT
Status: Translation complete
Last updated in system: 2026-06-15 16:00:43.016089
Title: Towards Direct Latent-Space Synthesis for Parallel Branches in LLM-Agent Workflows
Authors: Shikun Liu, Mufei Li, Dongqi Fu, Haoyu Wang, Yinglong Xia, Hong Li, Hong Yan, Pan Li
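Abstract summary: Large language models increasingly serve as execution engines for agentic systems, yet they still consume context through a sequential text interface. This work introduces Parallel-Synthesis, a plug-and-play framework that enables a synthesizer to directly consume the KV caches produced by parallel worker agents. Parallel-Synthesis is trained using data that exposes the synthesizer to parallel cache contexts, teaches aggregation across cached branches, and distills reasoning behavior from standard text-concatenation-based synthesis.
Reference score (independently computed attention metric): 32.86656152626106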
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models increasingly serve as execution engines for agentic systems, yet they still consume context through a sequential text interface. This creates a mismatch with modern structured agent workflows, in which independent branches explore subtasks, retrieve evidence, or generate candidate solutions before a final synthesis step. Existing systems typically merge these branches by concatenating their textual outputs, which discards the parallel structure and incurs redundant prefill computation. In this work, we introduce Parallel-Synthesis, a plug-and-play framework that enables a synthesizer to directly consume the KV caches produced by parallel worker agents. Parallel-Synthesis combines a cache mapper that calibrates independently generated branch caches with a fine-tuned synthesizer adapter that enables generation from this non-sequential cache interface. We train Parallel-Synthesis using data that exposes the synthesizer to parallel cache contexts, teaches aggregation across cached branches, and distills reasoning behavior from standard text-concatenation-based synthesis. Across nine downstream datasets spanning math, science QA, code generation, GAIA, and multi-agent database diagnosis, Parallel-Synthesis matches or outperforms text-based synthesis on seven datasets and remains close on the other two. It also reduces time-to-first-token by 2.5x-11x, suggesting that direct cache-based synthesis is a promising interface for more native and efficient synthesis over parallel agent branches.
