Fugu-MT Paper Translation (Abstract): RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents

Paper overview: RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents

arxiv url: http://arxiv.org/abs/2605.18805v1
Date: Mon, 11 May 2026 18:55:32 GMT
Status: Translation complete
Last updated in system: 2026-05-20 21:37:32.344557
Title: RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents
Authors: Imad Aouali, Flavian Vasile, Otmane Sakhi, Alexandre Gilotte, Benjamin Heymann,
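Abstract summary: Recommendation Atlas is a benchmark for evaluating shopping agents with behavior-grounded metrics. RecoAtlas exhibits key properties of a meaningful benchmark for agentic systems.
Reference score (independently computed attention score): 44.66462874971054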
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM recommendation agents increasingly produce structured recommendation reports: sets of items accompanied by natural-language justifications. Yet existing evaluations often reduce this setting to reranking small shortlisted candidate sets or judge reports mainly by semantic plausibility. We introduce Recommendation Atlas (Agentic Tool-Level Assessment for Shopping), or RecoAtlas, a benchmark and toolkit for evaluating shopping agents with behavior-grounded metrics. RecoAtlas complements held-out interaction metrics with learned utility proxies for relevance, complementarity, and diversity derived from interaction data, while separately measuring semantic coherence and explanation quality. Its controlled tool environment exposes agents to either semantic, behavior-aligned, or faulty tools, enabling diagnosis of whether performance gains arise from stronger reasoning, better signals, or more effective tool-use policies. Across controlled experiments, we show that RecoAtlas exhibits key properties of a meaningful benchmark for agentic systems: performance scales with model capacity and test-time compute, improves with stronger and better-aligned tools, degrades under noisy or misaligned signals, and reveals that semantic plausibility does not necessarily capture behavior-grounded utility. RecoAtlas provides a foundation for developing and evaluating shopping assistants that optimize not only for plausible recommendations, but also for coherent, behaviorally grounded recommendation sets.
