Fugu-MT paper translation (abstract): The Price of Thought: A Multilingual Analysis of Reasoning, Performance, and Cost of Negotiation in Large Language Models

Paper summary: The Price of Thought: A Multilingual Analysis of Reasoning, Performance, and Cost of Negotiation in Large Language Models

arxiv url: http://arxiv.org/abs/2510.08098v1
Date: Thu, 09 Oct 2025 11:36:38 GMT
Status: translation complete
Last updated in system: 2025-10-10 17:54:15.042335
Title: The Price of Thought: A Multilingual Analysis of Reasoning, Performance, and Cost of Negotiation in Large Language Models
Title (reference translation): The Price of Thought: A Multilingual Analysis of Reasoning, Performance, and Cost in Large Language Models
Authors: Sherzod Hakimov, Roland Bernard, Tim Leiber, Karl Osswald, Kristina Richert, Ruilin Yang, Raffaella Bernardi, David Schlangen
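Abstract summary: Negotiation is a fundamental challenge for AI agents, as it requires the ability to reason strategically, model opponents, and balance cooperation with competition. The authors conduct the first comprehensive study systematically evaluating the effect of (LLM-)reasoning on the negotiation abilities of commercial and open-weight LLMs. Using a self-play setup across three diverse dialogue games, they analyse trade-offs between performance and cost, the language consistency of reasoning processes, and the nature of the strategic adaptation exhibited by models.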
Reference score (independently computed attention score): 13.796041020333925
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Negotiation is a fundamental challenge for AI agents, as it requires an ability to reason strategically, model opponents, and balance cooperation with competition. We conduct the first comprehensive study systematically evaluating the effect of (LLM-)reasoning on the negotiation abilities of both commercial and open-weight LLMs, and do this across three languages. Using a self-play setup across three diverse dialogue games, we analyse trade-offs between performance and cost, the language consistency of reasoning processes, and the nature of strategic adaptation exhibited by models. Our findings show that enabling reasoning (that is, scaling test-time compute) significantly improves negotiation outcomes by enhancing collaboration and helping models overcome task complexities, but comes at a substantial computational cost: reasoning improves GPT-5's performance by 31.4% while increasing its cost by nearly 400%. Most critically, we uncover a significant multilingual reasoning distinction: open-weight models consistently switch to English for their internal reasoning steps, even when negotiating in German or Italian (and thus possibly impacting potential explainability gains through the disclosure of reasoning traces), while leading commercial models maintain language consistency between their reasoning and final output.
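Abstract (reference translation): Negotiation is a fundamental challenge for AI agents, as it requires the ability to reason strategically, model opponents, and balance cooperation with competition. We conducted the first comprehensive study systematically evaluating the effect of (LLM-)reasoning on the negotiation abilities of commercial and open-weight LLMs, and did so across three languages. Using a self-play setup across three diverse dialogue games, we analyse trade-offs between performance and cost, the language consistency of reasoning processes, and the nature of the strategic adaptation exhibited by models. The results show that scaling up test-time compute significantly improves negotiation outcomes by enhancing collaboration and helping models overcome task complexity, but at a substantial computational cost: reasoning improves GPT-5's performance by 31.4% while increasing its cost by nearly 400%. Open-weight models consistently switch to English for their reasoning, even when negotiating in German or Italian (which may affect potential explainability gains from disclosing reasoning traces), whereas commercial models maintain language consistency between their reasoning and final output.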

List of related papers