Fugu-MT Paper Translation (Abstract): Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

Paper abstract: Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

arxiv url: http://arxiv.org/abs/2506.23276v1
Date: Sun, 29 Jun 2025 15:02:47 GMT
Status: Translation complete
Last updated in system: 2025-07-01 21:27:53.802848
Title: Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
Title (reference translation): Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
Authors: David Guzman Piedrahita, Yongjin Yang, Mrinmaya Sachan, Giorgia Ramponi, Bernhard Schölkopf, Zhijing Jin
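Abstract summary: How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas. Surprisingly, reasoning LLMs, such as the o1 series, struggle significantly with cooperation.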
Reference score (independently computed attention metric): 87.5673042805229
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: As large language models (LLMs) are increasingly deployed as autonomous agents, understanding their cooperation and social mechanisms is becoming increasingly important. In particular, how LLMs balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. In this paper, we examine the challenge of costly sanctioning in multi-agent LLM systems, where an agent must decide whether to invest its own resources to incentivize cooperation or penalize defection. To study this, we adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas over repeated interactions. Our analysis reveals four distinct behavioral patterns among models: some consistently establish and sustain high levels of cooperation, others fluctuate between engagement and disengagement, some gradually decline in cooperative behavior over time, and others rigidly follow fixed strategies regardless of outcomes. Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation, whereas some traditional LLMs consistently achieve high levels of cooperation. These findings suggest that the current approach to improving LLMs, which focuses on enhancing their reasoning capabilities, does not necessarily lead to cooperation, providing valuable insights for deploying LLM agents in environments that require sustained collaboration. Our code is available at https://github.com/davidguzmanp/SanctSim
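Abstract (reference translation): As large language models (LLMs) are increasingly deployed as autonomous agents, understanding their cooperation and social mechanisms is becoming increasingly important. In particular, how LLMs balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. This paper examines the challenge of costly sanctioning in multi-agent LLM systems. To study this, we adapt a public goods game with institutional choice from behavioral economics and observe how different LLMs navigate social dilemmas over repeated interactions. The analysis reveals four distinct behavioral patterns: models that establish and sustain high levels of cooperation, models that fluctuate between engagement and disengagement, models whose cooperative behavior gradually declines over time, and models that rigidly follow fixed strategies regardless of outcomes. Surprisingly, reasoning LLMs, such as the o1 series, struggle significantly with cooperation, whereas some traditional LLMs consistently achieve high levels of cooperation. These results suggest that improving LLMs by enhancing their reasoning capabilities does not necessarily lead to cooperation, providing valuable insights for deploying LLM agents in environments that require sustained collaboration. Our code is available at https://github.com/davidguzmanp/SanctSim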
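To make the costly-sanctioning dilemma in the abstract concrete, here is a minimal sketch of one round of a public goods game with peer punishment. It is not the authors' SanctSim implementation: the function name `public_goods_payoffs`, the endowment, the multiplier, and the sanction cost and impact are all illustrative assumptions, and institutional choice is reduced to whether any punishments are passed in.

```python
# Hypothetical parameters chosen for illustration only (not taken from the paper).
ENDOWMENT = 20.0        # tokens each agent receives at the start of a round
MULTIPLIER = 1.6        # factor applied to the common pool before it is shared
SANCTION_COST = 1.0     # cost to the punisher per sanction token spent
SANCTION_IMPACT = 3.0   # loss to the target per sanction token received


def public_goods_payoffs(contributions, punishments=None):
    """Return each agent's payoff for one round.

    contributions: list of floats, one per agent, each in [0, ENDOWMENT].
    punishments: optional dict mapping (punisher_index, target_index)
        to the number of sanction tokens spent. Passing None models a
        sanction-free institution.
    """
    n = len(contributions)
    # The pool is multiplied and split equally, regardless of who contributed.
    share = MULTIPLIER * sum(contributions) / n
    payoffs = [ENDOWMENT - c + share for c in contributions]
    # Sanctioning is costly: the punisher pays, and the target loses more.
    for (punisher, target), tokens in (punishments or {}).items():
        payoffs[punisher] -= SANCTION_COST * tokens
        payoffs[target] -= SANCTION_IMPACT * tokens
    return payoffs


if __name__ == "__main__":
    # Three full contributors and one free-rider (agent 3).
    contributions = [20.0, 20.0, 20.0, 0.0]
    print(public_goods_payoffs(contributions))
    # Agent 0 spends 2 tokens to sanction the free-rider.
    print(public_goods_payoffs(contributions, {(0, 3): 2.0}))
```

Under these assumed values the free-rider earns more than the contributors when nobody sanctions, and the agent who sanctions ends up worse off than the contributors who do not. That is the second-order dilemma the abstract refers to: sanctioning helps the group but costs the individual who carries it out.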

Related papers