Fugu-MT Paper Translation (Abstract): AIM-Bench: Evaluating Decision-making Biases of Agentic LLM as Inventory Manager

Paper overview: AIM-Bench: Evaluating Decision-making Biases of Agentic LLM as Inventory Manager

arxiv url: http://arxiv.org/abs/2508.11416v1
Date: Fri, 15 Aug 2025 11:38:19 GMT
Status: translation complete
Last updated in system: 2025-08-18 14:51:23.937901
Title: AIM-Bench: Evaluating Decision-making Biases of Agentic LLM as Inventory Manager
Title (reference translation): AIM-Bench: Evaluating the Decision-Making Biases of Agentic LLMs as Inventory Managers
Authors: Xuhua Zhao, Yuxuan Xie, Caihua Chen, Yuxiang Sun
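Abstract summary: AIM-Bench is a novel benchmark designed to assess the decision-making behaviour of large language models (LLMs) in uncertain supply chain management scenarios. The results reveal that different LLMs typically exhibit varying degrees of decision bias, similar to those observed in humans.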
Reference score (independently computed attention score): 9.21215885702746
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in the mathematical reasoning and long-term planning capabilities of large language models (LLMs) have precipitated the development of agents, which are being increasingly leveraged in business operations processes. Decision models to optimize inventory levels are one of the core elements of operations management. However, the capabilities of LLM agents in making inventory decisions in uncertain contexts, as well as the decision-making biases of these agents (e.g., the framing effect), remain largely unexplored. This prompts concerns regarding the capacity of LLM agents to effectively address real-world problems, as well as the potential implications of biases that may be present. To address this gap, we introduce AIM-Bench, a novel benchmark designed to assess the decision-making behaviour of LLM agents in uncertain supply chain management scenarios through a diverse series of inventory replenishment experiments. Our results reveal that different LLMs typically exhibit varying degrees of decision bias that are similar to those observed in human beings. In addition, we explored strategies to mitigate the pull-to-centre effect and the bullwhip effect, namely cognitive reflection and the implementation of information sharing. These findings underscore the need for careful consideration of the potential biases in deploying LLMs in inventory decision-making scenarios. We hope that these insights will pave the way for mitigating human decision bias and developing human-centred decision support systems for supply chains.
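Abstract (reference translation): Recent advances in the mathematical reasoning and long-term planning capabilities of large language models (LLMs) have accelerated the development of agents, which are increasingly used in business operations processes. Decision models that optimize inventory levels are one of the core elements of operations management. However, the ability of LLM agents to make inventory decisions under uncertainty, and the decision-making biases of such agents (for example, the framing effect), remain largely unexplored. This raises concerns about the capacity of LLM agents to address real-world problems effectively and about the potential impact of any biases that may be present. To address this gap, we introduce AIM-Bench, a new benchmark for evaluating the decision-making behaviour of LLM agents in uncertain supply chain management scenarios through a diverse series of inventory replenishment experiments. The results reveal that different LLMs typically exhibit varying degrees of decision bias, similar to those observed in humans. We also examined strategies for mitigating the pull-to-centre effect and the bullwhip effect, namely cognitive reflection and the implementation of information sharing. These findings highlight the need to consider potential biases carefully when deploying LLMs in inventory decision-making scenarios. We hope that these insights will pave the way for mitigating human decision bias and developing human-centred decision support systems for supply chains.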

