Large Language Newsvendor: Decision Biases and Cognitive Mechanisms
- URL: http://arxiv.org/abs/2512.12552v1
- Date: Sun, 14 Dec 2025 04:51:53 GMT
- Title: Large Language Newsvendor: Decision Biases and Cognitive Mechanisms
- Authors: Jifei Liu, Zhi Chen, Yuanguang Zhong,
- Abstract summary: Large language models (LLMs) are increasingly integrated into business decision making.<n>LLMs replicate and amplify human cognitive biases.<n>This is particularly critical in high-stakes operational contexts like supply chain management.
- Score: 2.7070404673380817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Problem definition: Although large language models (LLMs) are increasingly integrated into business decision making, their potential to replicate and even amplify human cognitive biases cautions a significant, yet not well-understood, risk. This is particularly critical in high-stakes operational contexts like supply chain management. To address this, we investigate the decision-making patterns of leading LLMs using the canonical newsvendor problem in a dynamic setting, aiming to identify the nature and origins of their cognitive biases. Methodology/results: Through dynamic, multi-round experiments with GPT-4, GPT-4o, and LLaMA-8B, we tested for five established decision biases. We found that LLMs consistently replicated the classic ``Too Low/Too High'' ordering bias and significantly amplified other tendencies like demand-chasing behavior compared to human benchmarks. Our analysis uncovered a ``paradox of intelligence'': the more sophisticated GPT-4 demonstrated the greatest irrationality through overthinking, while the efficiency-optimized GPT-4o performed near-optimally. Because these biases persist even when optimal formulas are provided, we conclude they stem from architectural constraints rather than knowledge gaps. Managerial implications: First, managers should select models based on the specific task, as our results show that efficiency-optimized models can outperform more complex ones on certain optimization problems. Second, the significant amplification of bias by LLMs highlights the urgent need for robust human-in-the-loop oversight in high-stakes decisions to prevent costly errors. Third, our findings suggest that designing structured, rule-based prompts is a practical and effective strategy for managers to constrain models' heuristic tendencies and improve the reliability of AI-assisted decisions.
Related papers
- Ask, Clarify, Optimize: Human-LLM Agent Collaboration for Smarter Inventory Control [11.796330722859574]
We show that employing LLMs as end-to-end solvers incurs a significant "hallucination tax"<n>We propose a hybrid agentic framework that strictly decouples semantic reasoning from mathematical calculation.<n>Our results position LLMs as natural-language interfaces that make rigorous, solver-based policies accessible to non-experts.
arXiv Detail & Related papers (2025-12-31T21:45:54Z) - ORPR: An OR-Guided Pretrain-then-Reinforce Learning Model for Inventory Management [9.138155308817215]
"Pretrain-then-Reinforce" approach reconciles AI's adaptive perception with Operations Research's structural rigor.<n>We show that a lightweight, domain-informed model can deliver state-of-the-art performance and robust transferability when guided by structured OR logic.
arXiv Detail & Related papers (2025-12-22T03:39:43Z) - Demystifying Reinforcement Learning in Agentic Reasoning [90.3737088727791]
We conduct a comprehensive and systematic investigation to demystify reinforcement learning in agentic reasoning.<n>We highlight our key insights: (i) replacing stitched synthetic trajectories with real end-to-end tool-use trajectories yields a far stronger SFT.<n> Exploration-friendly techniques are crucial for agentic RL, such as clip higher, overlong reward shaping, and maintaining adequate policy entropy could improve the training efficiency.
arXiv Detail & Related papers (2025-10-13T17:57:15Z) - TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making [75.29820290660065]
This paper proposes Thought-Centric Preference Optimization ( TCPO) for effective embodied decision-making.<n>It emphasizes the alignment of the model's intermediate reasoning process, mitigating the problem of model degradation.<n>Experiments in the ALFWorld environment demonstrate an average success rate of 26.67%, achieving a 6% improvement over RL4VLM.
arXiv Detail & Related papers (2025-09-10T11:16:21Z) - Curse of Knowledge: When Complex Evaluation Context Benefits yet Biases LLM Judges [72.3356133063925]
The paradigm of large language models (LLMs) as judges has emerged as a scalable solution, yet prior work primarily focuses on simple settings.<n>Our in-depth analysis offers crucial insights for improving the accuracy and verifiability of evaluation signals.
arXiv Detail & Related papers (2025-09-03T15:48:33Z) - Automated Optimization Modeling through Expert-Guided Large Language Model Reasoning [43.63419208391747]
We present a novel framework that leverages expert-level optimization modeling principles through chain-of-thought reasoning to automate the optimization process.<n>We also introduce LogiOR, a new optimization modeling benchmark from the logistics domain, containing more complex problems with standardized annotations.
arXiv Detail & Related papers (2025-08-20T04:14:54Z) - Hierarchical Budget Policy Optimization for Adaptive Reasoning [49.621779447691665]
We present Hierarchical Budget Policy Optimization (HBPO), a reinforcement learning framework that enables models to learn problem-specific reasoning depths without sacrificing capability.<n>HBPO partitions the exploration space into budget-constrained hierarchies (512-2560 tokens), each with differentiated reward structures that preserve both efficiency incentives and reasoning capabilities.<n>Extensive experiments demonstrate that HBPO reduces average token usage by up to 60.6% while improving accuracy by 3.14% across four reasoning benchmarks.
arXiv Detail & Related papers (2025-07-21T17:52:34Z) - Reasoning Meets Personalization: Unleashing the Potential of Large Reasoning Model for Personalized Generation [21.89080753903469]
We present the first systematic evaluation of large reasoning models (LRMs) for personalization tasks.<n>Our analysis identifies three key limitations: divergent thinking, misalignment of response formats, and ineffective use of retrieved information.<n>We propose Reinforced Reasoning for Personalization (model), a novel framework that incorporates a hierarchical reasoning thought template to guide LRMs in generating structured outputs.
arXiv Detail & Related papers (2025-05-23T07:30:13Z) - Self-Adaptive Cognitive Debiasing for Large Language Models in Decision-Making [71.71796367760112]
Large language models (LLMs) have shown potential in supporting decision-making applications.<n>We propose a cognitive debiasing approach, self-adaptive cognitive debiasing (SACD)<n>We evaluate SACD on finance, healthcare, and legal decision-making tasks using both open-weight and closed-weight LLMs.
arXiv Detail & Related papers (2025-04-05T11:23:05Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.<n>We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.<n>Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions [0.46873264197900916]
We show that certain cognitive biases can enhance decision-making efficiency through rational deviations and shortcuts.<n>By introducing moderation and an abstention option, we reduce error rates, improve decision accuracy, and optimize decision rates.<n>This approach offers a novel way to leverage cognitive biases to improve the practical utility of large language models.
arXiv Detail & Related papers (2024-06-16T16:25:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.