More Than Irrational: Modeling Belief-Biased Agents
- URL: http://arxiv.org/abs/2511.12359v1
- Date: Sat, 15 Nov 2025 21:14:37 GMT
- Title: More Than Irrational: Modeling Belief-Biased Agents
- Authors: Yifan Zhu, Sammie Katt, Samuel Kaski
- Abstract summary: We introduce a class of computational-rational (CR) user models for cognitively-bounded agents acting optimally under biased beliefs. We address the challenge of identifying the latent user-specific bound and inferring biased belief states from passive observations. We show that our CR model generates intuitively plausible behaviors corresponding to different levels of memory capacity.
- Score: 25.274115351731325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the explosive growth of AI and the technologies built upon it, predicting and inferring the sub-optimal behavior of users or human collaborators remains a critical challenge. In many cases, such behaviors are not a result of irrationality, but rather a rational decision made given inherent cognitive bounds and biased beliefs about the world. In this paper, we formally introduce a class of computational-rational (CR) user models for cognitively-bounded agents acting optimally under biased beliefs. The key novelty lies in explicitly modeling how a bounded memory process leads to a dynamically inconsistent and biased belief state and, consequently, sub-optimal sequential decision-making. We address the challenge of identifying the latent user-specific bound and inferring biased belief states from passive observations on the fly. We argue that for our formalized CR model family with an explicit and parameterized cognitive process, this challenge is tractable. To support our claim, we propose an efficient online inference method based on nested particle filtering that simultaneously tracks the user's latent belief state and estimates the unknown cognitive bound from a stream of observed actions. We validate our approach in a representative navigation task using memory decay as an example of a cognitive bound. With simulations, we show that (1) our CR model generates intuitively plausible behaviors corresponding to different levels of memory capacity, and (2) our inference method accurately and efficiently recovers the ground-truth cognitive bounds from limited observations ($\le 100$ steps). We further demonstrate how this approach provides a principled foundation for developing adaptive AI assistants, enabling adaptive assistance that accounts for the user's memory limitations.
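The nested particle filtering idea from the abstract can be sketched as a toy example: an outer particle set over the unknown cognitive bound (here, a memory decay rate), each outer particle carrying an inner particle set over the user's latent belief state. Everything below is illustrative and hypothetical; the observation model, particle counts, and all function names are assumptions, not the paper's implementation.

```python
# Toy sketch of nested particle filtering: outer particles track a latent
# cognitive bound (memory decay rate), inner particles track the user's
# belief state. Illustrative only; not the paper's actual model.
import math
import random

random.seed(0)

N_OUTER = 20   # hypotheses over the cognitive bound (decay rate)
N_INNER = 30   # belief-state particles per outer particle

def action_likelihood(action, belief, decay):
    """Toy likelihood of an observed action given one belief particle,
    assuming a noisily-rational user whose belief decays over time."""
    predicted = belief * math.exp(-decay)          # decayed belief
    return math.exp(-(action - predicted) ** 2)    # Gaussian-style score

def nested_pf_step(outer, action):
    """One update: weight each (decay, inner-particles) pair by how well
    it explains the observed action, then resample the outer particles."""
    weights = []
    for decay, inner in outer:
        # inner marginal likelihood = average over belief particles
        lik = sum(action_likelihood(action, b, decay) for b in inner) / len(inner)
        # propagate belief particles: decay toward zero plus small diffusion
        inner[:] = [b * math.exp(-decay) + random.gauss(0, 0.05) for b in inner]
        weights.append(lik)
    total = sum(weights) or 1.0
    probs = [w / total for w in weights]
    # resample decay-rate hypotheses proportionally to their weights
    idx = random.choices(range(len(outer)), weights=probs, k=len(outer))
    return [(outer[i][0], list(outer[i][1])) for i in idx]

# initialise: decay rates from a broad uniform prior, beliefs near 1.0
outer = [(random.uniform(0.0, 1.0),
          [random.gauss(1.0, 0.1) for _ in range(N_INNER)])
         for _ in range(N_OUTER)]

for observed_action in [0.9, 0.8, 0.75, 0.7]:   # a short stream of actions
    outer = nested_pf_step(outer, observed_action)

est_decay = sum(d for d, _ in outer) / len(outer)
print(f"estimated decay rate: {est_decay:.3f}")
```

The key design point the abstract describes is that both quantities are estimated jointly and online: each observed action simultaneously sharpens the posterior over the cognitive bound (outer resampling) and updates the belief-state particles conditioned on each bound hypothesis (inner propagation).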
Related papers
- From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models [77.04403907729738]
This survey charts the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how uncertainty is leveraged as an active control signal across three frontiers. This survey argues that mastering the new trend of uncertainty is essential for building the next generation of scalable, reliable, and trustworthy AI.
arXiv Detail & Related papers (2026-01-22T06:21:31Z) - Forgetting as a Feature: Cognitive Alignment of Large Language Models [39.146761527401424]
We show that Large Language Models (LLMs) exhibit systematic forgetting of past information. Drawing inspiration from human memory dynamics, we model LLM inference as a probabilistic memory process governed by exponential decay. Building on these observations, we propose probabilistic memory prompting, a lightweight strategy that shapes evidence integration to mimic human-like memory decay.
arXiv Detail & Related papers (2025-12-28T10:43:00Z) - What Does Your Benchmark Really Measure? A Framework for Robust Inference of AI Capabilities [0.773472615056109]
Evaluations of generative models on benchmark data are now ubiquitous. Yet growing skepticism surrounds their reliability. How can we know that a reported accuracy genuinely reflects a model's true performance? We make this step explicit by proposing a principled framework for evaluation as inference.
arXiv Detail & Related papers (2025-09-23T21:29:04Z) - STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning [54.28691219536054]
We introduce STARec, a slow-thinking augmented agent framework that endows recommender systems with autonomous deliberative reasoning capabilities. We develop anchored reinforcement training, a two-stage paradigm combining structured knowledge distillation from advanced reasoning models with preference-aligned reward shaping. Experiments on MovieLens 1M and Amazon CDs benchmarks demonstrate that STARec achieves substantial performance gains compared with state-of-the-art baselines.
arXiv Detail & Related papers (2025-08-26T08:47:58Z) - Query-Level Uncertainty in Large Language Models [39.59641844929696]
We propose a method to detect knowledge boundaries via Query-Level Uncertainty. This method estimates whether a model is capable of answering a given query before generating any tokens, thus avoiding the generation cost. We demonstrate its benefits in adaptive inference settings, showing that for RAG and model cascading it reduces inference costs while preserving overall performance.
arXiv Detail & Related papers (2025-06-11T12:39:48Z) - Dynamic Programming Techniques for Enhancing Cognitive Representation in Knowledge Tracing [125.75923987618977]
We propose the Cognitive Representation Dynamic Programming based Knowledge Tracing (CRDP-KT) model. It uses a dynamic programming algorithm to optimize cognitive representations based on the difficulty of the questions and the performance intervals between them. It provides more accurate and systematic input features for subsequent model training, thereby minimizing distortion in the simulation of cognitive states.
arXiv Detail & Related papers (2025-06-03T14:44:48Z) - Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training [86.70255651945602]
We introduce a novel inference-time steering methodology called Reinforcing Cognitive Experts (RICE). RICE aims to improve reasoning performance without additional training or complex heuristics. Empirical evaluations with leading MoE-based LRMs demonstrate noticeable and consistent improvements in reasoning accuracy, cognitive efficiency, and cross-domain generalization.
arXiv Detail & Related papers (2025-05-20T17:59:16Z) - Inverse decision-making using neural amortized Bayesian actors [19.128377007314317]
We amortize the Bayesian actor using a neural network trained on a wide range of parameter settings in an unsupervised fashion. We show how our method allows for principled model comparison and how it can be used to disentangle factors that may lead to unidentifiabilities between priors and costs.
arXiv Detail & Related papers (2024-09-04T10:31:35Z) - Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders [9.401989343015364]
We study robust policy evaluation and policy optimization in the presence of sequentially-exogenous unobserved confounders. We provide sample complexity bounds, insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data of treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z) - Modeling human road crossing decisions as reward maximization with visual perception limitations [23.561752465516047]
We develop a model of human pedestrian crossing decisions based on computational rationality.
We show that the proposed cognitive-RL model captures human-like patterns of gap acceptance and crossing initiation time.
Our results suggest that this is instead a rational adaption to human perceptual limitations.
arXiv Detail & Related papers (2023-01-27T14:20:35Z) - Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z) - Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.