Related papers: Bias or Optimality? Disentangling Bayesian Inference and Learning Biases in Human Decision-Making

Bias or Optimality? Disentangling Bayesian Inference and Learning Biases in Human Decision-Making

URL: http://arxiv.org/abs/2505.08049v1
Date: Mon, 12 May 2025 20:36:43 GMT
Title: Bias or Optimality? Disentangling Bayesian Inference and Learning Biases in Human Decision-Making
Authors: Prakhar Godara,
Abstract summary: We find that even if an agent updates its belief via objective Bayesian inference, fitting the standard Q-learning model with asymmetric learning rates still recovers both biases.<n>We explain this by analyzing the dynamics of these learning systems using master equations.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent studies claim that human behavior in a two-armed Bernoulli bandit (TABB) task is described by positivity and confirmation biases, implying that humans do not integrate new information objectively. However, we find that even if the agent updates its belief via objective Bayesian inference, fitting the standard Q-learning model with asymmetric learning rates still recovers both biases. Bayesian inference cast as an effective Q-learning algorithm has symmetric, though decreasing, learning rates. We explain this by analyzing the stochastic dynamics of these learning systems using master equations. We find that both confirmation bias and unbiased but decreasing learning rates yield the same behavioral signatures. Finally, we propose experimental protocols to disentangle true cognitive biases from artifacts of decreasing learning rates.

Related papers

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs [51.00909549291524]
Large language models (LLMs) exhibit cognitive biases.<n>These biases vary across models and can be amplified by instruction tuning.<n>It remains unclear if these differences in biases stem from pretraining, finetuning, or even random noise.
arXiv Detail & Related papers (2025-07-09T18:01:14Z)
CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias [0.0]
We propose a new algorithm in Deep Reinforcement Learning, CM-DQN, to simulate the human decision-making process. We test in Lunar Lander environment with confirmatory, disconfirmatory bias and non-biased to observe the learning effects.
arXiv Detail & Related papers (2024-07-10T08:16:13Z)
DebCSE: Rethinking Unsupervised Contrastive Sentence Embedding Learning in the Debiasing Perspective [1.351603931922027]
We argue that effectively eliminating the influence of various biases is crucial for learning high-quality sentence embeddings. We propose a novel contrastive framework for sentence embedding, termed DebCSE, which can eliminate the impact of these biases.
arXiv Detail & Related papers (2023-09-14T02:43:34Z)
Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR) CPR achieves unbiased recommendation without knowing the exposure mechanism. We prove in theory that this way offsets the influence of user/item propensity on the learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data. We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
On the Estimation Bias in Double Q-Learning [20.856485777692594]
Double Q-learning is not fully unbiased and suffers from underestimation bias. We show that such underestimation bias may lead to multiple non-optimal fixed points under an approximated Bellman operator. We propose a simple but effective approach as a partial fix for the underestimation bias in double Q-learning.
arXiv Detail & Related papers (2021-09-29T13:41:24Z)
Fairness-aware Class Imbalanced Learning [57.45784950421179]
We evaluate long-tail learning methods for tweet sentiment and occupation classification. We extend a margin-loss based approach with methods to enforce fairness.
arXiv Detail & Related papers (2021-09-21T22:16:30Z)
Bayesian analysis of the prevalence bias: learning and predicting from imbalanced data [10.659348599372944]
This paper lays the theoretical and computational framework for training models, and for prediction, in the presence of prevalence bias. It offers an alternative to principled training losses and complements test-time procedures based on selecting an operating point from summary curves. It integrates seamlessly in the current paradigm of (deep) learning using backpropagation and naturally with Bayesian models.
arXiv Detail & Related papers (2021-07-31T14:36:33Z)
Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We evaluate our framework on a large-scale public-available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions. We make robust and efficient counterfactual predictions for both individual and average treatment effects. The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
A framework for predicting, interpreting, and improving Learning Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student. ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z)
Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge. We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously. Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.