Related papers: Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions

URL: http://arxiv.org/abs/2406.10999v3
Date: Mon, 9 Sep 2024 16:28:09 GMT
Title: Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions
Authors: Liman Wang, Hanyang Zhong, Wenting Cao, Zeyuan Sun
Abstract summary: This paper examines the role of cognitive biases in the decision-making processes of large language models (LLMs). We show that certain cognitive biases, when properly balanced, can enhance decision-making efficiency through rational deviations and shortcuts.
Score: 0.46873264197900916
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper examines the role of cognitive biases in the decision-making processes of large language models (LLMs), challenging the conventional goal of eliminating all biases. We show that certain cognitive biases, when properly balanced, can enhance decision-making efficiency through rational deviations and heuristic shortcuts. By introducing heuristic moderation and an abstention option, which allows LLMs to withhold responses when uncertain, we reduce error rates, improve decision accuracy, and optimize decision rates. Using the Balance Rigor and Utility (BRU) dataset, developed through expert collaboration, our findings demonstrate that targeted inspection of cognitive biases aligns LLM decisions more closely with human reasoning, enhancing reliability and suggesting strategies for future improvements. This approach offers a novel way to leverage cognitive biases to improve the practical utility of LLMs across various applications.
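
To make the abstention idea in the abstract concrete, here is a minimal sketch of how error rate, decision accuracy, and decision rate might be scored once a model is allowed to withhold an answer. The metric definitions are assumptions chosen for illustration and are not taken from the paper; the function name `abstention_metrics` and the use of `None` to mark an abstention are likewise hypothetical.

```python
from typing import Dict, Optional, Sequence


def abstention_metrics(
    predictions: Sequence[Optional[str]],
    answers: Sequence[str],
) -> Dict[str, float]:
    """Score multiple-choice predictions where None means the model abstained.

    Assumed definitions (illustrative, not the paper's):
      decision_rate     = answered questions / all questions
      decision_accuracy = correct answers / answered questions
      error_rate        = wrong answers / all questions
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    total = len(answers)
    if total == 0:
        raise ValueError("at least one question is required")

    # Keep only the questions the model chose to answer.
    answered = [(p, a) for p, a in zip(predictions, answers) if p is not None]
    correct = sum(1 for p, a in answered if p == a)
    wrong = len(answered) - correct

    return {
        "decision_rate": len(answered) / total,
        # If the model abstained on everything, accuracy is reported as 0.0.
        "decision_accuracy": correct / len(answered) if answered else 0.0,
        "error_rate": wrong / total,
    }


if __name__ == "__main__":
    # Four questions: two answered correctly, one abstention, one wrong answer.
    metrics = abstention_metrics(["A", None, "C", "B"], ["A", "B", "C", "D"])
    print(metrics)
```

Under these assumed definitions, an abstention lowers the decision rate but does not count as an error, which is one way to express the trade-off between rigor and utility that the abstract describes.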

Related papers

Cognitive Debiasing Large Language Models for Decision-Making [71.2409973056137]
Large language models (LLMs) have shown potential in supporting decision-making applications. We propose a cognitive debiasing approach, called self-debiasing, that enhances the reliability of LLMs. Our method follows three sequential steps -- bias determination, bias analysis, and cognitive debiasing -- to iteratively mitigate potential cognitive biases in prompts.
arXiv Detail & Related papers (2025-04-05T11:23:05Z)
Investigating the Impact of LLM Personality on Cognitive Bias Manifestation in Automated Decision-Making Tasks [4.65004369765875]
Personality traits play a crucial role in either amplifying or reducing biases. Conscientiousness and Agreeableness may generally enhance the efficacy of bias mitigation strategies.
arXiv Detail & Related papers (2025-02-20T03:15:54Z)
Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks [6.520709313101523]
This work investigates cognitive bias identification in high-stakes decision-making processes by human experts. We propose a bias-aware AI-augmented workflow that surpasses human judgment. In our experiments, both the proposed model and the agentic workflow significantly improve on both human judgment and alternative models.
arXiv Detail & Related papers (2024-11-13T10:42:11Z)
Cognitive Biases in Large Language Models for News Recommendation [68.90354828533535]
This paper explores the potential impact of cognitive biases on large language model (LLM)-based news recommender systems. We discuss strategies to mitigate these biases from the perspectives of data augmentation, prompt engineering, and learning algorithms.
arXiv Detail & Related papers (2024-10-03T18:42:07Z)
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations. Our experiments on novel Design for Manufacturing tasks show both improved task performance and improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
KnowPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models [14.057527352653787]
We propose a Knowledge-aware Preference Optimization strategy, dubbed KnowPO, aimed at achieving adaptive knowledge selection. We show that KnowPO outperforms previous methods for handling knowledge conflicts by over 37%.
arXiv Detail & Related papers (2024-08-06T16:55:54Z)
Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning [0.0]
Large Language Models (LLMs) have demonstrated their capabilities across various tasks. This paper exploits the reasoning and generative capabilities of LLMs to predict human behavior in two sequential decision-making tasks. We compare the performance of LLMs with a cognitive instance-based learning model, which imitates human experiential decision-making.
arXiv Detail & Related papers (2024-07-12T14:13:06Z)
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making. We present a process-based benchmark, MR-Ben, that demands a meta-reasoning skill. Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
Prompting Fairness: Integrating Causality to Debias Large Language Models [19.76215433424235]
Large language models (LLMs) are susceptible to generating biased and discriminatory responses. We propose a causality-guided debiasing framework to tackle social biases.
arXiv Detail & Related papers (2024-03-13T17:46:28Z)
DeLLMa: Decision Making Under Uncertainty with Large Language Models [31.77731889916652]
DeLLMa is a framework designed to enhance decision-making accuracy in uncertain environments. We show that DeLLMa can consistently enhance the decision-making performance of leading language models, and achieve up to a 40% increase in accuracy over competing methods.
arXiv Detail & Related papers (2024-02-04T08:11:45Z)
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions. Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
Rational Decision-Making Agent with Internalized Utility Judgment [91.80700126895927]
Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications. This paper proposes RadAgent, which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning. Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks.
arXiv Detail & Related papers (2023-08-24T03:11:45Z)
Explainability's Gain is Optimality's Loss? -- How Explanations Bias Decision-making [0.0]
Explanations help to facilitate communication between the algorithm and the human decision-maker. Feature-based explanations' semantics of causal models induce leakage from the decision-maker's prior beliefs. Such differences can lead to sub-optimal and biased decision outcomes.
arXiv Detail & Related papers (2022-06-17T11:43:42Z)
Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap. We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert. Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.