Related papers: Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI

Related papers

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing [90.65399476233495]
We introduce RISEBench, the first benchmark for evaluating Reasoning-Informed viSual Editing (RISE) RISEBench focuses on four key reasoning types: Temporal, Causal, Spatial, and Logical Reasoning. We propose an evaluation framework that assesses Instruction Reasoning, Appearance Consistency, and Visual Plausibility with both human judges and an LMM-as-a-judge approach.
arXiv Detail & Related papers (2025-04-03T17:59:56Z)
Fine-Grained Bias Detection in LLM: Enhancing detection mechanisms for nuanced biases [0.0]
This study presents a detection framework to identify nuanced biases in Large Language Models (LLMs) The approach integrates contextual analysis, interpretability via attention mechanisms, and counterfactual data augmentation to capture hidden biases. Results show improvements in detecting subtle biases compared to conventional methods.
arXiv Detail & Related papers (2025-03-08T04:43:01Z)
Large Language Model Strategic Reasoning Evaluation through Behavioral Game Theory [5.361970694197912]
We introduce an evaluation framework grounded in behavioral game theory, disentangling reasoning capability from contextual effects. Testing 22 state-of-the-art LLMs, we find that GPT-o3-mini, GPT-o1, and DeepSeek-R1 dominate most games yet also demonstrate that the model scale alone does not determine performance. In terms of prompting enhancement, Chain-of-Thought (CoT) prompting is not universally effective, as it increases strategic reasoning only for models at certain levels while providing limited gains elsewhere.
arXiv Detail & Related papers (2025-02-27T18:58:31Z)
Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions [15.350973327319418]
Large language models (LLMs) are increasingly integrated into a wide range of everyday applications. This raises concerns about the replicability and generalizability of insights gained from research on LLM behavior. We tested GPT-3.5, GPT-4o, Gemini 1.5 Pro, Claude 3 Opus, Llama 3-8B, and Llama 3-70B, on the chain-of-thought, EmotionPrompting, ExpertPrompting, Sandbagging, as well as Re-Reading prompt engineering techniques.
arXiv Detail & Related papers (2024-09-30T14:00:34Z)
Generative AI for Requirements Engineering: A Systematic Literature Review [4.444308664613162]
The emergence of generative AI (GenAI) offers new opportunities and challenges in requirements engineering (RE) This systematic literature review aims to analyze and synthesize current research on GenAI applications in RE.
arXiv Detail & Related papers (2024-09-10T02:44:39Z)
An Empirical Analysis on Large Language Models in Debate Evaluation [10.677407097411768]
We investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation. We uncover a consistent bias in both GPT-3.5 and GPT-4 towards the second candidate response presented. We also uncover lexical biases in both GPT-3.5 and GPT-4, especially when label sets carry connotations such as numerical or sequential.
arXiv Detail & Related papers (2024-05-28T18:34:53Z)
Incoherent Probability Judgments in Large Language Models [5.088721610298991]
We assess the coherence of probability judgments made by autoregressive Large Language Models (LLMs) Our results show that the judgments produced by these models are often incoherent, displaying human-like systematic deviations from the rules of probability theory.
arXiv Detail & Related papers (2024-01-30T00:40:49Z)
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions. Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
Towards Understanding Sycophancy in Language Models [49.99654432561934]
We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback. We show that five state-of-the-art AI assistants consistently exhibit sycophancy across four varied free-form text-generation tasks. Our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
arXiv Detail & Related papers (2023-10-20T14:46:48Z)
Use of probabilistic phrases in a coordination game: human versus GPT-4 [0.0]
English speakers use probabilistic phrases such as likely to communicate information about the probability or likelihood of events. We first assessed human ability to estimate the probability and the ambiguity of 23 probabilistic phrases in a coordination game. We found that the median human participant and GPT4 assigned probability estimates that were in good agreement.
arXiv Detail & Related papers (2023-10-16T16:14:27Z)
The Impact of Explanations on Fairness in Human-AI Decision-Making: Protected vs Proxy Features [25.752072910748716]
Explanations may help human-AI teams address biases for fairer decision-making. We study the effect of the presence of protected and proxy features on participants' perception of model fairness. We find that explanations help people detect direct but not indirect biases.
arXiv Detail & Related papers (2023-10-12T16:00:16Z)
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias [57.42417061979399]
Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) improve the abilities of large language models (LMs) dramatically. In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs. Our findings highlight the presence of these biases in various models from the GPT-3, Mistral, and T5 families.
arXiv Detail & Related papers (2023-08-01T01:39:25Z)
Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour. Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)
How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks [65.7949334650854]
GPT-3.5 models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks. However, their robustness and abilities to handle various complexities of the open world have yet to be explored. We show that GPT-3.5 faces some specific robustness challenges, including instability, prompt sensitivity, and number sensitivity.
arXiv Detail & Related papers (2023-03-01T07:39:01Z)
Uncertain Evidence in Probabilistic Models and Stochastic Simulators [80.40110074847527]
We consider the problem of performing Bayesian inference in probabilistic models where observations are accompanied by uncertainty, referred to as uncertain evidence' We explore how to interpret uncertain evidence, and by extension the importance of proper interpretation as it pertains to inference about latent variables. We devise concrete guidelines on how to account for uncertain evidence and we provide new insights, particularly regarding consistency.
arXiv Detail & Related papers (2022-10-21T20:32:59Z)
Reconciling Individual Probability Forecasts [78.0074061846588]
We show that two parties who agree on the data cannot disagree on how to model individual probabilities. We conclude that although individual probabilities are unknowable, they are contestable via a computationally and data efficient process.
arXiv Detail & Related papers (2022-09-04T20:20:35Z)
Naturalistic Causal Probing for Morpho-Syntax [76.83735391276547]
We suggest a naturalistic strategy for input-level intervention on real world data in Spanish. Using our approach, we isolate morpho-syntactic features from counfounders in sentences. We apply this methodology to analyze causal effects of gender and number on contextualized representations extracted from pre-trained models.
arXiv Detail & Related papers (2022-05-14T11:47:58Z)
Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
Key challenge for robotic systems is to figure out the behavior of another agent. Processing correct inferences is especially challenging when (confounding) factors are not controlled experimentally. We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z)
Correct block-design experiments mitigate temporal correlation bias in EEG classification [68.85562949901077]
We show that the main claim in [1] is drastically overstated and their other analyses are seriously flawed by wrong methodological choices. We investigate the influence of EEG temporal correlation on classification accuracy by testing the same models in two additional experimental settings.
arXiv Detail & Related papers (2020-11-25T22:25:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.