When Debate Fails: Bias Reinforcement in Large Language Models
- URL: http://arxiv.org/abs/2503.16814v1
- Date: Fri, 21 Mar 2025 02:51:30 GMT
- Title: When Debate Fails: Bias Reinforcement in Large Language Models
- Authors: Jihwan Oh, Minchan Jeong, Jongwoo Ko, Se-Young Yun
- Abstract summary: Large Language Models (LLMs) solve complex problems using training-free methods like prompt engineering and in-context learning. Self-correction methods such as self-consistency and self-refinement aim to improve reliability. We identify two key limitations: bias reinforcement and lack of perspective diversity.
- Score: 28.36216398327389
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) solve complex problems using training-free methods like prompt engineering and in-context learning, yet ensuring reasoning correctness remains challenging. While self-correction methods such as self-consistency and self-refinement aim to improve reliability, they often reinforce biases due to the lack of effective feedback mechanisms. Multi-Agent Debate (MAD) has emerged as an alternative, but we identify two key limitations: bias reinforcement, where debate amplifies model biases instead of correcting them, and lack of perspective diversity, as all agents share the same model and reasoning patterns, limiting true debate effectiveness. To systematically evaluate these issues, we introduce $\textit{MetaNIM Arena}$, a benchmark designed to assess LLMs in adversarial strategic decision-making, where dynamic interactions influence optimal decisions. To overcome MAD's limitations, we propose $\textbf{DReaMAD}$ ($\textbf{D}$iverse $\textbf{Rea}$soning via $\textbf{M}$ulti-$\textbf{A}$gent $\textbf{D}$ebate with Refined Prompt), a novel framework that (1) refines an LLM's strategic prior knowledge to improve reasoning quality and (2) promotes diverse viewpoints within a single model by systematically modifying prompts, reducing bias. Empirical results show that $\textbf{DReaMAD}$ significantly improves decision accuracy, reasoning diversity, and bias mitigation across multiple strategic tasks, establishing it as a more effective approach for LLM-based decision-making.
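To make the two ingredients concrete, here is a minimal sketch of the DReaMAD idea: one model is prompted under systematically varied perspectives, and the diversified arguments are aggregated by a judge prompt. All names (`call_llm`, the personas, the NIM question) are illustrative assumptions, not the paper's implementation.

```python
import random

def call_llm(prompt: str) -> str:
    # Toy stub so the sketch runs; replace with a real LLM API call.
    return random.choice(["Take 1 stone.", "Take 2 stones.", "Take 3 stones."])

PERSPECTIVES = [
    "You are a cautious game theorist. Reason from worst-case opponent play.",
    "You are an aggressive strategist. Reason from tempo and initiative.",
    "You are a mathematician. Reason from invariants (e.g., modular arithmetic).",
]

def dreamad_debate(question: str, rounds: int = 2) -> str:
    transcript: list[str] = []
    for _ in range(rounds):
        for persona in PERSPECTIVES:   # same model, systematically varied prompts
            prompt = (f"{persona}\nQuestion: {question}\n"
                      f"Debate so far: {transcript}\nYour argument and answer:")
            transcript.append(call_llm(prompt))
    # A judge prompt aggregates the diversified arguments into one decision.
    return call_llm(f"Arguments: {transcript}\nPick the best-supported answer to: {question}")

if __name__ == "__main__":
    print(dreamad_debate("NIM: 7 stones left, you may take 1-3. What is the optimal move?"))
```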
Related papers
- GM-PRM: A Generative Multimodal Process Reward Model for Multimodal Mathematical Reasoning [12.724393910603299]
We introduce the Generative Multimodal Process Reward Model (GM-PRM). Instead of a simple scalar score, GM-PRM provides a fine-grained, interpretable analysis of each reasoning step. We show that GM-PRM achieves state-of-the-art results on multiple multimodal math benchmarks.
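As a rough illustration of "analysis per step instead of one scalar," a process reward model can be read as a function from (solution, step index) to a structured critique. The schema below is a hypothetical sketch; GM-PRM's real verifier is a generative multimodal model, stubbed here with a trivial rule.

```python
from dataclasses import dataclass

@dataclass
class StepFeedback:
    step_index: int
    is_correct: bool
    critique: str                     # natural-language analysis of the step

def critique_step(steps: list[str], i: int) -> StepFeedback:
    # Placeholder verifier: a real GM-PRM would prompt a generative multimodal
    # model to analyse step i in context; here we flag one known-bad step.
    ok = "x = 5" not in steps[i]
    return StepFeedback(i, ok, "follows from context" if ok
                        else "does not follow: 2x = 6 implies x = 3")

solution = ["Let 2x = 6.", "Divide both sides by 2.", "Therefore x = 5."]
for i in range(len(solution)):
    print(critique_step(solution, i))
```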
arXiv Detail & Related papers (2025-08-06T05:10:29Z)
- Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs [7.501387372794562]
The Deliberate-to-Intuitive reasoning framework (D2I) improves the understanding and reasoning ability of multimodal language models. During training, the method enforces deliberate reasoning strategies to enhance modality alignment, using only a rule-based format reward. At evaluation time, the reasoning style shifts to intuitive: the deliberate strategies used during training are dropped, and the model's acquired abilities are reflected implicitly in its responses.
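A rule-based format reward of the kind described scores only whether the output follows the required reasoning format, not whether the answer is right. The tag names below are assumptions for illustration, not D2I's actual format.

```python
import re

def format_reward(response: str) -> float:
    # Require exactly one deliberate-reasoning block followed by a final answer.
    ok = re.fullmatch(r"\s*<think>.+?</think>\s*<answer>.+?</answer>\s*",
                      response, flags=re.DOTALL) is not None
    return 1.0 if ok else 0.0

print(format_reward("<think>Align image regions with the question.</think><answer>42</answer>"))  # 1.0
print(format_reward("The answer is 42."))                                                          # 0.0
```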
arXiv Detail & Related papers (2025-07-09T16:25:44Z)
- InspireDebate: Multi-Dimensional Subjective-Objective Evaluation-Guided Reasoning and Optimization for Debating [15.096294311783836]
Existing large language models (LLMs) focus on responding to specific arguments while neglecting objective assessments such as authenticity and logical validity. We propose a dual-component framework: $\textbf{InspireScore}$, a novel evaluation system, and $\textbf{InspireDebate}$, an optimized debating framework. $\textbf{InspireScore}$ achieves 44% higher correlation with expert judgments compared to existing methods, while $\textbf{InspireDebate}$ shows significant improvements.
arXiv Detail & Related papers (2025-06-22T17:14:29Z)
- Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs [25.067282214293904]
This paper explores whether post-training techniques, specifically Supervised Fine-Tuning (SFT) and Reinforcement Learning with Verifiable Rewards (RLVR), can effectively $\textit{generalize}$ to multi-agent scenarios. We use economic reasoning as a testbed, leveraging its strong foundations in mathematics and game theory. Comprehensive evaluation on economic reasoning benchmarks and multi-agent games reveals clear improvements in structured reasoning and economic rationality.
arXiv Detail & Related papers (2025-05-31T14:22:40Z)
- One-Stage Top-$k$ Learning-to-Defer: Score-Based Surrogates with Theoretical Guarantees [3.6787328174619254]
We introduce the first one-stage Top-$k$ Learning-to-Defer framework. We learn a shared score-based model that selects the $k$ most cost-effective entities (labels or experts) per input. Experiments on CIFAR-10 and SVHN confirm that our one-stage Top-$k$ method strictly outperforms Top-1 deferral.
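At inference time, the selection step can be sketched as: one shared model scores every entity (class labels and experts alike), costs are subtracted, and the $k$ highest cost-adjusted scores are consulted. The entity names, costs, and the random "score model" below are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
ENTITIES = ["label:cat", "label:dog", "expert:radiologist", "expert:ensemble"]
COST = np.array([0.0, 0.0, 0.5, 0.2])           # consultation cost per entity

def top_k_defer(x: np.ndarray, k: int = 2) -> list[str]:
    # Stand-in for the shared score model g(x, entity); a trained network
    # would score each entity conditioned on the input x.
    scores = rng.normal(size=len(ENTITIES))
    adjusted = scores - COST                    # "most cost-effective" = score minus cost
    top = np.argsort(adjusted)[::-1][:k]
    return [ENTITIES[i] for i in top]

print(top_k_defer(np.zeros(8), k=2))
```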
arXiv Detail & Related papers (2025-05-15T10:41:16Z)
- Adaptive Thinking via Mode Policy Optimization for Social Language Agents [75.3092060637826]
We propose a framework to improve the adaptive thinking ability of language agents in dynamic social interactions. Our framework advances existing research in three key aspects: (1) multi-granular thinking mode design, (2) context-aware mode switching across social interactions, and (3) token-efficient reasoning via depth-adaptive processing.
arXiv Detail & Related papers (2025-05-04T15:39:58Z)
- Why Ask One When You Can Ask $k$? Two-Stage Learning-to-Defer to the Top-$k$ Experts [3.6787328174619254]
Learning-to-Defer (L2D) enables decision-making systems to improve reliability by selectively deferring uncertain predictions to more competent agents.
We propose Top-$k$ Learning-to-Defer, a generalization of the classical two-stage L2D framework that allocates each query to the $k$ most confident agents instead of a single one.
To further enhance flexibility and cost-efficiency, we introduce Top-$k(x)$ Learning-to-Defer, an adaptive extension that learns the optimal number of agents to consult for each query.
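One way to read the adaptive Top-$k(x)$ idea is as a stopping rule: consult agents in decreasing confidence until accumulated confidence clears a threshold, so easy queries use fewer (cheaper) agents. The stopping rule below is an assumption for illustration, not the paper's learned criterion.

```python
def adaptive_top_k(confidences: dict[str, float], threshold: float = 0.9) -> list[str]:
    chosen, total = [], 0.0
    for agent, conf in sorted(confidences.items(), key=lambda kv: -kv[1]):
        chosen.append(agent)
        total += conf * (1 - total)     # crude "residual uncertainty" update
        if total >= threshold:          # stop once the query looks resolved
            break
    return chosen

print(adaptive_top_k({"model": 0.6, "expert_a": 0.8, "expert_b": 0.3}))  # ['expert_a', 'model']
```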
arXiv Detail & Related papers (2025-04-17T14:50:40Z)
- Supervised Optimism Correction: Be Confident When LLMs Are Sure [91.7459076316849]
We establish a novel theoretical connection between supervised fine-tuning and offline reinforcement learning. We show that the widely used beam search method suffers from unacceptable over-optimism. We propose Supervised Optimism Correction, which introduces a simple yet effective auxiliary loss for token-level $Q$-value estimations.
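To show the *shape* of such an objective, here is a sketch: standard next-token cross-entropy plus an auxiliary penalty on the model's implicit token-level values (read off its own log-probs). The penalty form is an assumption for illustration only; the paper defines the actual loss.

```python
import torch
import torch.nn.functional as F

def soc_style_loss(logits: torch.Tensor, targets: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    # Standard supervised fine-tuning term.
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
    # Implicit token-level "Q" read off the model's own log-probs.
    logp = F.log_softmax(logits, dim=-1)
    chosen = logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Assumed auxiliary term: damp unusually over-confident tokens.
    optimism_penalty = torch.relu(chosen - chosen.mean()).mean()
    return ce + lam * optimism_penalty

logits = torch.randn(2, 5, 100, requires_grad=True)   # (batch, seq, vocab)
targets = torch.randint(0, 100, (2, 5))
print(soc_style_loss(logits, targets).item())
```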
arXiv Detail & Related papers (2025-04-10T07:50:03Z)
- Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering [78.89231943329885]
One of the most widely used tasks to evaluate Large Language Models (LLMs) is Multiple-Choice Question Answering (MCQA).
In this work, we shed light on the inconsistencies of MCQA evaluation strategies, which can lead to inaccurate and misleading model comparisons.
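A self-contained illustration of the failure mode: the same model output is scored differently depending on the answer-extraction strategy. The two strategies below are common conventions, not necessarily the paper's exact set.

```python
import re

output = "A straightforward question. The correct option is (B) Paris."
gold = "B"

def first_letter(text: str) -> str | None:
    # Naive extraction: first standalone capital A-D anywhere in the output.
    m = re.search(r"\b([A-D])\b", text)
    return m.group(1) if m else None

def parenthesized(text: str) -> str | None:
    # Stricter extraction: a choice letter wrapped in parentheses.
    m = re.search(r"\(([A-D])\)", text)
    return m.group(1) if m else None

for extract in (first_letter, parenthesized):
    pred = extract(output)
    print(f"{extract.__name__:13s} -> {pred!r} marked {'correct' if pred == gold else 'WRONG'}")
```

Here `first_letter` grabs the stray "A" and marks a right answer wrong, while `parenthesized` credits it, so the two strategies rank the same model differently.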
arXiv Detail & Related papers (2025-03-19T08:45:03Z)
- Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models [25.291029168327874]
Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. We propose $\textit{FaIRMaker}$, an automated and model-independent framework that employs an auto-search and refinement paradigm to adaptively generate Fairwords. Experiments demonstrate that $\textit{FaIRMaker}$ automatically searches for and dynamically refines Fairwords, effectively mitigating gender bias while preserving task integrity.
arXiv Detail & Related papers (2025-02-17T08:44:04Z)
- Autoformulation of Mathematical Optimization Models Using LLMs [50.030647274271516]
We develop an automated approach to creating optimization models from natural language descriptions for commercial solvers.
We identify three core challenges of autoformulation: (1) defining the vast, problem-dependent hypothesis space, (2) efficiently searching this space under uncertainty, and (3) evaluating formulation correctness.
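The end-to-end target can be sketched as: a natural-language description becomes a structured model a solver accepts, and correctness (challenge 3) is checked by actually solving it. The hand-written formulation and schema below stand in for the LLM step; only the pipeline shape is the point.

```python
from scipy.optimize import linprog

description = "Maximize profit 3x + 2y subject to x + y <= 4, x <= 2, x,y >= 0."

# What an autoformulator would be asked to emit (hypothetical schema):
model = {
    "objective": [-3.0, -2.0],          # linprog minimizes, so negate to maximize
    "A_ub": [[1.0, 1.0], [1.0, 0.0]],   # x + y <= 4, x <= 2
    "b_ub": [4.0, 2.0],
    "bounds": [(0, None), (0, None)],
}

res = linprog(model["objective"], A_ub=model["A_ub"], b_ub=model["b_ub"], bounds=model["bounds"])
assert res.success                      # formulation-correctness check
print("x, y =", res.x, "profit =", -res.fun)   # expect x=2, y=2, profit=10
```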
arXiv Detail & Related papers (2024-11-03T20:41:38Z)
- FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions. We use large language models to plan a solution, then soft-formalise the query into facts and predicates using logic programming code. Our method allows us to compute the faithfulness of the reasoning process with respect to the generated code and analyse the steps of the multi-hop search without relying on external solvers.
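The facts-and-predicates representation is what makes each hop of the search inspectable. Below is a generic logic-programming toy (forward chaining over Prolog-like facts), shown only to illustrate the representation, not FLARE's actual formalisation.

```python
facts = {("parent", "ann", "bob"), ("parent", "bob", "cal")}
# rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)

def forward_chain(facts: set) -> set:
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (_, x, y1) in [f for f in derived if f[0] == "parent"]:
            for (_, y2, z) in [f for f in derived if f[0] == "parent"]:
                if y1 == y2 and ("grandparent", x, z) not in derived:
                    derived.add(("grandparent", x, z))
                    changed = True      # each addition is an inspectable hop
    return derived

print(("grandparent", "ann", "cal") in forward_chain(facts))  # True
```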
arXiv Detail & Related papers (2024-10-14T19:39:11Z)
- Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models [33.76903352835436]
Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities for capturing and reasoning over multimodal inputs.
These models are prone to parametric knowledge conflicts, which arise from inconsistencies of represented knowledge between their vision and language components.
We present a systematic approach to detect, interpret, and mitigate them.
arXiv Detail & Related papers (2024-10-04T17:59:28Z)
- Metareasoning in uncertain environments: a meta-BAMDP framework [1.0923877073891441]
Finding the right reasoning process $P$ can itself be framed as an optimization problem over the space of reasoning processes. This paper proposes a meta Bayes-Adaptive MDP framework to handle metareasoning in environments with unknown reward/transition distributions.
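For orientation, the base-level object is the familiar Bayes-adaptive value function over augmented states (physical state plus belief over the unknown dynamics); the meta level then charges reasoning itself a cost. The notation below is generic BAMDP notation, assumed for illustration rather than taken from the paper.

```latex
% Bayes-adaptive Bellman equation over augmented states (s, b), where b is the
% belief over the unknown reward/transition parameters:
V^*(s, b) = \max_{a \in \mathcal{A}} \Big[ r(s, b, a)
    + \gamma \sum_{s'} \Pr(s' \mid s, b, a)\, V^*\big(s', b'(s, a, s')\big) \Big]

% A meta-level objective can then trade decision quality against the cost c(P)
% of running reasoning process P:
P^* = \arg\max_{P} \; \mathbb{E}\big[ V^{P}(s, b) - c(P) \big]
```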
arXiv Detail & Related papers (2024-08-02T13:15:01Z)
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making. We present MR-Ben, a process-based benchmark that demands meta-reasoning skill. Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
- Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection [80.63946798650653]
The decision centers on whether to use a large LLM with better performance or a smaller one with reduced costs.
We propose a simpler solution: we use only the uncertainty of the small LLM's generations as the decision criterion.
Our experiments reveal this simple solution optimally balances cost and performance, outperforming existing methods on 25 out of 27 experimental setups.
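The rule is simple enough to sketch end to end: generate with the small model, measure the uncertainty of its own output (here, mean token negative log-probability), and escalate to the large model only above a threshold. Both model functions are stubs, and the threshold is an assumption a real deployment would tune on validation data.

```python
import math

def small_llm(prompt: str) -> tuple[str, list[float]]:
    # Stub: returns an answer plus per-token log-probs (a logprobs-enabled
    # completion API would supply these for a real model).
    return "Paris", [math.log(0.9), math.log(0.8)]

def large_llm(prompt: str) -> str:
    return "Paris"                              # expensive-model stub

def route(prompt: str, threshold: float = 0.5) -> str:
    answer, logps = small_llm(prompt)
    uncertainty = -sum(logps) / len(logps)      # mean NLL of the small model's own output
    return answer if uncertainty < threshold else large_llm(prompt)

print(route("What is the capital of France?"))
```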
arXiv Detail & Related papers (2024-05-03T14:38:59Z)
- Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning [31.972053219549757]
TREACLE is a reinforcement learning policy that jointly selects the model and prompting scheme while respecting the user's monetary cost and latency constraints.
Our evaluations show that TREACLE enables cost savings of up to 85% compared to baselines, while maintaining high accuracy.
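The decision TREACLE's learned policy faces at each query can be pictured as choosing a (model, prompt-scheme) arm under money and latency constraints. TREACLE learns this with reinforcement learning; the greedy rule and the accuracy/cost numbers below are made-up stand-ins for illustration.

```python
ARMS = [  # (name, est_accuracy, dollar_cost, latency_s) -- illustrative values
    ("small+direct", 0.62, 0.001, 0.3),
    ("small+cot",    0.71, 0.003, 0.9),
    ("large+cot",    0.88, 0.030, 2.5),
]

def choose_arm(budget: float, latency_cap: float) -> str:
    feasible = [a for a in ARMS if a[2] <= budget and a[3] <= latency_cap]
    if not feasible:
        return ARMS[0][0]                        # cheapest fallback
    return max(feasible, key=lambda a: a[1])[0]  # best accuracy within constraints

print(choose_arm(budget=0.005, latency_cap=1.0))  # 'small+cot'
```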
arXiv Detail & Related papers (2024-04-17T05:56:49Z)
- Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.
Our framework encourages divergent thinking in LLMs which would be helpful for tasks that require deep levels of contemplation.
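The control flow is worth spelling out: two sides rebut each other each round and a judge either ends the debate or lets it continue. This complements the prompt-diversification sketch after the main abstract; the `llm` stub and prompt wording here are assumptions.

```python
import random

def llm(prompt: str) -> str:
    # Toy stub so the loop runs; replace with a real model call.
    return random.choice(["Argument: consider edge cases.", "FINAL: 1 is not prime."])

def multi_agent_debate(question: str, max_rounds: int = 3) -> str:
    history: list[str] = []
    for _ in range(max_rounds):
        for side in ("affirmative", "negative"):   # "tit for tat" exchange
            history.append(f"{side}: " + llm(f"As the {side} debater on '{question}', rebut: {history}"))
        verdict = llm(f"As the judge, given {history}, answer '{question}' starting with FINAL, or reply CONTINUE.")
        if verdict.startswith("FINAL"):            # the judge manages the debate process
            return verdict
    return llm(f"As the judge, summarise {history} into a final answer for '{question}'.")

print(multi_agent_debate("Is 1 a prime number?"))
```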
arXiv Detail & Related papers (2023-05-30T15:25:45Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
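In spirit, the identify-emphasize recipe is: score each sample for how strongly it conflicts with the dataset bias, then rebalance so bias-conflicting samples are not drowned out. The scoring rule and per-group renormalisation below are simplified stand-ins for the paper's ECS and GA.

```python
import torch

losses = torch.tensor([0.20, 0.30, 0.25, 2.50])   # per-sample losses from a biased model
conflict_score = losses / losses.max()             # ECS stand-in: high loss ~ bias-conflicting
is_conflicting = conflict_score > 0.5              # tensor([False, False, False, True])

# GA stand-in: give each group equal total weight so the minority
# (bias-conflicting) group's gradient is not drowned out.
weights = torch.where(is_conflicting,
                      0.5 / is_conflicting.sum(),
                      0.5 / (~is_conflicting).sum())
balanced_loss = (weights * losses).sum()
print(balanced_loss.item())                        # 0.5*2.5 + (0.5/3)*(0.2+0.3+0.25) = 1.375
```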
arXiv Detail & Related papers (2023-02-22T14:50:24Z)