Multi-agent Undercover Gaming: Hallucination Removal via Counterfactual Test for Multimodal Reasoning
- URL: http://arxiv.org/abs/2511.11182v1
- Date: Fri, 14 Nov 2025 11:27:55 GMT
- Title: Multi-agent Undercover Gaming: Hallucination Removal via Counterfactual Test for Multimodal Reasoning
- Authors: Dayong Liang, Xiao-Yong Wei, Changmeng Zheng
- Abstract summary: Hallucination poses a major obstacle in the reasoning capabilities of large language models. We introduce the Multi-agent Undercover Gaming (MUG) protocol, inspired by social deduction games like "Who is Undercover?". MUG reframes MAD as a process of detecting "undercover" agents (those suffering from hallucinations) by employing multimodal counterfactual tests.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hallucination continues to pose a major obstacle in the reasoning capabilities of large language models (LLMs). Although the Multi-Agent Debate (MAD) paradigm offers a promising solution by promoting consensus among multiple agents to enhance reliability, it relies on the unrealistic assumption that all debaters are rational and reflective, a condition that may not hold when agents themselves are prone to hallucinations. To address this gap, we introduce the Multi-agent Undercover Gaming (MUG) protocol, inspired by social deduction games like "Who is Undercover?". MUG reframes MAD as a process of detecting "undercover" agents (those suffering from hallucinations) by employing multimodal counterfactual tests. Specifically, we modify reference images to introduce counterfactual evidence and observe whether agents can accurately identify these changes, providing ground truth for identifying hallucinating agents and enabling robust, crowd-powered multimodal reasoning. MUG advances MAD protocols along three key dimensions: (1) enabling factual verification beyond statistical consensus through counterfactual testing; (2) introducing cross-evidence reasoning via dynamically modified evidence sources instead of relying on static inputs; and (3) fostering active reasoning, where agents engage in probing discussions rather than passively answering questions. Collectively, these innovations offer a more reliable and effective framework for multimodal reasoning in LLMs. The source code can be accessed at https://github.com/YongLD/MUG.git.
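The counterfactual-test idea in the abstract can be sketched in a few lines. This is a minimal illustrative toy, not the authors' implementation: the agent interface, the `mug_round` function, and the detection flag are all assumptions made for this sketch. The idea is to perturb the reference evidence, ask each agent whether it notices the change, flag agents that fail the probe as "undercover" (hallucinating), and let only the remaining agents vote on the final answer.

```python
# Hypothetical sketch of a MUG-style counterfactual test.
# Names and logic are illustrative assumptions, not the paper's code.
from collections import Counter
from typing import Callable, Dict, List, Tuple

# An agent maps (question, evidence) -> (answer, claims_evidence_was_modified)
Agent = Callable[[str, str], Tuple[str, bool]]

def mug_round(
    question: str,
    original_evidence: str,
    modified_evidence: str,
    agents: Dict[str, Agent],
) -> Tuple[str, List[str]]:
    """Run one counterfactual test; return (consensus answer, undercover agents)."""
    undercover: List[str] = []
    votes: Counter = Counter()
    for name, agent in agents.items():
        # Counterfactual probe: a grounded agent should detect the edit.
        _, detected = agent(question, modified_evidence)
        if not detected:
            undercover.append(name)  # failed the probe -> likely hallucinating
            continue
        # Only agents that passed the probe vote on the original evidence.
        answer, _ = agent(question, original_evidence)
        votes[answer] += 1
    consensus = votes.most_common(1)[0][0] if votes else "no consensus"
    return consensus, undercover

# Toy agents: "grounded" actually reads the evidence; "hallucinator" ignores it.
def grounded(question: str, evidence: str) -> Tuple[str, bool]:
    return ("red" if "red" in evidence else "blue", "EDITED" in evidence)

def hallucinator(question: str, evidence: str) -> Tuple[str, bool]:
    return ("green", False)  # answers from its prior, never checks the evidence

agents = {"a1": grounded, "a2": grounded, "a3": hallucinator}
answer, flagged = mug_round(
    "What colour is the car?",
    original_evidence="photo: a red car",
    modified_evidence="photo: a red car [EDITED]",
    agents=agents,
)
print(answer, flagged)  # -> red ['a3']
```

In the paper the evidence is an image and the probe is a visual edit; here both are stand-in strings so the control flow stays visible. The key design point the abstract highlights is that the probe supplies ground truth, so hallucinating agents are identified by fact rather than outvoted by consensus.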
Related papers
- AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent [57.10083973844841]
AgentArk is a novel framework to distill multi-agent dynamics into the weights of a single model. We investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of a single agent while exhibiting the strong reasoning and self-correction performance of multiple agents.
arXiv Detail & Related papers (2026-02-03T19:18:28Z) - DynaDebate: Breaking Homogeneity in Multi-Agent Debate with Dynamic Path Generation [47.62978918069135]
We introduce Dynamic Multi-Agent Debate (DynaDebate), which enhances the effectiveness of multi-agent debate through three key mechanisms. Extensive experiments demonstrate that DynaDebate achieves superior performance across various benchmarks, surpassing existing state-of-the-art MAD methods.
arXiv Detail & Related papers (2026-01-09T12:01:33Z) - Tool-MAD: A Multi-Agent Debate Framework for Fact Verification with Diverse Tool Augmentation and Adaptive Retrieval [10.62333858188658]
Multi-Agent Debate (MAD) systems aim to improve answer accuracy by enabling multiple LLM agents to engage in dialogue. Existing MAD frameworks primarily rely on internal knowledge or static documents, making them vulnerable to hallucinations. We propose Tool-MAD, a multi-agent debate framework that enhances factual verification by assigning each agent a distinct external tool.
arXiv Detail & Related papers (2026-01-08T09:07:41Z) - InEx: Hallucination Mitigation via Introspection and Cross-Modal Multi-Agent Collaboration [6.103123418191468]
InEx is a training-free, multi-agent framework designed to autonomously mitigate hallucination. InEx consistently outperforms existing methods, achieving 4%-27% gains on general and hallucination benchmarks.
arXiv Detail & Related papers (2025-12-02T17:59:52Z) - Towards Scalable Oversight with Collaborative Multi-Agent Debate in Error Detection [81.52796950244705]
Self-diagnosis is unreliable on complex tasks unless aided by reliable external feedback. We introduce a new collaborative MAD protocol, termed ColMAD, that reframes MAD as a non-zero-sum game. We show that ColMAD significantly outperforms previous competitive MAD by 19%.
arXiv Detail & Related papers (2025-10-23T19:46:00Z) - LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions [80.12078194093013]
We present the first comprehensive survey of hallucinations in LLM-based agents. We propose a new taxonomy that identifies different types of agent hallucinations occurring at different stages. We conduct an in-depth examination of eighteen triggering causes underlying the emergence of agent hallucinations.
arXiv Detail & Related papers (2025-09-23T13:24:48Z) - Can an Individual Manipulate the Collective Decisions of Multi-Agents? [53.01767232004823]
M-Spoiler is a framework that simulates agent interactions within a multi-agent system to generate adversarial samples. M-Spoiler introduces a stubborn agent that actively aids in optimizing adversarial samples. Our findings confirm the risks posed by the knowledge of an individual agent in multi-agent systems.
arXiv Detail & Related papers (2025-09-20T01:54:20Z) - The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind [8.341160422849969]
Decrypto is a game-based benchmark for multi-agent reasoning and ToM. It is the first platform for designing interactive ToM experiments. We find that LLM game-playing abilities lag behind both humans and simple word-embedding baselines.
arXiv Detail & Related papers (2025-06-25T17:55:27Z) - Mitigating Manipulation and Enhancing Persuasion: A Reflective Multi-Agent Approach for Legal Argument Generation [3.99322081587874]
Large Language Models (LLMs) are increasingly explored for legal argument generation. They pose significant risks of manipulation through hallucination and ungrounded persuasion. This paper introduces a novel reflective multi-agent method designed to address these challenges.
arXiv Detail & Related papers (2025-06-03T15:28:30Z) - Breaking Event Rumor Detection via Stance-Separated Multi-Agent Debate [21.342632695285364]
Leveraging large language models (LLMs) for rumor detection holds significant promise. We propose the Stance-Separated Multi-Agent Debate (S2MAD) to address this issue. Our proposed model outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-12-06T08:52:30Z) - MAD-Sherlock: Multi-Agent Debate for Visual Misinformation Detection [36.12673167913763]
We introduce MAD-Sherlock, a multi-agent debate system for out-of-context misinformation detection. MAD-Sherlock frames detection as a multi-agent debate, reflecting the diverse and conflicting discourse found online. Our framework is domain- and time-agnostic, requiring no finetuning, yet achieves state-of-the-art accuracy with in-depth explanations.
arXiv Detail & Related papers (2024-10-26T10:34:22Z) - Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.
Our framework encourages divergent thinking in LLMs which would be helpful for tasks that require deep levels of contemplation.
arXiv Detail & Related papers (2023-05-30T15:25:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.