Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks
- URL: http://arxiv.org/abs/2510.06307v1
- Date: Tue, 07 Oct 2025 17:53:34 GMT
- Title: Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks
- Authors: Wentao Deng, Jiahuan Pei, Zhiwei Xu, Zhaochun Ren, Zhumin Chen, Pengjie Ren
- Abstract summary: We provide a theoretical framework for selecting optimal collaborators that maximize consensus stability. Based on these theorems, we propose the Belief-Calibrated Consensus Seeking (BCCS) framework to facilitate stable consensus. Experimental results on the MATH and MMLU benchmark datasets demonstrate that the proposed BCCS framework outperforms the best existing results.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A multi-agent system (MAS) enhances its capacity to solve complex natural language processing (NLP) tasks through collaboration among multiple agents, where consensus-seeking serves as a fundamental mechanism. However, existing consensus-seeking approaches typically rely on voting mechanisms to judge consensus, overlooking contradictions in system-internal beliefs that destabilize the consensus. Moreover, these methods often involve agents updating their results through indiscriminate collaboration with every other agent. Such uniform interaction fails to identify the optimal collaborators for each agent, hindering the emergence of a stable consensus. To address these challenges, we provide a theoretical framework for selecting optimal collaborators that maximize consensus stability. Based on the theorems, we propose the Belief-Calibrated Consensus Seeking (BCCS) framework to facilitate stable consensus via selecting optimal collaborators and calibrating the consensus judgment by system-internal beliefs. Experimental results on the MATH and MMLU benchmark datasets demonstrate that the proposed BCCS framework outperforms the best existing results by 2.23% and 3.95% of accuracy on challenging tasks, respectively. Our code and data are available at https://github.com/dengwentao99/BCCS.
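The abstract does not give BCCS's exact formulation, but the gap it targets can be illustrated with a toy contrast between vote-based consensus judgment and a belief-calibrated one. The sketch below is purely illustrative: the function names, the belief-weighting scheme, and the 0.6 threshold are assumptions for demonstration, not the paper's method.

```python
from collections import defaultdict

def majority_vote(answers):
    """Plain voting: the consensus candidate is simply the most frequent answer."""
    counts = defaultdict(int)
    for a in answers:
        counts[a] += 1
    return max(counts, key=counts.get)

def belief_calibrated_consensus(answers, beliefs, threshold=0.6):
    """Judge consensus by belief-weighted mass instead of raw counts.

    answers: each agent's answer
    beliefs: each agent's self-reported confidence in [0, 1]
    Returns (candidate, reached): `reached` is True only if the leading
    answer's belief-weighted share exceeds `threshold`.
    """
    mass = defaultdict(float)
    for a, b in zip(answers, beliefs):
        mass[a] += b
    total = sum(mass.values())
    candidate = max(mass, key=mass.get)
    return candidate, mass[candidate] / total >= threshold

answers = ["42", "42", "41", "42"]
beliefs = [0.2, 0.3, 0.9, 0.25]   # the lone dissenter is highly confident

print(majority_vote(answers))                       # "42": voting declares consensus
print(belief_calibrated_consensus(answers, beliefs))  # ("41", False): beliefs contradict the vote
```

Here a vote count alone would declare consensus on "42", while the agents' internal beliefs concentrate on "41" without clearing the stability threshold, so no consensus is declared. This is the kind of belief-vote contradiction the abstract argues destabilizes vote-only consensus seeking.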
Related papers
- Fairness-Aware Performance Evaluation for Multi-Party Multi-Objective Optimization [4.330406936708466]
We develop a fairness-aware performance evaluation framework for MPMOPs. We formalize four axioms that a fairness-aware evaluation function for MPMOPs should satisfy.
arXiv Detail & Related papers (2026-01-30T03:09:58Z) - Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia [100.74015791021044]
Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction. Existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains.
arXiv Detail & Related papers (2025-12-03T00:11:05Z) - Maestro: Learning to Collaborate via Conditional Listwise Policy Optimization for Multi-Agent LLMs [23.590034731179824]
We present Maestro, a principled paradigm for collaboration through role orchestration that structurally decouples cognitive modes. Maestro uses a collective of parallel Execution Agents for diverse exploration and a specialized Central Agent for convergent, evaluative synthesis. Experiments on mathematical reasoning and general problem-solving benchmarks demonstrate that Maestro, coupled with CLPO, consistently outperforms existing state-of-the-art multi-agent approaches.
arXiv Detail & Related papers (2025-11-08T21:01:27Z) - Multi-Agent Debate for LLM Judges with Adaptive Stability Detection [46.67172123607961]
We propose a multi-agent debate judge framework where agents collaboratively reason and iteratively refine their responses. We formalize the debate process mathematically, analyzing agent interactions and proving that debate amplifies correctness compared to static ensembles. Experiments across multiple benchmarks and models demonstrate that our framework improves judgment accuracy over majority voting while maintaining computational efficiency.
arXiv Detail & Related papers (2025-10-14T16:30:30Z) - CP-uniGuard: A Unified, Probability-Agnostic, and Adaptive Framework for Malicious Agent Detection and Defense in Multi-Agent Embodied Perception Systems [21.478631468402977]
Collaborative Perception (CP) has been shown to be a promising technique for multi-agent autonomous driving and multi-agent robotic systems. In CP, an ego agent needs to receive messages from its collaborators, which makes it vulnerable to attacks from malicious agents. We propose a unified, probability-agnostic, and adaptive framework, namely, CP-uniGuard, to accurately detect and eliminate malicious agents in its collaboration network.
arXiv Detail & Related papers (2025-06-28T14:02:14Z) - Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination [16.74629849552254]
We propose a model-based consensus mechanism to explicitly coordinate multiple agents.
The proposed Multi-agent Goal Imagination (MAGI) framework guides agents to reach consensus with an Imagined common goal.
We show that such an efficient consensus mechanism can guide all agents to cooperatively reach valuable future states.
arXiv Detail & Related papers (2024-03-05T18:07:34Z) - Pure Exploration in Asynchronous Federated Bandits [57.02106627533004]
We study the federated pure exploration problem of multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm via communicating with the central server.
We propose the first asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence.
arXiv Detail & Related papers (2023-10-17T06:04:00Z) - Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a reward.
We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to a specific and possibly unknown policy.
We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner.
arXiv Detail & Related papers (2023-08-29T18:18:21Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - Trust-based Consensus in Multi-Agent Reinforcement Learning Systems [5.778852464898369]
This paper investigates the problem of unreliable agents in multi-agent reinforcement learning (MARL).
We propose Reinforcement Learning-based Trusted Consensus (RLTC), a decentralized trust mechanism.
We empirically demonstrate that our trust mechanism is able to handle unreliable agents effectively, as evidenced by higher consensus success rates.
arXiv Detail & Related papers (2022-05-25T15:58:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.