Related papers: Adaptive Confidence Gating in Multi-Agent Collaboration for Efficient and Optimized Code Generation

Adaptive Confidence Gating in Multi-Agent Collaboration for Efficient and Optimized Code Generation

URL: http://arxiv.org/abs/2601.21469v1
Date: Thu, 29 Jan 2026 09:48:15 GMT
Title: Adaptive Confidence Gating in Multi-Agent Collaboration for Efficient and Optimized Code Generation
Authors: Haoji Zhang, Yuzhe Li, Zhenqiang Liu, Chenyang Liu, Shenyang Zhang, Yi Zhou,
Abstract summary: DebateCoder is a multi-agent collaborative framework designed to improve the reasoning ability of Small Language Models (SLMs)<n>It uses a structured role-playing protocol with three agents: User Agent (A_UA), Technical Agent (A_TA), and Quality Assurance Agent (A_QA)<n>It also includes an Adaptive Confidence Gating mechanism with a 95% threshold to balance accuracy and inference efficiency.
Score: 13.994379905835716
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While Large Language Models (LLMs) have catalyzed breakthroughs in automated code generation, Small Language Models (SLMs) often encounter reasoning bottlenecks and failure loops when addressing complex logical requirements. To overcome these challenges, we propose DebateCoder, a multi-agent collaborative framework designed to improve the reasoning ability of SLMs (e.g., Pangu-1B) in resource-constrained environments. DebateCoder uses a structured role-playing protocol with three agents: User Agent (A_UA), Technical Agent (A_TA), and Quality Assurance Agent (A_QA). It also includes an Adaptive Confidence Gating mechanism with a 95% threshold to balance accuracy and inference efficiency. In addition, we introduce a multi-turn deliberation module and a reviewer-guided analytical debugging loop for orthogonal pre-generation debate and post-generation refinement. Experiments on HumanEval and MBPP show that DebateCoder achieves 70.12% Pass@1 on HumanEval, outperforming MapCoder while reducing API overhead by about 35%. These results indicate that collaborative protocols can mitigate limitations of small-parameter models and provide a scalable, efficient approach to high-quality automated software engineering.

Related papers

AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework [4.782965804438204]
Large Language Models (LLMs) demonstrate potentials for automating scientific code generation but face challenges in reliability, error propagation, and evaluation.<n>We present a Bayesian adversarial multi-agent framework specifically designed for AI for Science (AI4S) tasks in the form of a Low-code Platform (LCP)<n>Three LLM-based agents are coordinated under the Bayesian framework: a Task Manager that structures user inputs into actionable plans and adaptive test cases, a Code Generator that produces candidate solutions, and an Evaluator providing comprehensive feedback.
arXiv Detail & Related papers (2026-03-03T18:25:00Z)
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent [57.10083973844841]
AgentArk is a novel framework to distill multi-agent dynamics into the weights of a single model.<n>We investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios.<n>By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting strong reasoning and self-correction performance of multiple agents.
arXiv Detail & Related papers (2026-02-03T19:18:28Z)
Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis [10.951981109673119]
Agentic Proposing is a framework that models problem synthesis as a goal-driven sequential decision process.<n>It generates high-precision, verifiable training trajectories across mathematics, coding, and science.<n>A 30B solver trained on only 11,000 synthesized trajectories achieves a state-of-the-art 91.6% accuracy on AIME25.
arXiv Detail & Related papers (2026-02-03T09:02:53Z)
ComAgent: Multi-LLM based Agentic AI Empowered Intelligent Wireless Networks [62.031889234230725]
6G networks rely on complex cross-layer optimization.<n> manually translating high-level intents into mathematical formulations remains a bottleneck.<n>We present ComAgent, a multi-LLM agentic AI framework.
arXiv Detail & Related papers (2026-01-27T13:43:59Z)
Towards Efficient Agents: A Co-Design of Inference Architecture and System [66.59916327634639]
This paper presents AgentInfer, a unified framework for end-to-end agent acceleration.<n>We decompose the problem into four synergistic components: AgentCollab, AgentSched, AgentSAM, and AgentCompress.<n>Experiments on the BrowseComp-zh and DeepDiver benchmarks demonstrate that through the synergistic collaboration of these methods, AgentInfer reduces ineffective token consumption by over 50%.
arXiv Detail & Related papers (2025-12-20T12:06:13Z)
Benefits and Limitations of Communication in Multi-Agent Reasoning [11.788489289062312]
We propose a theoretical framework to analyze the expressivity of multi-agent systems.<n>We derive bounds on (i) the number of agents required to solve the task exactly, (ii) the quantity and structure of inter-agent communication, and (iii) the achievable speedups as problem size and context scale.<n>Our results identify regimes where communication is provably beneficial, delineate tradeoffs between agent count and bandwidth, and expose intrinsic limitations when either resource is constrained.
arXiv Detail & Related papers (2025-10-14T20:04:27Z)
Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models [99.85131798240808]
We introduce a novel generative framework called textitGuided Topology Diffusion (GTD)<n>Inspired by conditional discrete graph diffusion models, GTD formulates topology synthesis as an iterative construction process.<n>At each step, the generation is steered by a lightweight proxy model that predicts multi-objective rewards.<n>Experiments show that GTD can generate highly task-adaptive, sparse, and efficient communication topologies.
arXiv Detail & Related papers (2025-10-09T05:28:28Z)
AgentAsk: Multi-Agent Systems Need to Ask [26.13279490836716]
Multi-agent systems built on large language models (LLMs) promise enhanced problem-solving capabilities through collaborative division of labor.<n>We propose AgentAsk, a lightweight and plug-and-play clarification module that treats every inter-agent message as a potential failure point and inserts minimally necessary questions to arrest error propagation.<n>AgentAsk consistently improves accuracy and robustness over public multi-agent implementations while keeping overhead minimal, with latency and extra cost all less than 5%.
arXiv Detail & Related papers (2025-10-08T22:36:05Z)
Reasoning-Aware Prompt Orchestration: A Foundation Model for Multi-Agent Language Model Coordination [0.0]
We present a theoretically-grounded framework for dynamic prompt orchestration that enhances reasoning across multiple specialized agents.<n>This framework addresses three core challenges: logical consistency preservation during agent transitions, reasoning-aware prompt adaptation, and scalable coordination of distributed inference.<n> Experimental results on 1,000 synthetic multi-agent conversations demonstrate a 42% reduction in reasoning latency, a 23% improvement in logical consistency measured by ROUGE-L score, and an 89% success rate for task completion without context loss.
arXiv Detail & Related papers (2025-09-30T22:33:01Z)
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language models.<n>Controlled Decoding provides a mechanism for aligning a model at inference time without retraining.<n>We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
MaCTG: Multi-Agent Collaborative Thought Graph for Automatic Programming [10.461509044478278]
MaCTG (MultiAgent Collaborative Thought Graph) is a novel multi-agent framework that employs a dynamic graph structure.<n>It autonomously assigns agent roles based on programming requirements, dynamically refines task distribution, and systematically verifies and integrates project-level code.<n>MaCTG significantly reduced operational costs by 89.09% compared to existing multi-agent frameworks.
arXiv Detail & Related papers (2024-10-25T01:52:15Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [49.362750475706235]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks.<n>We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model.<n> Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.