Related papers: Managing Uncertainty in LLM-based Multi-Agent System Operation

Managing Uncertainty in LLM-based Multi-Agent System Operation

URL: http://arxiv.org/abs/2602.23005v1
Date: Thu, 26 Feb 2026 13:49:16 GMT
Title: Managing Uncertainty in LLM-based Multi-Agent System Operation
Authors: Man Zhang, Tao Yue, Yihua He,
Abstract summary: We propose a lifecycle-based uncertainty management framework for safety-critical multi-agent software systems.<n>We demonstrate the feasibility of the framework using a real-world LLM-based multi-agent echocardiographic software system developed in clinical collaboration.
Score: 4.565220771764574
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Applying LLM-based multi-agent software systems in safety-critical domains such as lifespan echocardiography introduces system-level risks that cannot be addressed by improving model accuracy alone. During system operation, beyond individual LLM behavior, uncertainty propagates through agent coordination, data pipelines, human-in-the-loop interaction, and runtime control logic. Yet existing work largely treats uncertainty at the model level rather than as a first-class software engineering concern. This paper approaches uncertainty from both system-level and runtime perspectives. We first differentiate epistemological and ontological uncertainties in the context of LLM-based multi-agent software system operation. Building on this foundation, we propose a lifecycle-based uncertainty management framework comprising four mechanisms: representation, identification, evolution, and adaptation. The uncertainty lifecycle governs how uncertainties emerge, transform, and are mitigated across architectural layers and execution phases, enabling structured runtime governance and controlled adaptation. We demonstrate the feasibility of the framework using a real-world LLM-based multi-agent echocardiographic software system developed in clinical collaboration, showing improved reliability and diagnosability in diagnostic reasoning. The proposed approach generalizes to other safety-critical LLM-based multi-agent software systems, supporting principled operational control and runtime assurance beyond model-centric methods.

Related papers

Beyond Single-Agent Safety: A Taxonomy of Risks in LLM-to-LLM Interactions [0.0]
This paper examines why safety mechanisms designed for human-model interaction do not scale to environments where large language models interact with each other.<n>We propose a conceptual transition from model-level safety to system-level safety, introducing the framework of the Emergent Systemic Risk Horizon (ESRH)<n>The paper contributes (i) a theoretical account of collective risk in interacting LLMs, (ii) a taxonomy connecting micro, meso, and macro-level failure modes, and (iii) a design proposal for InstitutionalAI, an architecture for embedding adaptive oversight within multi-agent systems.
arXiv Detail & Related papers (2025-12-02T12:06:57Z)
SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations.<n>An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback.<n>Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
Failure Modes in LLM Systems: A System-Level Taxonomy for Reliable AI Applications [0.0]
Large language models (LLMs) are being rapidly integrated into decision-support tools, automation, and AI-enabled software systems.<n>This paper presents a system-level taxonomy of fifteen hidden failure modes that arise in real-world LLM applications.
arXiv Detail & Related papers (2025-11-25T05:19:23Z)
The Subtle Art of Defection: Understanding Uncooperative Behaviors in LLM based Multi-Agent Systems [22.357102759752234]
This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse multi-agent systems.<n>Our framework includes two key components: (1) a game theory-based taxonomy of uncooperative agent behaviors, and (2) a multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve.
arXiv Detail & Related papers (2025-11-19T20:39:19Z)
Fundamentals of Building Autonomous LLM Agents [64.39018305018904]
This paper reviews the architecture and implementation methods of agents powered by large language models (LLMs)<n>The research aims to explore patterns to develop "agentic" LLMs that can automate complex tasks and bridge the performance gap with human capabilities.
arXiv Detail & Related papers (2025-10-10T10:32:39Z)
A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System [56.40989626804489]
This survey provides the first holistic analysis of Large Language Models-powered software engineering.<n>We review over 150 recent papers and propose a taxonomy along two key dimensions: (1) Solutions, categorized into prompt-based, fine-tuning-based, and agent-based paradigms, and (2) Benchmarks, including tasks such as code generation, translation, and repair.
arXiv Detail & Related papers (2025-10-10T06:56:50Z)
Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems [0.0]
This report addresses the early stages of risk identification and analysis for multi-agent AI systems.<n>We examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics.
arXiv Detail & Related papers (2025-08-06T06:06:57Z)
Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs)<n>We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines.<n>We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z)
A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making.<n>With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems.<n>We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z)
Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning. We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent Control [3.3788926259119645]
In decentralized multi-agent control, systems are complex with unknown or highly uncertain dynamics. Deep reinforcement learning (DRL) is promising to learn the controller/policy from data without the knowing system dynamics. Existing multi-agent reinforcement learning (MARL) algorithms cannot ensure the closed-loop stability of a multi-agent system. We propose a new MARL algorithm for decentralized multi-agent control with a stability guarantee.
arXiv Detail & Related papers (2020-09-20T06:11:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.