Related papers: Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems

Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems

URL: http://arxiv.org/abs/2508.05687v1
Date: Wed, 06 Aug 2025 06:06:57 GMT
Title: Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems
Authors: Alistair Reid, Simon O'Callaghan, Liam Carroll, Tiberio Caetano,
Abstract summary: This report addresses the early stages of risk identification and analysis for multi-agent AI systems.<n>We examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Organisations are starting to adopt LLM-based AI agents, with their deployments naturally evolving from single agents towards interconnected, multi-agent networks. Yet a collection of safe agents does not guarantee a safe collection of agents, as interactions between agents over time create emergent behaviours and induce novel failure modes. This means multi-agent systems require a fundamentally different risk analysis approach than that used for a single agent. This report addresses the early stages of risk identification and analysis for multi-agent AI systems operating within governed environments where organisations control their agent configurations and deployment. In this setting, we examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics. For each, we provide a toolkit for practitioners to extend or integrate into their existing frameworks to assess these failure modes within their organisational contexts. Given fundamental limitations in current LLM behavioural understanding, our approach centres on analysis validity, and advocates for progressively increasing validity through staged testing across stages of abstraction and deployment that gradually increases exposure to potential negative impacts, while collecting convergent evidence through simulation, observational analysis, benchmarking, and red teaming. This methodology establishes the groundwork for robust organisational risk management as these LLM-based multi-agent systems are deployed and operated.

Related papers

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage [59.3826294523924]
We investigate the security vulnerabilities of a popular multi-agent pattern known as the orchestrator setup.<n>We report the susceptibility of frontier models to different categories of attacks, finding that both reasoning and non-reasoning models are vulnerable.
arXiv Detail & Related papers (2026-02-13T21:32:32Z)
The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering.<n>We propose a novel framework for textbfgeneral agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome.<n>We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z)
Beyond Task Completion: An Assessment Framework for Evaluating Agentic AI Systems [0.0]
Recent advances in agentic AI have shifted the focus from standalone Large Language Models to integrated systems.<n>We propose an end-to-end Agent Assessment Framework with four evaluation pillars encompassing LLMs, Memory, Tools, and Environment.<n>We validate the framework on a representative Autonomous CloudOps use case, where experiments reveal behavioral deviations by conventional metrics.
arXiv Detail & Related papers (2025-12-14T18:17:40Z)
The Evolution of Agentic AI in Cybersecurity: From Single LLM Reasoners to Multi-Agent Systems and Autonomous Pipelines [0.0]
Cybersecurity has become one of the earliest adopters of agentic AI.<n>This survey presents a five-generation taxonomy of agentic AI in cybersecurity.
arXiv Detail & Related papers (2025-12-07T05:10:16Z)
SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations.<n>An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback.<n>Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
A Survey on Agentic Multimodal Large Language Models [84.18778056010629]
We present a comprehensive survey on Agentic Multimodal Large Language Models (Agentic MLLMs)<n>We explore the emerging paradigm of agentic MLLMs, delineating their conceptual foundations and distinguishing characteristics from conventional MLLM-based agents.<n>To further accelerate research in this area for the community, we compile open-source training frameworks, training and evaluation datasets for developing agentic MLLMs.
arXiv Detail & Related papers (2025-10-13T04:07:01Z)
CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage [10.088447487211893]
Security Operations Centers (SOCs) are overwhelmed by tens of thousands of daily alerts.<n>This overload creates alert fatigue, leading to overlooked threats and analyst burnout.<n>We propose CORTEX, a multi-agent LLM architecture for high-stakes alert triage.
arXiv Detail & Related papers (2025-09-30T22:09:31Z)
Who's the Mole? Modeling and Detecting Intention-Hiding Malicious Agents in LLM-Based Multi-Agent Systems [25.6233463223145]
We study intention-hiding threats in multi-agent systems powered by Large Language Models (LLM-MAS)<n>We design four representative attack paradigms that subtly disrupt task completion while maintaining a high degree of stealth.<n>To counter these threats, we propose AgentXposed, a psychology-inspired detection framework.
arXiv Detail & Related papers (2025-07-07T07:34:34Z)
SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents.<n>It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information.<n>It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z)
Kaleidoscopic Teaming in Multi Agent Simulations [75.47388708240042]
We argue that existing red teaming or safety evaluation frameworks fall short in evaluating safety risks in complex behaviors, thought processes and actions taken by agents.<n>We introduce new in-context optimization techniques that can be used in our kaleidoscopic teaming framework to generate better scenarios for safety analysis.<n>We present appropriate metrics that can be used along with our framework to measure safety of agents.
arXiv Detail & Related papers (2025-06-20T23:37:17Z)
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems [2.462408812529728]
This review presents a structured analysis of textbfTrust, Risk, and Security Management (TRiSM) in the context of LLM-based Agentic Multi-Agent Systems (AMAS)<n>We begin by examining the conceptual foundations of Agentic AI and highlight its architectural distinctions from traditional AI agents.<n>We then adapt and extend the AI TRiSM framework for Agentic AI, structured around four key pillars: Explainability, ModelOps, Security, Privacy and Governance.
arXiv Detail & Related papers (2025-06-04T16:26:11Z)
Assessing Collective Reasoning in Multi-Agent LLMs via Hidden Profile Tasks [5.120446836495469]
We introduce the Hidden Profile paradigm from social psychology as a diagnostic testbed for multi-agent LLM systems.<n>By distributing critical information asymmetrically across agents, the paradigm reveals how inter-agent dynamics support or hinder collective reasoning.<n>We find that while cooperative agents are prone to over-coordination in collective settings, increased contradiction impairs group convergence.
arXiv Detail & Related papers (2025-05-15T19:22:54Z)
Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks.<n>We identify three key failure modes based on agents' incentives, as well as seven key risk factors.<n>We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z)
Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents [67.07177243654485]
This survey collects and analyzes the different threats faced by large language models-based agents. We identify six key features of LLM-based agents, based on which we summarize the current research progress. We select four representative agents as case studies to analyze the risks they may face in practical use.
arXiv Detail & Related papers (2024-11-14T15:40:04Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.<n>We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks. The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals. We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
arXiv Detail & Related papers (2024-08-12T21:38:40Z)
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [74.16170899755281]
We introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents.<n>AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit.<n>This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront.
arXiv Detail & Related papers (2024-01-24T01:51:00Z)
Relative Distributed Formation and Obstacle Avoidance with Multi-agent Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL) Our method achieves better performance regarding formation error, formation convergence rate and on-par success rate of obstacle avoidance compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.