Related papers: Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering

Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering

URL: http://arxiv.org/abs/2505.03096v1
Date: Tue, 06 May 2025 01:13:14 GMT
Title: Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering
Authors: Joshua Owotogbe,
Abstract summary: This study explores the application of chaos engineering to enhance the robustness of Large Language Model-Based Multi-Agent Systems (LLM-MAS) in production-like environments under real-world conditions.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This study explores the application of chaos engineering to enhance the robustness of Large Language Model-Based Multi-Agent Systems (LLM-MAS) in production-like environments under real-world conditions. LLM-MAS can potentially improve a wide range of tasks, from answering questions and generating content to automating customer support and improving decision-making processes. However, LLM-MAS in production or preproduction environments can be vulnerable to emergent errors or disruptions, such as hallucinations, agent failures, and agent communication failures. This study proposes a chaos engineering framework to proactively identify such vulnerabilities in LLM-MAS, assess and build resilience against them, and ensure reliable performance in critical applications.

Related papers

SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models [8.912091484067508]
We introduce SV-LLM, a novel multi-agent assistant system designed to automate and enhance system-on-chip (SoC) security verification.<n>By integrating specialized agents for tasks like verification question answering, security asset identification, threat modeling, test plan and property generation, vulnerability detection, and simulation-based bug validation, SV-LLM streamlines the workflow.<n>The system aims to reduce manual intervention, improve accuracy, and accelerate security analysis, supporting proactive identification and mitigation of risks early in the design cycle.
arXiv Detail & Related papers (2025-06-25T13:31:13Z)
Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search [18.317836598695706]
Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs)<n>Our framework integrates a prompt template search mechanism to mitigate the SLMs' sensitivity to prompt variations.<n>Our approach improves the reasoning capabilities of SLMs without increasing model size or requiring additional training.
arXiv Detail & Related papers (2025-06-10T10:30:43Z)
Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems [25.882461853973897]
We propose Multi-Agent Heterogeneous Group Policy Optimization (MHGPO), which guides policy updates by estimating relative reward advantages.<n>MHGPO eliminates the need for Critic networks, enhancing stability and reducing computational overhead.<n>We also introduce three group rollout sampling strategies that trade off between efficiency and effectiveness.
arXiv Detail & Related papers (2025-06-03T10:17:19Z)
Comprehensive Vulnerability Analysis is Necessary for Trustworthy LLM-MAS [28.69485468744812]
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly deployed in high-stakes applications.<n>LLM-MAS introduces unique attack surfaces through inter-agent communication, trust relationships, and tool integration.<n>This paper presents a systematic framework for vulnerability analysis of LLM-MAS that unifies diverse research.
arXiv Detail & Related papers (2025-06-02T01:46:15Z)
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
Gym-style framework for systematically reinforcement learning, evaluating, and improving autonomous large language model (LLM) agents.<n>MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios.<n>Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z)
A Trustworthy Multi-LLM Network: Challenges,Solutions, and A Use Case [59.58213261128626]
We propose a blockchain-enabled collaborative framework that connects multiple Large Language Models (LLMs) into a Trustworthy Multi-LLM Network (MultiLLMN)<n>This architecture enables the cooperative evaluation and selection of the most reliable and high-quality responses to complex network optimization problems.
arXiv Detail & Related papers (2025-05-06T05:32:46Z)
LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection [0.0]
Large Language Models (LLMs) for vulnerability detection presents unique challenges.<n>Previous attempts employing machine learning models for vulnerability detection have proven ineffective.<n>We propose a robust AI-driven approach focused on mitigating these limitations.
arXiv Detail & Related papers (2025-04-25T15:30:40Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.<n>However, they still struggle with problems requiring multi-step decision-making and environmental feedback.<n>We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Solution-oriented Agent-based Models Generation with Verifier-assisted Iterative In-context Learning [10.67134969207797]
Agent-based models (ABMs) stand as an essential paradigm for proposing and validating hypothetical solutions or policies. Large language models (LLMs) encapsulating cross-domain knowledge and programming proficiency could potentially alleviate the difficulty of this process. We present SAGE, a general solution-oriented ABM generation framework designed for automatic modeling and generating solutions for targeted problems.
arXiv Detail & Related papers (2024-02-04T07:59:06Z)
AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
arXiv Detail & Related papers (2023-08-07T16:08:11Z)
On the Risk of Misinformation Pollution with Large Language Models [127.1107824751703]
We investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation. Our study reveals that LLMs can act as effective misinformation generators, leading to a significant degradation in the performance of Open-Domain Question Answering (ODQA) systems.
arXiv Detail & Related papers (2023-05-23T04:10:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.