On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
- URL: http://arxiv.org/abs/2408.00989v3
- Date: Tue, 28 Jan 2025 07:45:50 GMT
- Title: On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
- Authors: Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Michael R. Lyu, Maarten Sap
- Abstract summary: Large language model-based multi-agent systems have shown great abilities across various tasks due to the collaboration of expert agents.
However, the impact of clumsy or even malicious agents on the overall performance of the system remains underexplored.
This paper investigates the resilience of various system structures under faulty agents.
- Score: 58.79302663733703
- License:
- Abstract: Large language model-based multi-agent systems have shown great abilities across various tasks due to the collaboration of expert agents, each focusing on a specific domain. However, the impact of clumsy or even malicious agents, i.e., those who frequently make errors in their tasks, on the overall performance of the system remains underexplored. This paper investigates: (1) What is the resilience of various system structures (e.g., A$\rightarrow$B$\rightarrow$C, A$\leftrightarrow$B$\leftrightarrow$C) under faulty agents, on different downstream tasks? (2) How can we increase system resilience to defend against these agents? To simulate faulty agents, we propose two approaches, AutoTransform and AutoInject, which introduce mistakes into the agents' responses. We select four downstream tasks, including code generation, math problems, translation, and text evaluation. Results suggest that the hierarchical structure, i.e., A$\rightarrow$(B$\leftrightarrow$C), exhibits superior resilience with the lowest performance drop of $9.2\%$, compared to $26.0\%$ and $31.2\%$ of other two structures. Additionally, we improve the system resilience with two methods, introducing a mechanism for each agent to challenge others' outputs, and an additional agent to review and correct messages. Our code and data are available at https://github.com/CUHK-ARISE/MAS-Resilience.
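The setup described in the abstract (a chain of agents such as A$\rightarrow$B$\rightarrow$C, with mistakes introduced into one agent's responses, and resilience read off as a performance drop) can be illustrated with a toy sketch. This is not the paper's AutoTransform or AutoInject code: the helper names `make_agent`, `inject_errors`, `run_linear`, `accuracy`, and `performance_drop` are hypothetical, the agents are plain string functions rather than LLM calls, and the drop is assumed here to be a simple difference between clean and faulty accuracy, which may differ from the metric the authors report.

```python
import random
from typing import Callable, List

# An "agent" here is just a function from an incoming message to an outgoing one.
# (Assumption for illustration; real agents would wrap LLM calls.)
Agent = Callable[[str], str]


def make_agent(tag: str) -> Agent:
    """Toy agent that appends its tag to the message it receives."""
    def agent(message: str) -> str:
        return message + tag
    return agent


def inject_errors(agent: Agent, error_rate: float,
                  corrupt: Callable[[str], str], rng: random.Random) -> Agent:
    """Wrap an agent so its output is corrupted with probability error_rate."""
    def faulty(message: str) -> str:
        output = agent(message)
        if rng.random() < error_rate:
            return corrupt(output)
        return output
    return faulty


def run_linear(agents: List[Agent], task: str) -> str:
    """Pass the task through the agents in order (an A -> B -> C chain)."""
    message = task
    for agent in agents:
        message = agent(message)
    return message


def accuracy(agents: List[Agent], tasks: List[str], expected: List[str]) -> float:
    """Fraction of tasks whose final output matches the expected answer."""
    correct = sum(run_linear(agents, t) == e for t, e in zip(tasks, expected))
    return correct / len(tasks)


def performance_drop(clean_score: float, faulty_score: float) -> float:
    """Assumed metric: clean score minus score with a faulty agent present."""
    return clean_score - faulty_score


if __name__ == "__main__":
    tasks = ["x", "y"]
    expected = ["xabc", "yabc"]
    clean = [make_agent(t) for t in "abc"]

    rng = random.Random(0)
    # Make the middle agent (B) faulty on every message.
    faulty = [clean[0],
              inject_errors(clean[1], 1.0, lambda s: s + "!", rng),
              clean[2]]

    clean_acc = accuracy(clean, tasks, expected)
    faulty_acc = accuracy(faulty, tasks, expected)
    print("drop:", performance_drop(clean_acc, faulty_acc))
```

In a purely linear chain like this one, nothing downstream can repair a corrupted message, so an always-faulty middle agent breaks every task. A challenge mechanism or a reviewing agent, as proposed in the abstract, would correspond to adding a step that inspects and possibly rewrites the message before it moves on.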
Related papers
- Preventing Rogue Agents Improves Multi-Agent Collaboration [21.955058255432974]
Multi-agent systems, where specialized agents collaborate to solve a shared task, hold great potential.
A single agent can cause the entire system to fail.
In this work, we propose to $\textit{monitor}$ agents during action prediction and $\textit{intervene}$ when a future error is likely to occur.
arXiv Detail & Related papers (2025-02-09T18:35:08Z)
- Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems [42.137278756052595]
$\texttt{AgentPrune}$ can seamlessly integrate into mainstream multi-agent systems.
(I) integrates seamlessly into existing multi-agent frameworks with $28.1\%\sim72.8\%\downarrow$ token reduction.
(III) successfully defends against two types of agent-based adversarial attacks with $3.5\%\sim10.8\%\uparrow$ performance boost.
arXiv Detail & Related papers (2024-10-03T14:14:31Z)
- Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z)
- Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.
We find that we can successfully break the latest agents that use black-box frontier LMs, including those that perform reflection and tree search.
We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
- A Unified Debugging Approach via LLM-Based Multi-Agent Synergy [39.11825182386288]
FixAgent is an end-to-end framework for unified debugging through multi-agent synergy.
It significantly outperforms state-of-the-art repair methods, fixing 1.25$\times$ to 2.56$\times$ bugs on the repo-level benchmark, Defects4J.
arXiv Detail & Related papers (2024-04-26T04:55:35Z)
- Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents.
This paper delivers three key observations: (1) the current agent training corpus is entangled with both format following and agent reasoning, which significantly shifts from the distribution of its pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches have side effects when improving agent abilities by introducing hallucinations.
We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
arXiv Detail & Related papers (2024-03-19T16:26:10Z)
- Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation [25.308341461293857]
OKR-Agent is designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.
Our framework includes two novel modules: hierarchical Objects and Key Results generation and multi-level evaluation.
arXiv Detail & Related papers (2023-11-28T06:16:30Z)
- A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [55.35849138235116]
We propose automatically selecting a team of agents from candidates to collaborate in a dynamic communication structure toward different tasks and domains.
Specifically, we build a framework named Dynamic LLM-Powered Agent Network (DyLAN) for LLM-powered agent collaboration.
We demonstrate that DyLAN outperforms strong baselines in code generation, decision-making, general reasoning, and arithmetic reasoning tasks with moderate computational cost.
arXiv Detail & Related papers (2023-10-03T16:05:48Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems [3.9599054392856488]
Quadratic analysis is challenging in Multi-Agent Reinforcement Learning (MARL).
We propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem.
We show that our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound.
arXiv Detail & Related papers (2020-01-27T23:37:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.