Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems
- URL: http://arxiv.org/abs/2502.15153v1
- Date: Fri, 21 Feb 2025 02:24:43 GMT
- Title: Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems
- Authors: Tianjie Ju, Bowen Wang, Hao Fei, Mong-Li Lee, Wynne Hsu, Yun Li, Qianren Wang, Pengzhou Cheng, Zongru Wu, Zhuosheng Zhang, Gongshen Liu
- Abstract summary: We design four comprehensive metrics to investigate the robustness of multi-agent systems (MASs). We first analyze mild knowledge conflicts introduced by heterogeneous agents and find that they do not harm system robustness but instead improve collaborative decision-making. Finally, we conduct ablation studies on the knowledge conflict number, agent number, and interaction rounds, finding that the self-repairing capability of MASs has intrinsic limits.
- Score: 39.390472904456836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Large Language Models (LLMs) have upgraded them from sophisticated text generators to autonomous agents capable of cooperation and tool use in multi-agent systems (MASs). However, the robustness of these LLM-based MASs, especially under knowledge conflicts, remains unclear. In this paper, we design four comprehensive metrics to investigate the robustness of MASs when facing mild or task-critical knowledge conflicts. We first analyze mild knowledge conflicts introduced by heterogeneous agents and find that they do not harm system robustness but instead improve collaborative decision-making. Next, we investigate task-critical knowledge conflicts by synthesizing knowledge conflicts and embedding them into one of the agents. Our results show that these conflicts have surprisingly little to no impact on MAS robustness. Furthermore, we observe that MASs demonstrate certain self-repairing capabilities by reducing their reliance on knowledge conflicts and adopting alternative solution paths to maintain stability. Finally, we conduct ablation studies on the knowledge conflict number, agent number, and interaction rounds, finding that the self-repairing capability of MASs has intrinsic limits, and all findings hold consistently across various factors. Our code is publicly available at https://github.com/wbw625/MultiAgentRobustness.
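The paper's actual experimental code lives at the GitHub repository above. As a rough, hypothetical sketch of the setup the abstract describes (synthesizing a task-critical knowledge conflict, embedding it into one agent, and checking whether the collective decision survives), consider the toy example below. The names Agent, inject_conflict, and collaborative_decision are illustrative inventions, not the authors' API, and real MAS agents would be LLM-backed rather than dictionary lookups:

```python
# Hypothetical sketch (not the authors' implementation; see the GitHub
# repository above): embed a synthesized task-critical knowledge conflict
# into one agent of a small MAS and aggregate a collaborative decision.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    knowledge: dict[str, str] = field(default_factory=dict)

    def answer(self, query: str) -> str:
        # A real MAS agent would call an LLM here; this toy agent simply
        # reads its (possibly conflicted) knowledge base.
        return self.knowledge.get(query, "unknown")

def inject_conflict(agent: Agent, query: str, conflicting_value: str) -> None:
    """Overwrite one agent's knowledge with a synthesized conflict."""
    agent.knowledge[query] = conflicting_value

def collaborative_decision(agents: list[Agent], query: str) -> str:
    """One interaction round: every agent answers, the majority answer wins."""
    answers = [agent.answer(query) for agent in agents]
    return max(set(answers), key=answers.count)

agents = [Agent(f"agent_{i}", {"capital_of_france": "Paris"}) for i in range(3)]
inject_conflict(agents[0], "capital_of_france", "Lyon")  # task-critical conflict
print(collaborative_decision(agents, "capital_of_france"))  # "Paris": the unaffected majority prevails
```

In this toy setting the unaffected majority outvotes the conflicted agent, loosely mirroring the self-repair behavior the paper reports; the intrinsic limit the ablations find is visible here too, since injecting the conflict into two of the three agents flips the vote.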
Related papers
- Why Do Multi-Agent LLM Systems Fail? [91.39266556855513]
We present MAST (Multi-Agent System Failure Taxonomy), the first empirically grounded taxonomy designed to understand MAS failures.
We analyze seven popular MAS frameworks across over 200 tasks, involving six expert human annotators.
We identify 14 unique failure modes, organized into 3 overarching categories: (i) specification issues, (ii) inter-agent misalignment, and (iii) task verification.
arXiv Detail & Related papers (2025-03-17T19:04:38Z)
- KOALA: Knowledge Conflict Augmentations for Robustness in Vision Language Models [6.52323086990482]
segsub is a framework that applies targeted perturbations to image sources to study and improve the robustness of vision language models. Contrary to prior findings, we find VLMs are largely robust to image perturbation. We find a link between hallucinations and image context, with GPT-4o prone to hallucination when presented with highly contextualized counterfactual examples.
arXiv Detail & Related papers (2025-02-19T00:26:38Z)
- Is Cognition consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding [15.828455477224516]
As a multimodal task, document understanding requires models to possess both perceptual and cognitive abilities.
In this paper, we define the conflicts between cognition and perception as Cognition and Perception (C&P) knowledge conflicts.
We propose a novel method called Multimodal Knowledge Consistency Fine-tuning to mitigate the C&P knowledge conflicts.
arXiv Detail & Related papers (2024-11-12T11:28:50Z)
- Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs [55.74117540987519]
This paper explores the problem of commonsense-level vision-knowledge conflict in Multimodal Large Language Models (MLLMs).
We introduce an automated pipeline, augmented with human-in-the-loop quality control, to establish a benchmark aimed at simulating and assessing the conflicts in MLLMs.
We evaluate the conflict-resolution capabilities of nine representative MLLMs across various model families and find a noticeable over-reliance on textual queries.
arXiv Detail & Related papers (2024-10-10T17:31:17Z)
- ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems.
This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
arXiv Detail & Related papers (2024-10-05T07:41:17Z)
- ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM [36.332500824079844]
Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts has rarely been studied.
We present ConflictBank, the first comprehensive benchmark developed to evaluate knowledge conflicts from three aspects.
Our investigation delves into four model families and twelve LLM instances, meticulously analyzing conflicts stemming from misinformation, temporal discrepancies, and semantic divergences.
arXiv Detail & Related papers (2024-08-22T02:33:13Z)
- Towards Rationality in Language and Multimodal Agents: A Survey [23.451887560567602]
This work discusses how to build more rational language and multimodal agents. Rationality is the quality of being guided by reason, characterized by decision-making that aligns with evidence and logical principles.
arXiv Detail & Related papers (2024-06-01T01:17:25Z)
- MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting.
We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems.
We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z)
- Resolving Knowledge Conflicts in Large Language Models [46.903549751371415]
Large language models (LLMs) often encounter knowledge conflicts.
We ask what are the desiderata for LLMs when a knowledge conflict arises and whether existing LLMs fulfill them.
We introduce an evaluation framework for simulating contextual knowledge conflicts.
arXiv Detail & Related papers (2023-10-02T06:57:45Z)
- Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs [55.66353783572259]
Causal-Consistency Chain-of-Thought harnesses multi-agent collaboration to bolster the faithfulness and causality of foundation models. Our framework demonstrates significant superiority over state-of-the-art methods through extensive and comprehensive evaluations.
arXiv Detail & Related papers (2023-08-23T04:59:21Z)