Coalitions of Large Language Models Increase the Robustness of AI Agents
- URL: http://arxiv.org/abs/2408.01380v1
- Date: Fri, 2 Aug 2024 16:37:44 GMT
- Title: Coalitions of Large Language Models Increase the Robustness of AI Agents
- Authors: Prattyush Mangal, Carol Mak, Theo Kanakis, Timothy Donovan, Dave Braines, Edward Pyzer-Knapp,
- Abstract summary: Large Language Models (LLMs) have fundamentally altered the way we interact with digital systems.
LLMs are powerful and capable of demonstrating some emergent properties, but struggle to perform well at all sub-tasks carried out by an AI agent.
We assess if a system comprising of a coalition of pretrained LLMs, each exhibiting specialised performance at individual sub-tasks, can match the performance of single model agents.
- Score: 3.216132991084434
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The emergence of Large Language Models (LLMs) have fundamentally altered the way we interact with digital systems and have led to the pursuit of LLM powered AI agents to assist in daily workflows. LLMs, whilst powerful and capable of demonstrating some emergent properties, are not logical reasoners and often struggle to perform well at all sub-tasks carried out by an AI agent to plan and execute a workflow. While existing studies tackle this lack of proficiency by generalised pretraining at a huge scale or by specialised fine-tuning for tool use, we assess if a system comprising of a coalition of pretrained LLMs, each exhibiting specialised performance at individual sub-tasks, can match the performance of single model agents. The coalition of models approach showcases its potential for building robustness and reducing the operational costs of these AI agents by leveraging traits exhibited by specific models. Our findings demonstrate that fine-tuning can be mitigated by considering a coalition of pretrained models and believe that this approach can be applied to other non-agentic systems which utilise LLMs.
Related papers
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z) - On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z) - On the limits of agency in agent-based models [13.130587222524305]
Agent-based modeling offers powerful insights into complex systems, but its practical utility has been limited by computational constraints.
Recent advancements in large language models (LLMs) could enhance ABMs with adaptive agents, but their integration into large-scale simulations remains challenging.
We present LLM archetypes, a technique that balances behavioral complexity with computational efficiency, allowing for nuanced agent behavior in large-scale simulations.
arXiv Detail & Related papers (2024-09-14T04:17:24Z) - xLAM: A Family of Large Action Models to Empower AI Agent Systems [111.5719694445345]
We release xLAM, a series of large action models designed for AI agent tasks.
xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z) - WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z) - Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities.
When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
arXiv Detail & Related papers (2024-03-29T03:48:12Z) - TrainerAgent: Customizable and Efficient Model Training through
LLM-Powered Multi-Agent System [14.019244136838017]
TrainerAgent is a multi-agent framework including Task, Data, Model and Server agents.
These agents analyze user-defined tasks, input data, and requirements (e.g., accuracy, speed), optimizing them from both data and model perspectives to obtain satisfactory models, and finally deploy these models as online service.
This research presents a significant advancement in achieving desired models with increased efficiency and quality as compared to traditional model development.
arXiv Detail & Related papers (2023-11-11T17:39:24Z) - SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision.
We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z) - Model-based versus Model-free Deep Reinforcement Learning for Autonomous
Racing Cars [46.64253693115981]
This paper investigates how model-based deep reinforcement learning agents generalize to real-world autonomous-vehicle control-tasks.
We show that model-based agents capable of learning in imagination, substantially outperform model-free agents with respect to performance, sample efficiency, successful task completion, and generalization.
arXiv Detail & Related papers (2021-03-08T17:15:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.