Related papers: Coalitions of Large Language Models Increase the Robustness of AI Agents

Coalitions of Large Language Models Increase the Robustness of AI Agents

URL: http://arxiv.org/abs/2408.01380v1
Date: Fri, 2 Aug 2024 16:37:44 GMT
Title: Coalitions of Large Language Models Increase the Robustness of AI Agents
Authors: Prattyush Mangal, Carol Mak, Theo Kanakis, Timothy Donovan, Dave Braines, Edward Pyzer-Knapp,
Abstract summary: Large Language Models (LLMs) have fundamentally altered the way we interact with digital systems. LLMs are powerful and capable of demonstrating some emergent properties, but struggle to perform well at all sub-tasks carried out by an AI agent. We assess if a system comprising of a coalition of pretrained LLMs, each exhibiting specialised performance at individual sub-tasks, can match the performance of single model agents.
Score: 3.216132991084434
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The emergence of Large Language Models (LLMs) have fundamentally altered the way we interact with digital systems and have led to the pursuit of LLM powered AI agents to assist in daily workflows. LLMs, whilst powerful and capable of demonstrating some emergent properties, are not logical reasoners and often struggle to perform well at all sub-tasks carried out by an AI agent to plan and execute a workflow. While existing studies tackle this lack of proficiency by generalised pretraining at a huge scale or by specialised fine-tuning for tool use, we assess if a system comprising of a coalition of pretrained LLMs, each exhibiting specialised performance at individual sub-tasks, can match the performance of single model agents. The coalition of models approach showcases its potential for building robustness and reducing the operational costs of these AI agents by leveraging traits exhibited by specific models. Our findings demonstrate that fine-tuning can be mitigated by considering a coalition of pretrained models and believe that this approach can be applied to other non-agentic systems which utilise LLMs.

Related papers

SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model [88.04128601981145]
We introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning.<n>modelname overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation.<n>World-model-based planning, in particular, shows consistent advantage of up to 124% over autoregressive planning.
arXiv Detail & Related papers (2025-07-31T17:57:20Z)
Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks.<n>Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions.<n>We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z)
Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One [28.264011412168347]
Model ensemble is a useful approach in reinforcement learning (RL) for training effective agents.<n>We propose LLM-Ens, a novel approach that enhances RL model ensemble with task-specific semantic understandings.
arXiv Detail & Related papers (2025-05-21T09:35:43Z)
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks. We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly. In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
On the limits of agency in agent-based models [13.130587222524305]
Agent-based modeling offers powerful insights into complex systems, but its practical utility has been limited by computational constraints. Recent advancements in large language models (LLMs) could enhance ABMs with adaptive agents, but their integration into large-scale simulations remains challenging. We present LLM archetypes, a technique that balances behavioral complexity with computational efficiency, allowing for nuanced agent behavior in large-scale simulations.
arXiv Detail & Related papers (2024-09-14T04:17:24Z)
xLAM: A Family of Large Action Models to Empower AI Agent Systems [111.5719694445345]
We release xLAM, a series of large action models designed for AI agent tasks. xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z)
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence. WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities. When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
arXiv Detail & Related papers (2024-03-29T03:48:12Z)
TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System [14.019244136838017]
TrainerAgent is a multi-agent framework including Task, Data, Model and Server agents. These agents analyze user-defined tasks, input data, and requirements (e.g., accuracy, speed), optimizing them from both data and model perspectives to obtain satisfactory models, and finally deploy these models as online service. This research presents a significant advancement in achieving desired models with increased efficiency and quality as compared to traditional model development.
arXiv Detail & Related papers (2023-11-11T17:39:24Z)
SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision. We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z)
Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars [46.64253693115981]
This paper investigates how model-based deep reinforcement learning agents generalize to real-world autonomous-vehicle control-tasks. We show that model-based agents capable of learning in imagination, substantially outperform model-free agents with respect to performance, sample efficiency, successful task completion, and generalization.
arXiv Detail & Related papers (2021-03-08T17:15:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.