Aligning Agents like Large Language Models
- URL: http://arxiv.org/abs/2406.04208v1
- Date: Thu, 6 Jun 2024 16:05:45 GMT
- Title: Aligning Agents like Large Language Models
- Authors: Adam Jelley, Yuhan Cao, Dave Bignell, Sam Devlin, Tabish Rashid
- Abstract summary: Training agents to behave as desired in complex 3D environments from high-dimensional sensory information is challenging.
We draw an analogy between the undesirable behaviors of imitation learning agents and the unhelpful responses of unaligned large language models (LLMs).
We demonstrate that we can align our agent to consistently perform the desired mode, while providing insights and advice for successfully applying this approach to training agents.
- Score: 8.873319874424167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training agents to behave as desired in complex 3D environments from high-dimensional sensory information is challenging. Imitation learning from diverse human behavior provides a scalable approach for training an agent with a sensible behavioral prior, but such an agent may not perform the specific behaviors of interest when deployed. To address this issue, we draw an analogy between the undesirable behaviors of imitation learning agents and the unhelpful responses of unaligned large language models (LLMs). We then investigate how the procedure for aligning LLMs can be applied to aligning agents in a 3D environment from pixels. For our analysis, we utilize an academically illustrative part of a modern console game in which the human behavior distribution is multi-modal, but we want our agent to imitate a single mode of this behavior. We demonstrate that we can align our agent to consistently perform the desired mode, while providing insights and advice for successfully applying this approach to training agents. Project webpage at https://adamjelley.github.io/aligning-agents-like-llms .
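The abstract casts imitation learning as the pre-training stage, with a subsequent alignment stage playing the role of LLM fine-tuning, but it does not specify the alignment objective. The following is a minimal sketch assuming a DPO-style preference loss over paired trajectories from the desired and undesired behavior modes; all names and the choice of DPO are illustrative assumptions, not the paper's confirmed method:

```python
import torch
import torch.nn.functional as F

def trajectory_logp(policy, observations, actions):
    """Summed log-probability of an action sequence under a policy.

    `policy` maps a batch of observations to action logits;
    the shapes used here are illustrative assumptions.
    """
    logits = policy(observations)                       # (T, num_actions)
    logps = F.log_softmax(logits, dim=-1)
    return logps.gather(-1, actions.unsqueeze(-1)).sum()

def dpo_trajectory_loss(policy, ref_policy, preferred, rejected, beta=0.1):
    """DPO-style loss: prefer trajectories from the desired behavior mode
    over trajectories from other modes, anchored to the frozen
    imitation-learned reference policy."""
    obs_p, act_p = preferred
    obs_r, act_r = rejected
    with torch.no_grad():                               # reference is frozen
        ref_p = trajectory_logp(ref_policy, obs_p, act_p)
        ref_r = trajectory_logp(ref_policy, obs_r, act_r)
    pol_p = trajectory_logp(policy, obs_p, act_p)
    pol_r = trajectory_logp(policy, obs_r, act_r)
    margin = beta * ((pol_p - ref_p) - (pol_r - ref_r))
    return -F.logsigmoid(margin)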
Related papers
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation to build such agents.
We take the first step towards building generally-capable LLM-based agents with self-evolution ability.
We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z)
- Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks, however, they are still far inferior to API-based models when acting as agents.
This paper delivers three key observations: (1) the current agent training corpus entangles format following with agent reasoning, shifting significantly from the distribution of the pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches introduce hallucinations as a side effect of improving agent abilities.
We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
arXiv Detail & Related papers (2024-03-19T16:26:10Z)
- Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation [7.647395374489533]
We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions.
We show that our approach generates explanations as helpful as those produced by a human domain expert.
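Since the summary says explanations are produced from observed states and actions alone, here is a minimal sketch of how such observations might be packed into a prompt for an off-the-shelf LLM; the record format and prompt wording are assumptions, not the paper's method:

```python
def build_explanation_prompt(trajectory):
    """Format observed (state, action) pairs into a prompt asking an LLM
    to explain the agent's behavior. `trajectory` is a list of
    (state_description, action_name) string pairs -- an assumed format."""
    lines = [f"Step {t}: state={s}, action={a}"
             for t, (s, a) in enumerate(trajectory)]
    return ("The following are observed states and actions of an agent.\n"
            + "\n".join(lines)
            + "\nIn plain language, explain what the agent is trying to do.")

# Usage with a hypothetical observation log:
prompt = build_explanation_prompt([
    ("low health, enemy nearby", "retreat"),
    ("health pack visible", "move_to_health_pack"),
])
```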
arXiv Detail & Related papers (2023-11-29T20:16:23Z)
- On Generative Agents in Recommendation [58.42840923200071]
Agent4Rec is a user simulator for recommendation based on Large Language Models.
Each agent interacts with personalized recommender models in a page-by-page manner.
arXiv Detail & Related papers (2023-10-16T06:41:16Z)
- AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z)
- Behavioral Cloning via Search in Video PreTraining Latent Space [0.13999481573773073]
We formulate our control problem as a search problem over a dataset of experts' demonstrations.
We perform a proximity search over the BASALT MineRL-dataset in the latent representation of a Video PreTraining model.
The agent copies the actions from the expert trajectory as long as the distance between the state representations of the agent and the selected expert trajectory does not diverge.
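The summary above describes the whole control loop, so a minimal sketch may help. This assumes a generic pretrained `encoder` standing in for the Video PreTraining model, a flat array of encoded expert states, a distance threshold, and an environment whose `step` returns the next observation and a done flag; all of these are illustrative assumptions:

```python
import numpy as np

def act_by_search(encoder, expert_latents, expert_actions, env, obs,
                  max_distance=1.0):
    """Copy actions from the nearest expert trajectory until the agent's
    latent state diverges from it, then re-search.

    expert_latents: (N, d) array of encoded expert states.
    expert_actions: list of N actions aligned with expert_latents.
    """
    while True:
        z = encoder(obs)                                   # (d,) latent
        # Proximity search over the demonstration dataset in latent space.
        idx = int(np.argmin(np.linalg.norm(expert_latents - z, axis=1)))
        # Follow the selected expert trajectory while it stays close.
        while idx < len(expert_actions):
            dist = np.linalg.norm(encoder(obs) - expert_latents[idx])
            if dist > max_distance:
                break                          # diverged: re-run the search
            obs, done = env.step(expert_actions[idx])  # assumed env API
            if done:
                return
            idx += 1
```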
arXiv Detail & Related papers (2022-12-27T00:20:37Z)
- Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning.
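For orientation, the successor-feature identities underlying ITD and ΨΦ-learning can be written compactly. The sketch below is a simplified one-step TD update on successor features, not the paper's full algorithm; the array shapes, learning rate, and bootstrap target are assumptions:

```python
import numpy as np

# Successor-feature identities (a sketch under simplifying assumptions):
#   r(s, a)   ~ phi(s, a) . w                      reward as linear cue features
#   psi(s, a) = phi(s, a) + gamma * E[psi(s', a')] TD recursion
#   Q(s, a)   = psi(s, a) . w                      implied action value

def itd_update(psi, phi, next_psi, w, gamma=0.99, lr=0.1):
    """One TD step on successor features from a demonstration transition."""
    target = phi + gamma * next_psi        # bootstrap from next state's psi
    psi_new = psi + lr * (target - psi)    # move psi toward the TD target
    q_value = psi_new @ w                  # Q from recovered reward weights
    return psi_new, q_value
```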
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Variational Autoencoders for Opponent Modeling in Multi-Agent Systems [9.405879323049659]
Multi-agent systems exhibit complex behaviors that emanate from the interactions of multiple agents in a shared environment.
In this work, we are interested in controlling a single agent in a multi-agent system and learning to interact successfully with other agents that have fixed policies.
Modeling the behavior of other agents (opponents) is essential in understanding the interactions of the agents in the system.
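A minimal sketch of that idea: encode an opponent's observed trajectory into a latent variable with a VAE, so the controlled agent can condition on it. The MLP architecture, dimensions, and fixed-length trajectory encoding are illustrative assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn

class OpponentVAE(nn.Module):
    """Encode an opponent's trajectory into a latent summarizing its policy."""
    def __init__(self, traj_dim=64, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(traj_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(                # reconstruct trajectory
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, traj_dim))

    def forward(self, traj):
        h = self.encoder(traj)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.decoder(z), mu, logvar, z

def vae_loss(recon, traj, mu, logvar):
    """Reconstruction error plus KL(q(z|traj) || N(0, I))."""
    rec = ((recon - traj) ** 2).sum(dim=-1).mean()
    kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=-1).mean()
    return rec + kl
```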
arXiv Detail & Related papers (2020-01-29T13:38:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.