AgentTuning: Enabling Generalized Agent Abilities for LLMs
- URL: http://arxiv.org/abs/2310.12823v2
- Date: Sun, 22 Oct 2023 16:19:16 GMT
- Title: AgentTuning: Enabling Generalized Agent Abilities for LLMs
- Authors: Aohan Zeng, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, Jie Tang
- Abstract summary: We present AgentTuning, a simple and general method to enhance the agent abilities of open large language models.
We employ a hybrid instruction-tuning strategy by combining AgentInstruct with open-source instructions from general domains.
Our evaluations show that AgentTuning enables LLMs' agent capabilities without compromising general abilities.
- Score: 35.74502545364593
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open large language models (LLMs) with great performance in various tasks
have significantly advanced the development of LLMs. However, they are far
inferior to commercial models such as ChatGPT and GPT-4 when acting as agents
to tackle complex tasks in the real world. These agent tasks employ LLMs as the
central controller responsible for planning, memorization, and tool
utilization, necessitating both fine-grained prompting methods and robust LLMs
to achieve satisfactory performance. Though many prompting methods have been
proposed to complete particular agent tasks, there is a lack of research focusing
on improving the agent capabilities of LLMs themselves without compromising
their general abilities. In this work, we present AgentTuning, a simple and
general method to enhance the agent abilities of LLMs while maintaining their
general LLM capabilities. We construct AgentInstruct, a lightweight
instruction-tuning dataset containing high-quality interaction trajectories. We
employ a hybrid instruction-tuning strategy by combining AgentInstruct with
open-source instructions from general domains. AgentTuning is used to
instruction-tune the Llama 2 series, resulting in AgentLM. Our evaluations show
that AgentTuning enables LLMs' agent capabilities without compromising general
abilities. The AgentLM-70B is comparable to GPT-3.5-turbo on unseen agent
tasks, demonstrating generalized agent capabilities. We open source the
AgentInstruct and AgentLM-7B, 13B, and 70B models at
https://github.com/THUDM/AgentTuning, serving as open and powerful alternatives to
commercial LLMs for agent tasks.
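The hybrid instruction-tuning strategy described in the abstract can be illustrated with a minimal sketch of data mixing. This is not the authors' implementation: the function name `mix_datasets`, the `agent_ratio` parameter, the toy records, and the per-example sampling scheme are all assumptions made for illustration, and the abstract does not state the mixing ratio actually used.

```python
import random
from typing import Dict, List

Example = Dict[str, str]


def mix_datasets(
    agent_data: List[Example],
    general_data: List[Example],
    agent_ratio: float,
    n_samples: int,
    seed: int = 0,
) -> List[Example]:
    """Sample a training mixture from two instruction datasets.

    Each sampled example is drawn from agent_data with probability
    agent_ratio, otherwise from general_data. The ratio is a
    hypothetical knob; the abstract does not give a value.
    """
    if not 0.0 <= agent_ratio <= 1.0:
        raise ValueError("agent_ratio must lie in [0, 1]")
    if not agent_data or not general_data:
        raise ValueError("both datasets must be non-empty")

    rng = random.Random(seed)  # seeded so the mixture is reproducible
    mixed: List[Example] = []
    for _ in range(n_samples):
        source = agent_data if rng.random() < agent_ratio else general_data
        mixed.append(rng.choice(source))
    return mixed


# Toy stand-ins for agent interaction trajectories and general instructions.
agent_trajectories = [
    {"source": "agent", "text": f"trajectory {i}"} for i in range(5)
]
general_instructions = [
    {"source": "general", "text": f"instruction {i}"} for i in range(5)
]

batch = mix_datasets(
    agent_trajectories, general_instructions, agent_ratio=0.5, n_samples=8
)
n_agent = sum(ex["source"] == "agent" for ex in batch)
print(n_agent, "of", len(batch), "sampled examples are agent trajectories")
```

Sampling per example keeps the sketch short; a real pipeline might instead fix exact counts per source or weight the loss, which the abstract leaves unspecified.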
Related papers
- AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents [52.13695464678006]
This study enhances an LLM-based web agent by simply refining its observation and action space.
AgentOccam surpasses the previous state-of-the-art and concurrent work by 9.8 (+29.4%) and 5.9 (+15.8%) absolute points respectively.
Date: 2024-10-17T17:50:38Z
- AGILE: A Novel Reinforcement Learning Framework of LLM Agents [7.982249117182315]
We introduce a novel reinforcement learning framework of LLM agents designed to perform complex conversational tasks with users.
The agent possesses capabilities beyond conversation, including reflection, tool usage, and expert consultation.
Our experiments show that AGILE agents based on 7B and 13B LLMs trained with PPO can outperform GPT-4 agents.
Date: 2024-05-23T16:17:44Z
- Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities.
When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
Date: 2024-03-29T03:48:12Z
- Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents.
This paper delivers three key observations: (1) the current agent training corpus is entangled with both formats following and agent reasoning, which significantly shifts from the distribution of its pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches have side-effects when improving agent abilities by introducing hallucinations.
We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
Date: 2024-03-19T16:26:10Z
- EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents [65.38474102119181]
We propose EnvGen, a framework to adaptively create training environments.
We train a small RL agent in a mixture of the original and LLM-generated environments.
We find that a small RL agent trained with EnvGen can outperform SOTA methods, including a GPT-4 agent, and learns long-horizon tasks significantly faster.
Date: 2024-03-18T17:51:16Z
- AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System [91.41155892086252]
We open-source a new AI agent library, AgentLite, which simplifies research investigation into LLM agents.
AgentLite is a task-oriented framework designed to enhance the ability of agents to break down tasks.
We introduce multiple practical applications developed with AgentLite to demonstrate its convenience and flexibility.
Date: 2024-02-23T06:25:20Z
- Offline Training of Language Model Agents with Functions as Learnable Weights [39.88545362699836]
We present a novel paradigm of training Large Language Models (LLMs) agents without modifying the LLM weights.
We develop Agentr that employs the LLM to update agents' functions and devise an agent training algorithm with two strategies, roll-back and early-stop.
With extensive experiments, we showcase that the agent training paradigm could significantly improve the performance of representative LLM agents.
Date: 2024-02-17T18:31:21Z
- AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks.
We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
Date: 2023-08-07T16:08:11Z
This list is automatically generated from the titles and abstracts of the papers on this site.