From Language Models to Practical Self-Improving Computer Agents
- URL: http://arxiv.org/abs/2404.11964v1
- Date: Thu, 18 Apr 2024 07:50:10 GMT
- Title: From Language Models to Practical Self-Improving Computer Agents
- Authors: Alex Sheng
- Abstract summary: We develop a methodology to create AI computer agents that can carry out diverse computer tasks and self-improve.
We prompt an LLM agent to augment itself with retrieval, internet search, web navigation, and text editor capabilities.
The agent effectively uses these various tools to solve problems including automated software development and web-based tasks.
- Score: 0.8547032097715571
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a simple and straightforward methodology to create AI computer agents that can carry out diverse computer tasks and self-improve by developing tools and augmentations to enable themselves to solve increasingly complex tasks. As large language models (LLMs) have been shown to benefit from non-parametric augmentations, a significant body of recent work has focused on developing software that augments LLMs with various capabilities. Rather than manually developing static software to augment LLMs through human engineering effort, we propose that an LLM agent can systematically generate software to augment itself. We show, through a few case studies, that a minimal querying loop with appropriate prompt engineering allows an LLM to generate and use various augmentations, freely extending its own capabilities to carry out real-world computer tasks. Starting with only terminal access, we prompt an LLM agent to augment itself with retrieval, internet search, web navigation, and text editor capabilities. The agent effectively uses these various tools to solve problems including automated software development and web-based tasks.
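The "minimal querying loop" with terminal access mentioned in the abstract can be pictured roughly as follows. This is an illustrative sketch only, not the author's implementation: the names `run_shell`, `query_loop`, and `scripted_model`, the `DONE` stop signal, and the transcript format are all assumptions made for this example, and the scripted stand-in replaces a real LLM call so the code runs offline.

```python
import subprocess
from typing import Callable, List


def run_shell(command: str, timeout: float = 10.0) -> str:
    """Run one shell command and return its combined stdout and stderr."""
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "[timed out]"
    return (result.stdout + result.stderr).strip()


def query_loop(
    model: Callable[[List[str]], str], task: str, max_steps: int = 5
) -> List[str]:
    """Alternate between asking the model for a command and executing it.

    The transcript (task, commands, outputs) is the only state. The loop
    stops when the model replies DONE (a hypothetical convention) or when
    the step budget runs out.
    """
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        reply = model(transcript).strip()
        if reply == "DONE":
            break
        transcript.append(f"$ {reply}")
        transcript.append(run_shell(reply))
    return transcript


def scripted_model(transcript: List[str]) -> str:
    """Stand-in for an LLM: replays a fixed script based on the step count."""
    script = ["echo hello", "DONE"]
    # Each completed step adds two transcript lines (command, output).
    steps_taken = (len(transcript) - 1) // 2
    return script[min(steps_taken, len(script) - 1)]


if __name__ == "__main__":
    for line in query_loop(scripted_model, "print a greeting"):
        print(line)
```

In a loop of this shape, an "augmentation" would simply be a file the model writes through shell commands (for example, a search or retrieval script) and then invokes in later steps. Executing model-generated commands with `shell=True` is unsafe outside a sandbox, so any real use would need isolation.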
Related papers
- Agentless: Demystifying LLM-based Software Engineering Agents [12.19683999553113]
We build Agentless -- an agentless approach to automatically solve software development problems.
Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation.
Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance and low cost.
arXiv Detail & Related papers (2024-07-01T17:24:45Z) - SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that facilitates LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z) - AgentLite: A Lightweight Library for Building and Advancing
Task-Oriented LLM Agent System [91.41155892086252]
We open-source a new AI agent library, AgentLite, which simplifies research investigation into LLM agents.
AgentLite is a task-oriented framework designed to enhance the ability of agents to break down tasks.
We introduce multiple practical applications developed with AgentLite to demonstrate its convenience and flexibility.
arXiv Detail & Related papers (2024-02-23T06:25:20Z) - Offline Training of Language Model Agents with Functions as Learnable Weights [39.88545362699836]
We present a novel paradigm of training Large Language Models (LLMs) agents without modifying the LLM weights.
We develop AgentOptimizer, which employs the LLM to update agents' functions, and devise an agent training algorithm with two strategies, roll-back and early-stop.
With extensive experiments, we showcase that the agent training paradigm could significantly improve the performance of representative LLM agents.
arXiv Detail & Related papers (2024-02-17T18:31:21Z) - Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security [34.67477557318947]
We focus on Personal LLM Agents, which are LLM-based agents that are deeply integrated with personal data and personal devices.
We envision that Personal LLM Agents will become a major software paradigm for end-users in the upcoming era.
arXiv Detail & Related papers (2024-01-10T09:25:45Z) - Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development.
We introduce Experiential Co-Learning, a novel LLM-agent learning framework.
Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z) - LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution [18.816077341295628]
We present LLMind, a task-oriented AI framework that enables effective collaboration among IoT devices.
Inspired by the functional specialization theory of the brain, our framework integrates an LLM with domain-specific AI modules.
Complex tasks, which may involve collaborations of multiple domain-specific AI modules and IoT devices, are executed through a control script.
arXiv Detail & Related papers (2023-12-14T14:57:58Z) - CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z) - AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation [61.455159391215915]
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.
AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools.
arXiv Detail & Related papers (2023-08-16T05:57:52Z) - TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage [28.554981886052953]
Large Language Models (LLMs) have emerged as powerful tools for various real-world applications.
Despite their prowess, intrinsic generative abilities of LLMs may prove insufficient for handling complex tasks.
This paper proposes a structured framework tailored for LLM-based AI Agents.
arXiv Detail & Related papers (2023-08-07T09:22:03Z) - CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability.
We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization.
We evaluate CREATOR on the MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems and diverse tabular contents.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.