HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- URL: http://arxiv.org/abs/2303.17580v4
- Date: Sun, 3 Dec 2023 18:17:21 GMT
- Title: HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- Authors: Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang
- Abstract summary: Large language models (LLMs) have exhibited exceptional abilities in language understanding, generation, interaction, and reasoning.
LLMs could act as a controller to manage existing AI models to solve complicated AI tasks.
We present HuggingGPT, an LLM-powered agent that connects various AI models in machine learning communities.
- Score: 85.25054021362232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving complicated AI tasks with different domains and modalities is a key
step toward artificial general intelligence. While there are numerous AI models
available for various domains and modalities, they cannot handle complicated AI
tasks autonomously. Considering large language models (LLMs) have exhibited
exceptional abilities in language understanding, generation, interaction, and
reasoning, we advocate that LLMs could act as a controller to manage existing
AI models to solve complicated AI tasks, with language serving as a generic
interface to empower this. Based on this philosophy, we present HuggingGPT, an
LLM-powered agent that leverages LLMs (e.g., ChatGPT) to connect various AI
models in machine learning communities (e.g., Hugging Face) to solve AI tasks.
Specifically, we use ChatGPT to conduct task planning when receiving a user
request, select models according to their function descriptions available in
Hugging Face, execute each subtask with the selected AI model, and summarize
the response according to the execution results. By leveraging the strong
language capability of ChatGPT and abundant AI models in Hugging Face,
HuggingGPT can tackle a wide range of sophisticated AI tasks spanning different
modalities and domains and achieve impressive results in language, vision,
speech, and other challenging tasks, which paves a new way towards the
realization of artificial general intelligence.
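The abstract describes a four-stage controller loop: the LLM plans subtasks, selects expert models by their Hugging Face function descriptions, executes each subtask, and summarizes the results. Below is a minimal Python sketch of that control flow under stated assumptions; the `llm_chat`, `list_models`, and `run_model` callables are hypothetical placeholders standing in for the LLM, the model-description lookup, and model inference, and this is not the paper's actual implementation or any real Hugging Face API.

```python
# Sketch of a HuggingGPT-style pipeline: plan -> select -> execute -> summarize.
# llm_chat, list_models, and run_model are hypothetical callables supplied by the caller.
import json
from typing import Callable


def plan_tasks(llm_chat: Callable[[str], str], user_request: str) -> list[dict]:
    """Stage 1: ask the LLM to decompose the request into structured subtasks."""
    prompt = (
        "Decompose the user request into a JSON list of subtasks, each with "
        '"task" (e.g. "image-classification"), "id", and "args" fields.\n'
        f"Request: {user_request}"
    )
    return json.loads(llm_chat(prompt))


def select_model(llm_chat: Callable[[str], str], task: dict, candidates: list[dict]) -> str:
    """Stage 2: let the LLM pick a model from candidate function descriptions."""
    prompt = (
        f"Subtask: {json.dumps(task)}\n"
        f"Candidate models (name + description): {json.dumps(candidates)}\n"
        "Reply with only the best model name."
    )
    return llm_chat(prompt).strip()


def run_pipeline(llm_chat, list_models, run_model, user_request: str) -> str:
    """End-to-end controller loop: plan, select, execute, summarize."""
    results = {}
    for task in plan_tasks(llm_chat, user_request):
        model_name = select_model(llm_chat, task, list_models(task["task"]))
        # Stage 3: execute the subtask with the chosen expert model.
        results[task["id"]] = run_model(model_name, task["args"])
    # Stage 4: have the LLM summarize all execution results into one response.
    summary_prompt = (
        f"User request: {user_request}\n"
        f"Subtask results: {json.dumps(results, default=str)}\n"
        "Write a concise final response for the user."
    )
    return llm_chat(summary_prompt)
```

In the paper, the controller role is played by ChatGPT and the expert models come from Hugging Face; in this sketch those dependencies are simply injected as functions so the four stages stay visible.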
Related papers
- Two Heads Are Better Than One: Collaborative LLM Embodied Agents for Human-Robot Interaction [1.6574413179773757]
Large language models (LLMs) should be able to leverage their large breadth of understanding to interpret natural language commands.
However, these models suffer from hallucinations, which may cause safety issues or deviations from the task.
In this research, multiple collaborative AI systems were tested against a single independent AI agent to determine whether the success in other domains would translate into improved human-robot interaction performance.
arXiv Detail & Related papers (2024-11-23T02:47:12Z)
- A Survey on Vision-Language-Action Models for Embodied AI [71.16123093739932]
Vision-language-action models (VLAs) have become a foundational element in robot learning.
Various methods have been proposed to enhance traits such as versatility, dexterity, and generalizability.
VLAs serve as high-level task planners capable of decomposing long-horizon tasks into executable subtasks.
arXiv Detail & Related papers (2024-05-23T01:43:54Z)
- SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network [4.302209772725456]
Systematic Artificial Intelligence (SAI) is a framework designed to solve AI tasks by leveraging Large Language Models (LLMs) and intent-format-based input.
SAI can complete numerous complex AI tasks in the communication network, achieving impressive results in network optimization, resource allocation, and other challenging tasks.
arXiv Detail & Related papers (2023-10-13T12:14:58Z)
- Large Language Models Empowered Autonomous Edge AI for Connected Intelligence [51.269276328087855]
Edge artificial intelligence (Edge AI) is a promising solution to achieve connected intelligence.
This article presents a vision of autonomous edge AI systems that automatically organize, adapt, and optimize themselves to meet users' diverse requirements.
arXiv Detail & Related papers (2023-07-06T05:16:55Z)
- Prompt Sapper: A LLM-Empowered Production Tool for Building AI Chains [31.080896878139402]
We propose the concept of AI chain and introduce the best principles and practices that have been accumulated in software engineering for decades into AI chain engineering.
We also develop a no-code integrated development environment, Prompt Sapper, which embodies these AI chain engineering principles and patterns naturally in the process of building AI chains.
arXiv Detail & Related papers (2023-06-21T05:31:00Z)
- AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as a bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z)
- OpenAGI: When LLM Meets Domain Experts [51.86179657467822]
Human Intelligence (HI) excels at combining basic skills to solve complex tasks.
This capability is vital for Artificial Intelligence (AI) and should be embedded in comprehensive AI Agents.
We introduce OpenAGI, an open-source platform designed for solving multi-step, real-world tasks.
arXiv Detail & Related papers (2023-04-10T03:55:35Z)
- TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs [71.7495056818522]
We introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion.
We will present our vision of how to build such an ecosystem, explain each key component, and use study cases to illustrate both the feasibility of this vision and the main challenges we need to address next.
arXiv Detail & Related papers (2023-03-29T03:30:38Z)