LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution
- URL: http://arxiv.org/abs/2312.09007v3
- Date: Tue, 20 Feb 2024 13:02:10 GMT
- Title: LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution
- Authors: Hongwei Cui, Yuyang Du, Qun Yang, Yulin Shao, and Soung Chang Liew
- Abstract summary: We present LLMind, an AI agent framework that enables effective collaboration among IoT devices for executing complex tasks.
Inspired by the functional specialization theory of the brain, our framework integrates an LLM with domain-specific AI modules, enhancing its capabilities.
- Score: 20.186752447895994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The exploration of large language models (LLMs) for task planning and IoT
automation has recently gained significant attention. However, existing works
suffer from limitations in terms of resource accessibility, complex task
planning, and efficiency. In this paper, we present LLMind, an LLM-based AI
agent framework that enables effective collaboration among IoT devices for
executing complex tasks. Inspired by the functional specialization theory of
the brain, our framework integrates an LLM with domain-specific AI modules,
enhancing its capabilities. Complex tasks, which may involve collaborations of
multiple domain-specific AI modules and IoT devices, are executed through a
control script generated by the LLM using a Language-Code transformation
approach, which first converts language descriptions to an intermediate
finite-state machine (FSM) before final precise transformation to code.
Furthermore, the framework incorporates a novel experience accumulation
mechanism to enhance response speed and effectiveness, allowing the framework
to evolve and become progressively more sophisticated through continued user and
machine interactions.
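The Language-Code transformation described above can be sketched in miniature: a task description is first mapped to an intermediate finite-state machine, and the FSM is then executed as a control script. The state names, events, and device actions below are illustrative assumptions for a hypothetical "turn on the light when the camera detects a person" task, not the authors' actual implementation.

```python
# Minimal sketch of an FSM intermediate representation for IoT control.
# States, events, and actions are hypothetical examples; in LLMind the
# FSM itself would be produced by the LLM from a language description.

class FSM:
    def __init__(self, initial):
        self.state = initial
        self.transitions = {}  # (state, event) -> (next_state, action)

    def add(self, state, event, next_state, action):
        self.transitions[(state, event)] = (next_state, action)

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"no transition defined for {key}")
        next_state, action = self.transitions[key]
        action()               # invoke the device-control action
        self.state = next_state

# Hypothetical task: "when the camera detects a person, turn on the light."
log = []
fsm = FSM("idle")
fsm.add("idle", "person_detected", "light_on", lambda: log.append("light: ON"))
fsm.add("light_on", "person_left", "idle", lambda: log.append("light: OFF"))

fsm.fire("person_detected")
fsm.fire("person_left")
print(fsm.state, log)  # back in "idle" after both events
```

Because every transition is explicit, the FSM stage constrains what the final generated code can do, which is the precision benefit the abstract attributes to the intermediate representation.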
Related papers
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts.
We leverage natural language prompts and contextual information from the Robot Operating System (ROS).
Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
arXiv Detail & Related papers (2024-06-28T08:28:38Z) - Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks.
We propose a text-based generative IoT (GIoT) system deployed in the local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z) - When Large Language Models Meet Optical Networks: Paving the Way for Automation [17.4503217818141]
We propose a framework of LLM-empowered optical networks, facilitating intelligent control of the physical layer and efficient interaction with the application layer.
The proposed framework is verified on two typical tasks: network alarm analysis and network performance optimization.
The high response accuracy and semantic similarity across 2,400 test situations demonstrate the great potential of LLMs in optical networks.
arXiv Detail & Related papers (2024-05-14T10:46:33Z) - From Language Models to Practical Self-Improving Computer Agents [0.8547032097715571]
We develop a methodology to create AI computer agents that can carry out diverse computer tasks and self-improve.
We prompt an LLM agent to augment itself with retrieval, internet search, web navigation, and text editor capabilities.
The agent effectively uses these various tools to solve problems including automated software development and web-based tasks.
arXiv Detail & Related papers (2024-04-18T07:50:10Z) - On the Multi-turn Instruction Following for Conversational Web Agents [83.51251174629084]
We introduce a new task of Conversational Web Navigation, which necessitates sophisticated interactions that span multiple turns with both the users and the environment.
We propose a novel framework, named self-reflective memory-augmented planning (Self-MAP), which employs memory utilization and self-reflection techniques.
arXiv Detail & Related papers (2024-02-23T02:18:12Z) - Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents [39.53593677934238]
Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks.
However, current LLM-based agents frequently generate invalid or non-executable plans.
This paper proposes a novel "Formal-LLM" framework for LLM-based agents by integrating the expressiveness of natural language and the precision of formal language.
arXiv Detail & Related papers (2024-02-01T17:30:50Z) - Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z) - LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [87.1164964709168]
This work employs Large Language Models (LLMs) as a decision-making component for complex autonomous driving scenarios.
Extensive experiments demonstrate that our proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also helps handle complex driving behaviors, including multi-vehicle coordination.
arXiv Detail & Related papers (2023-10-04T17:59:49Z) - LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks [4.4589894340260585]
This paper presents a novel approach to enhance autonomous robotic manipulation using the Large Language Model (LLM) for logical inference.
The proposed system combines the advantages of LLMs with YOLO-based environmental perception to enable robots to make reasonable decisions autonomously.
arXiv Detail & Related papers (2023-08-29T01:54:49Z) - TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage [28.554981886052953]
Large Language Models (LLMs) have emerged as powerful tools for various real-world applications.
Despite their prowess, the intrinsic generative abilities of LLMs may prove insufficient for handling complex tasks.
This paper proposes a structured framework tailored for LLM-based AI Agents.
arXiv Detail & Related papers (2023-08-07T09:22:03Z) - Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM.
It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses.
We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.