Building Open-Ended Embodied Agent via Language-Policy Bidirectional
Adaptation
- URL: http://arxiv.org/abs/2401.00006v3
- Date: Tue, 6 Feb 2024 16:30:55 GMT
- Title: Building Open-Ended Embodied Agent via Language-Policy Bidirectional
Adaptation
- Authors: Shaopeng Zhai, Jie Wang, Tianyi Zhang, Fuxian Huang, Qi Zhang, Ming
Zhou, Jing Hou, Yu Qiao and Yu Liu
- Abstract summary: Building embodied agents by integrating Large Language Models (LLMs) and Reinforcement Learning (RL) has revolutionized human-AI interaction.
Existing research faces challenges in meeting the requirement of open-endedness.
We present OpenPAL, a co-training framework comprising two stages: fine-tuning a pre-trained LLM to translate human instructions into goals for planning, and training a goal-conditioned policy for decision-making.
- Score: 40.82919989450566
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Building embodied agents by integrating Large Language Models (LLMs)
and Reinforcement Learning (RL) has revolutionized human-AI interaction:
researchers can now leverage language instructions to plan decision-making for
open-ended tasks. However, existing research faces challenges in meeting the
requirement of open-endedness: it typically trains either the LLM or the RL
model to adapt to a fixed counterpart, which limits the exploration of novel
skills and hinders the efficacy of human-AI interaction. To this end, we
present OpenPAL, a co-training framework comprising two stages: (1)
fine-tuning a pre-trained LLM to translate human instructions into goals for
planning, and training a goal-conditioned policy for decision-making; (2)
co-training to align the LLM and the policy, achieving instruction
open-endedness. We conducted experiments using Contra, an open-ended FPS game,
demonstrating that an agent trained with OpenPAL not only comprehends
arbitrary instructions but also executes them efficiently. These results
suggest that OpenPAL holds the potential to construct open-ended embodied
agents in practical scenarios.
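The abstract sketches a two-stage recipe: first fit each component separately (an instruction-to-goal translator and a goal-conditioned policy), then alternate updates so the two adapt to each other. The sketch below is a minimal, illustrative rendering of that loop in Python; all names (`InstructionTranslator`, `GoalConditionedPolicy`, `co_train`) and the toy state/reward logic are hypothetical stand-ins, not OpenPAL's actual architecture or losses.

```python
# Illustrative sketch only: hypothetical stand-ins for the components named
# in the abstract, not OpenPAL's real code.
import random


class InstructionTranslator:
    """Stage 1a: stands in for a fine-tuned LLM that maps a free-form
    instruction to a discrete goal the policy can condition on."""

    def translate(self, instruction: str) -> tuple:
        # Toy goal encoding: hash each token into a small discrete goal space.
        return tuple(sorted({hash(tok) % 16 for tok in instruction.lower().split()}))


class GoalConditionedPolicy:
    """Stage 1b: a tabular goal-conditioned policy; the paper instead trains
    a policy with RL in the Contra environment."""

    def __init__(self, n_actions: int = 4):
        self.n_actions = n_actions
        self.q = {}  # (state, goal, action) -> estimated return

    def act(self, state: int, goal: tuple) -> int:
        return max(range(self.n_actions),
                   key=lambda a: self.q.get((state, goal, a), 0.0))

    def update(self, state, goal, action, reward, lr=0.1):
        key = (state, goal, action)
        self.q[key] = self.q.get(key, 0.0) + lr * (reward - self.q.get(key, 0.0))


def co_train(translator, policy, instructions, episodes=200):
    """Stage 2: alternate between translating an instruction into a goal and
    executing it, feeding the execution outcome back into the policy."""
    for _ in range(episodes):
        goal = translator.translate(random.choice(instructions))
        state = random.randrange(8)                # placeholder observation
        action = policy.act(state, goal)
        reward = float((state + action) % 2 == 0)  # placeholder success signal
        policy.update(state, goal, action, reward)


if __name__ == "__main__":
    translator, policy = InstructionTranslator(), GoalConditionedPolicy()
    co_train(translator, policy,
             ["collect ammo", "flank the enemy", "hold position"])
    print("goal for 'hold position':", translator.translate("hold position"))
```

In the real system the reward would come from executing goals in the Contra environment, and the execution feedback would also be used to refine the LLM's goal proposals; here both components are toy objects so the two-stage control flow stays visible.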
Related papers
- Beyond Syntax: Action Semantics Learning for App Agents [60.56331102288794]
Action Semantics Learning (ASL) is a learning framework whose objective is to capture the semantics of the ground-truth actions. ASL significantly improves the accuracy and generalisation of App agents over existing methods.
arXiv Detail & Related papers (2025-06-21T12:08:19Z) - Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking [61.61356842567952]
We propose STeP, a novel method for improving LLM-based agent training. We synthesize self-reflected trajectories that include reflections on and corrections of error steps. Experiments demonstrate that our method improves agent performance across three representative tasks.
arXiv Detail & Related papers (2025-05-26T14:11:12Z) - Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning [0.0]
This paper introduces a Continuous Learning Conversational AI (CLCA) approach, implemented using A2C reinforcement learning. We use simulated sales dialogues, generated by Large Language Models (LLMs), to train an A2C agent. This agent learns to optimize conversation strategies for personalization, focusing on engagement and delivering value.
arXiv Detail & Related papers (2025-02-18T14:05:59Z) - PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making [30.46033960436517]
We propose a framework PIANIST for decomposing the world model into seven intuitive components.
We show that our method works well on two different games that challenge the planning and decision making skills of the agent.
arXiv Detail & Related papers (2024-11-24T22:36:34Z) - Automating Knowledge Discovery from Scientific Literature via LLMs: A Dual-Agent Approach with Progressive Ontology Prompting [59.97247234955861]
We introduce a novel framework based on large language models (LLMs) that combines a progressive prompting algorithm with a dual-agent system, named LLM-Duo.
Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain.
arXiv Detail & Related papers (2024-08-20T16:42:23Z) - Personalized Wireless Federated Learning for Large Language Models [75.22457544349668]
Large Language Models (LLMs) have revolutionized natural language processing tasks.
Their deployment in wireless networks, however, still faces challenges such as a lack of privacy and security protection mechanisms.
We introduce two personalized wireless federated fine-tuning methods with low communication overhead.
arXiv Detail & Related papers (2024-04-20T02:30:21Z) - DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation [57.07295906718989]
Constrained decoding approaches aim to control the meaning or style of text generated by pre-trained language models (PLMs) for various tasks at inference time. These methods often guide plausible continuations by greedily and explicitly selecting targets. Inspired by cognitive dual-process theory, we propose DECIDER, a novel decoding framework.
arXiv Detail & Related papers (2024-03-04T11:49:08Z) - Large Language Models as Agents in Two-Player Games [12.303405412105187]
This paper delineates the parallels between the training methods of large language models (LLMs) and the strategies employed for the development of agents in two-player games.
We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games.
arXiv Detail & Related papers (2024-02-12T21:44:32Z) - GLIDE-RL: Grounded Language Instruction through DEmonstration in RL [7.658523833511356]
Training efficient Reinforcement Learning (RL) agents grounded in natural language has been a long-standing challenge.
We present a novel algorithm, Grounded Language Instruction through DEmonstration in RL (GLIDE-RL) that introduces a teacher-instructor-student curriculum learning framework.
In this multi-agent framework, the teacher and the student agents learn simultaneously based on the student's current skill level.
arXiv Detail & Related papers (2024-01-03T17:32:13Z) - LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language
Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z) - Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z) - Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural
Language Instructions [53.21504989297547]
We propose a new method that combines a language model and reinforcement learning for the task of building objects in a Minecraft-like environment.
Our method first generates a set of consistently achievable sub-goals from the instructions and then completes associated sub-tasks with a pre-trained RL policy.
arXiv Detail & Related papers (2022-11-01T18:30:42Z)