Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic
Manipulation Tasks
- URL: http://arxiv.org/abs/2309.16347v2
- Date: Thu, 7 Mar 2024 17:53:35 GMT
- Title: Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic
Manipulation Tasks
- Authors: Eleftherios Triantafyllidis, Filippos Christianos and Zhibin Li
- Abstract summary: Current reinforcement learning algorithms struggle in sparse and complex environments.
We propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework.
- Score: 12.27904219271791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current reinforcement learning algorithms struggle in sparse and complex
environments, most notably in long-horizon manipulation tasks entailing a
plethora of different sequences. In this work, we propose the Intrinsically
Guided Exploration from Large Language Models (IGE-LLMs) framework. By
leveraging LLMs as an assistive intrinsic reward, IGE-LLMs guides the
exploratory process in reinforcement learning to address intricate long-horizon
with sparse rewards robotic manipulation tasks. We evaluate our framework and
related intrinsic learning methods in an environment challenged with
exploration, and a complex robotic manipulation task challenged by both
exploration and long-horizons. Results show IGE-LLMs (i) exhibit notably higher
performance over related intrinsic methods and the direct use of LLMs in
decision-making, (ii) can be combined and complement existing learning methods
highlighting its modularity, (iii) are fairly insensitive to different
intrinsic scaling parameters, and (iv) maintain robustness against increased
levels of uncertainty and horizons.
Related papers
- LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning [22.99690700210957]
We propose a novel HRL framework that leverages language instructions to generate a stationary reward function for a higher-level policy.
Since the language-guided reward is unaffected by the lower primitive behaviour, LGR2 mitigates non-stationarity.
Our approach attains success rates exceeding 70$%$ in challenging, sparse-reward robotic navigation and manipulation environments.
arXiv Detail & Related papers (2024-06-09T18:40:24Z) - Benchmarking General-Purpose In-Context Learning [19.40952728849431]
In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly.
In this paper, we study extending ICL to address a broader range of tasks with an extended learning horizon and higher improvement potential.
arXiv Detail & Related papers (2024-05-27T14:50:42Z) - Variational Offline Multi-agent Skill Discovery [43.869625428099425]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills.
Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining.
arXiv Detail & Related papers (2024-05-26T00:24:46Z) - Can large language models explore in-context? [87.49311128190143]
We deploy Large Language Models as agents in simple multi-armed bandit environments.
We find that the models do not robustly engage in exploration without substantial interventions.
arXiv Detail & Related papers (2024-03-22T17:50:43Z) - From Summary to Action: Enhancing Large Language Models for Complex
Tasks with Open World APIs [62.496139001509114]
We introduce a novel tool invocation pipeline designed to control massive real-world APIs.
This pipeline mirrors the human task-solving process, addressing complicated real-life user queries.
Empirical evaluations of our Sum2Act pipeline on the ToolBench benchmark show significant performance improvements.
arXiv Detail & Related papers (2024-02-28T08:42:23Z) - RObotic MAnipulation Network (ROMAN) $\unicode{x2013}$ Hybrid
Hierarchical Learning for Solving Complex Sequential Tasks [70.69063219750952]
We present a Hybrid Hierarchical Learning framework, the Robotic Manipulation Network (ROMAN)
ROMAN achieves task versatility and robust failure recovery by integrating behavioural cloning, imitation learning, and reinforcement learning.
Experimental results show that by orchestrating and activating these specialised manipulation experts, ROMAN generates correct sequential activations for accomplishing long sequences of sophisticated manipulation tasks.
arXiv Detail & Related papers (2023-06-30T20:35:22Z) - Semantically Aligned Task Decomposition in Multi-Agent Reinforcement
Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA)
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z) - Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z) - Efficient Reinforcement Learning in Block MDPs: A Model-free
Representation Learning Approach [73.62265030773652]
We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics.
BRIEE interleaves latent states discovery, exploration, and exploitation together, and can provably learn a near-optimal policy.
We show that BRIEE is more sample efficient than the state-of-art Block MDP algorithm HOMER RL and other empirical baselines on challenging rich-observation combination lock problems.
arXiv Detail & Related papers (2022-01-31T19:47:55Z) - Overcoming Exploration: Deep Reinforcement Learning in Complex
Environments from Temporal Logic Specifications [2.8904578737516764]
We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment.
Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.
arXiv Detail & Related papers (2022-01-28T16:39:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.