Language-Conditioned Semantic Search-Based Policy for Robotic
Manipulation Tasks
- URL: http://arxiv.org/abs/2312.05925v1
- Date: Sun, 10 Dec 2023 16:17:00 GMT
- Title: Language-Conditioned Semantic Search-Based Policy for Robotic
Manipulation Tasks
- Authors: Jannik Sheikh, Andrew Melnik, Gora Chand Nandi, Robert Haschke
- Abstract summary: We propose a language-conditioned semantic search-based method to produce an online search-based policy.
Our approach surpasses the performance of the baselines on the CALVIN benchmark and exhibits strong zero-shot adaptation capabilities.
- Score: 2.1332830068386217
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Reinforcement Learning and Imitation Learning approaches rely on policy
learning strategies that generalize poorly when given only a few examples of a
task. In this work, we propose a language-conditioned semantic
search-based method to produce an online search-based policy from the available
demonstration dataset of state-action trajectories. Here we directly acquire
actions from the most similar manipulation trajectories found in the dataset.
Our approach surpasses the performance of the baselines on the CALVIN benchmark
and exhibits strong zero-shot adaptation capabilities. This holds great
potential for expanding the use of our online search-based policy approach to
tasks typically addressed by Imitation Learning or Reinforcement Learning-based
policies.
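
The abstract describes retrieving actions from the most similar manipulation trajectories in a demonstration dataset. Below is a minimal, hypothetical Python sketch of that retrieve-and-replay idea, not the authors' implementation: the encoders (embed_text, embed_state), the data layout, and the cosine/Euclidean similarity measures are all assumptions made only to keep the example self-contained and runnable.

import numpy as np

def embed_text(instruction: str, dim: int = 64) -> np.ndarray:
    # Hypothetical language encoder: hashed bag-of-words, just to keep the
    # sketch runnable; a real system would use a pretrained sentence encoder.
    vec = np.zeros(dim)
    for word in instruction.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

def embed_state(observation) -> np.ndarray:
    # Hypothetical state encoder: flatten raw observation features.
    return np.asarray(observation, dtype=float).ravel()

class SearchBasedPolicy:
    def __init__(self, demos):
        # demos: list of dicts with keys "instruction", "states", "actions",
        # where states and actions are aligned arrays from one trajectory.
        self.demos = demos
        self.text_keys = np.stack([embed_text(d["instruction"]) for d in demos])

    def act(self, instruction: str, observation):
        # 1) Pick the demonstration whose instruction embedding is closest
        #    (cosine similarity) to the current language command.
        q = embed_text(instruction)
        sims = self.text_keys @ q / (
            np.linalg.norm(self.text_keys, axis=1) * np.linalg.norm(q) + 1e-8
        )
        demo = self.demos[int(np.argmax(sims))]

        # 2) Within that trajectory, find the state most similar to the
        #    current observation and replay its recorded action.
        q_state = embed_state(observation)
        state_embs = np.stack([embed_state(s) for s in demo["states"]])
        t = int(np.argmin(np.linalg.norm(state_embs - q_state, axis=1)))
        return demo["actions"][t]

Under these assumptions, the policy requires no gradient-based training: adding a new demonstration to demos immediately makes it available for retrieval, which is one intuition behind the zero-shot adaptation claim.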
Related papers
- Representation-Driven Reinforcement Learning [57.44609759155611]
We present a representation-driven framework for reinforcement learning.
By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation.
We demonstrate the effectiveness of this framework through its application to evolutionary and policy gradient-based approaches.
arXiv Detail & Related papers (2023-05-31T14:59:12Z) - Goal-Conditioned Imitation Learning using Score-based Diffusion Policies [3.49482137286472]
We propose a new policy representation based on score-based diffusion models (SDMs)
We apply our new policy representation in the domain of Goal-Conditioned Imitation Learning (GCIL)
We show how BESO can even be used to learn a goal-independent policy from play data using classifier-free guidance.
arXiv Detail & Related papers (2023-04-05T15:52:34Z) - Robust Task Representations for Offline Meta-Reinforcement Learning via
Contrastive Learning [21.59254848913971]
Offline meta-reinforcement learning is a reinforcement learning paradigm that learns from offline data to adapt to new tasks.
We propose a contrastive learning framework for task representations that are robust to the distribution of behavior policies in training and test.
Experiments on a variety of offline meta-reinforcement learning benchmarks demonstrate the advantages of our method over prior methods.
arXiv Detail & Related papers (2022-06-21T14:46:47Z) - Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in
Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms. (A minimal sketch of the two-policy roll-in idea appears after this list.)
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Programmatic Policy Extraction by Iterative Local Search [0.15229257192293197]
We present a simple and direct approach to extracting a programmatic policy from a pretrained neural policy.
Whether trained from a hand-crafted expert policy or a learned neural policy, our method discovers simple and interpretable policies that perform almost as well as the original.
arXiv Detail & Related papers (2022-01-18T10:39:40Z) - Meta Navigator: Search for a Good Adaptation Policy for Few-shot
Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios.
We present Meta Navigator, a framework that attempts to address this limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z) - Guided Uncertainty-Aware Policy Optimization: Combining Learning and
Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z) - Reward-Conditioned Policies [100.64167842905069]
Imitation learning requires near-optimal expert data.
Can we learn effective policies via supervised learning without demonstrations?
We show how such an approach can be derived as a principled method for policy search.
arXiv Detail & Related papers (2019-12-31T18:07:43Z)
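
As referenced in the Jump-Start Reinforcement Learning entry above, JSRL's two-policy idea can be illustrated with a short, hypothetical sketch: a guide policy acts for the first h steps of each episode, after which the exploration policy takes over, and h is shortened as the learner improves. The sketch assumes a Gymnasium-style reset/step environment API and is not the official implementation; function names and the curriculum schedule are illustrative only.

def jsrl_episode(env, guide_policy, explore_policy, h):
    # Roll in with the guide policy for the first h steps, then hand control
    # to the exploration policy for the remainder of the episode.
    obs, _ = env.reset()
    done, total_reward, t = False, 0.0, 0
    while not done:
        policy = guide_policy if t < h else explore_policy
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        done = terminated or truncated
        total_reward += reward
        t += 1
    return total_reward

# Curriculum sketch: start with a large h and decrease it (e.g. 50 -> 0) once
# the exploration policy's recent returns exceed a chosen threshold, so the
# exploration policy eventually handles whole episodes on its own.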