Behavioral Cloning via Search in Embedded Demonstration Dataset
- URL: http://arxiv.org/abs/2306.09082v1
- Date: Thu, 15 Jun 2023 12:25:41 GMT
- Title: Behavioral Cloning via Search in Embedded Demonstration Dataset
- Authors: Federico Malato, Florian Leopold, Ville Hautamaki, Andrew Melnik
- Abstract summary: Behavioural cloning uses a dataset of demonstrations to learn a behavioural policy.
We use latent space to index a demonstration dataset, instantly access similar relevant experiences, and copy behavior from these situations.
Our approach can effectively recover meaningful demonstrations and show human-like behavior of an agent in the Minecraft environment.
- Score: 0.15293427903448023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Behavioural cloning uses a dataset of demonstrations to learn a behavioural
policy. To overcome various learning and policy adaptation problems, we propose
to use latent space to index a demonstration dataset, instantly access similar
relevant experiences, and copy behavior from these situations. Actions from a
selected similar situation can be performed by the agent until representations
of the agent's current situation and the selected experience diverge in the
latent space. Thus, we formulate our control problem as a search problem over a
dataset of experts' demonstrations. We test our approach on BASALT
MineRL-dataset in the latent representation of a Video PreTraining model. We
compare our model to state-of-the-art Minecraft agents. Our approach can
effectively recover meaningful demonstrations and show human-like behavior of
an agent in the Minecraft environment in a wide variety of scenarios.
Experimental results reveal that the performance of our search-based approach is
comparable to that of trained models, while allowing zero-shot task adaptation by
changing the demonstration examples.
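The control-as-search loop the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `encode` is a toy stand-in for the Video PreTraining encoder (here a fixed random linear projection), and the names `encode` and `DemoSearchPolicy` are hypothetical.

```python
import numpy as np

# Toy stand-in for a pretrained encoder (the paper uses Video PreTraining):
# a fixed random linear projection of the raw observation into an 8-dim latent.
_PROJ = np.random.default_rng(0).standard_normal((8, 4))

def encode(obs):
    return _PROJ @ np.asarray(obs, dtype=float)

class DemoSearchPolicy:
    """Index expert demonstrations in latent space; copy the action of the
    nearest expert situation, and keep following that trajectory until the
    agent's and expert's representations diverge."""

    def __init__(self, demos, threshold=1.0):
        # demos: list of (observations, actions) trajectories of equal length
        self.demos = demos
        self.threshold = threshold
        keys, refs = [], []
        for traj_id, (observations, _) in enumerate(demos):
            for t, obs in enumerate(observations):
                keys.append(encode(obs))
                refs.append((traj_id, t))
        self.keys = np.stack(keys)   # (N, 8) index of all expert states
        self.refs = refs
        self.current = None          # (traj_id, t) currently being followed

    def act(self, obs):
        z = encode(obs)
        if self.current is not None:
            traj_id, t = self.current
            observations, actions = self.demos[traj_id]
            # Still close to the expert's state at this step: keep copying.
            if t < len(actions) and np.linalg.norm(z - encode(observations[t])) < self.threshold:
                self.current = (traj_id, t + 1)
                return actions[t]
        # First step, or diverged: search the whole dataset for the nearest
        # expert situation and resume copying from there.
        nearest = int(np.argmin(np.linalg.norm(self.keys - z, axis=1)))
        traj_id, t = self.refs[nearest]
        self.current = (traj_id, t + 1)
        return self.demos[traj_id][1][t]
```

Swapping in a different demonstration set changes the behavior with no retraining, which is the zero-shot task adaptation the abstract claims.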
Related papers
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Zero-shot Imitation Policy via Search in Demonstration Dataset [0.16817021284806563]
Behavioral cloning uses a dataset of demonstrations to learn a policy.
We propose to use latent spaces of pre-trained foundation models to index a demonstration dataset.
Our approach can effectively recover meaningful demonstrations and show human-like behavior of an agent in the Minecraft environment.
arXiv Detail & Related papers (2024-01-29T18:38:29Z)
- Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL).
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a paradigm should be done in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
- In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks.
We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z)
- Behavioral Cloning via Search in Video PreTraining Latent Space [0.13999481573773073]
We formulate our control problem as a search problem over a dataset of experts' demonstrations.
We perform a proximity search over the BASALT MineRL-dataset in the latent representation of a Video PreTraining model.
The agent copies the actions from the expert trajectory as long as the state representations of the agent and the selected expert trajectory do not diverge.
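The copy-until-divergence rule this work describes can be illustrated in isolation. This is a hedged sketch, not the authors' code: `copy_until_divergence` is a hypothetical helper, and the latent vectors are assumed to come from the same (here unspecified) encoder.

```python
import numpy as np

def copy_until_divergence(agent_latents, expert_latents, expert_actions, threshold=1.0):
    """Replay expert actions step by step, stopping as soon as the agent's
    latent state drifts too far from the expert's at the same step."""
    executed = []
    for z_agent, z_expert, action in zip(agent_latents, expert_latents, expert_actions):
        if np.linalg.norm(z_agent - z_expert) >= threshold:
            break  # diverged: the caller would search for a new expert segment
        executed.append(action)
    return executed
```

When the loop breaks, control returns to the proximity search, which selects a fresh expert segment to follow.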
arXiv Detail & Related papers (2022-12-27T00:20:37Z)
- Out-of-Dynamics Imitation Learning from Multimodal Demonstrations [68.46458026983409]
We study out-of-dynamics imitation learning (OOD-IL), which relaxes the usual assumptions so that the demonstrator and the imitator need only share the same state space.
OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge.
We develop a better transferability measurement to tackle this newly emerged challenge.
arXiv Detail & Related papers (2022-11-13T07:45:06Z)
- Leveraging Demonstrations with Latent Space Priors [90.56502305574665]
We propose to leverage demonstration datasets by combining skill learning and sequence modeling.
We show how to acquire such priors from state-only motion capture demonstrations and explore several methods for integrating them into policy learning.
Our experimental results confirm that latent space priors provide significant gains in learning speed and final performance in a set of challenging sparse-reward environments.
arXiv Detail & Related papers (2022-10-26T13:08:46Z)
- Robust Imitation of a Few Demonstrations with a Backwards Model [3.8530020696501794]
Behavior cloning of expert demonstrations can speed up learning optimal policies in a more sample-efficient way than reinforcement learning.
We tackle this issue by extending the region of attraction around the demonstrations so that the agent can learn how to get back onto the demonstrated trajectories if it veers off-course.
With optimal or near-optimal demonstrations, the learned policy will be both optimal and robust to deviations, with a wider region of attraction.
arXiv Detail & Related papers (2022-10-17T18:02:19Z)
- Robust Maximum Entropy Behavior Cloning [15.713997170792842]
Imitation learning (IL) algorithms use expert demonstrations to learn a specific task.
Most existing approaches assume that all expert demonstrations are reliable and trustworthy, but what if there are adversarial demonstrations in the given dataset?
We propose a novel general framework to directly generate a policy from demonstrations that autonomously detects the adversarial demonstrations and excludes them from the dataset.
arXiv Detail & Related papers (2021-01-04T22:08:46Z)
- Stochastic Action Prediction for Imitation Learning [1.6385815610837169]
Imitation learning is a data-driven approach to acquiring skills that relies on expert demonstrations to learn a policy that maps observations to actions.
We demonstrate inherent stochasticity in demonstrations collected for tasks including line following with a remote-controlled car.
We find that accounting for stochasticity in the expert data leads to substantial improvement in the success rate of task completion.
arXiv Detail & Related papers (2020-12-26T08:02:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.