Imitating Unknown Policies via Exploration
- URL: http://arxiv.org/abs/2008.05660v1
- Date: Thu, 13 Aug 2020 03:03:35 GMT
- Title: Imitating Unknown Policies via Exploration
- Authors: Nathan Gavenski and Juarez Monteiro and Roger Granada and Felipe
Meneguzzi and Rodrigo C. Barros
- Abstract summary: Behavioral cloning is an imitation learning technique that teaches an agent how to behave through expert demonstrations.
Recent approaches use self-supervision of fully-observable unlabeled snapshots of the states to decode state-pairs into actions.
We address these limitations by incorporating a two-phase model into the original framework, which learns from unlabeled observations via exploration.
- Score: 18.78730427200346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Behavioral cloning is an imitation learning technique that teaches an agent
how to behave through expert demonstrations. Recent approaches use
self-supervision of fully-observable unlabeled snapshots of the states to
decode state-pairs into actions. However, the iterative learning scheme of
these techniques is prone to getting stuck in bad local minima. We address
these limitations by incorporating a two-phase model into the original framework,
which learns from unlabeled observations via exploration, substantially
improving traditional behavioral cloning by exploiting (i) a sampling mechanism
to prevent bad local minima, (ii) a sampling mechanism to improve exploration,
and (iii) self-attention modules to capture global features. The resulting
technique outperforms the previous state-of-the-art in four different
environments by a large margin.
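As a rough illustration of the pipeline the abstract describes (learning an inverse dynamics model via exploration, then decoding expert state-pairs into actions), here is a minimal tabular sketch; the chain environment, the exploration budget, and all names are illustrative assumptions, not the paper's architecture:

```python
import random

# Toy 1-D chain environment: states 0..N, actions -1/+1 move the agent.
N = 10

def step(s, a):
    return max(0, min(N, s + a))

# Phase 1: exploration. Collect (s, a, s') transitions with a random policy
# and fit a tabular inverse dynamics model: (s, s') -> most frequent action.
def explore(episodes=200, horizon=20, seed=0):
    rng = random.Random(seed)
    counts = {}  # (s, s') -> {action: count}
    for _ in range(episodes):
        s = rng.randint(0, N)
        for _ in range(horizon):
            a = rng.choice([-1, 1])
            s2 = step(s, a)
            counts.setdefault((s, s2), {}).setdefault(a, 0)
            counts[(s, s2)][a] += 1
            s = s2
    return {pair: max(acts, key=acts.get) for pair, acts in counts.items()}

# Phase 2: label the expert's state-pairs with the IDM, then clone the policy.
def clone(idm, expert_states):
    policy = {}
    for s, s2 in zip(expert_states, expert_states[1:]):
        a = idm.get((s, s2))
        if a is not None:  # skip pairs the IDM never observed
            policy[s] = a
    return policy

idm = explore()
expert = list(range(0, N + 1))  # state-only expert: walks from 0 to the goal N
policy = clone(idm, expert)

# Roll out the cloned policy from state 0.
s = 0
for _ in range(N):
    s = step(s, policy[s])
print(s)  # -> 10, the goal state
```

The paper replaces the tabular counts with neural networks, and adds the sampling mechanisms and self-attention modules on top of this basic loop.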
Related papers
- Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense [5.150608040339816]
We introduce PADL, a new solution able to generate image-specific perturbations using a symmetric scheme of encoding and decoding based on cross-attention.
Our method generalizes to a range of unseen models with diverse architectural designs, such as StarGANv2, BlendGAN, DiffAE, StableDiffusion and StableDiffusionXL.
arXiv Detail & Related papers (2024-09-26T15:16:32Z) - Exploiting Fine-Grained Prototype Distribution for Boosting Unsupervised Class Incremental Learning [13.17775851211893]
This paper explores a more challenging problem of unsupervised class incremental learning (UCIL)
The essence of addressing this problem lies in effectively capturing comprehensive feature representations and discovering unknown novel classes.
We propose a strategy to minimize overlap between novel and existing classes, thereby preserving historical knowledge and mitigating the phenomenon of catastrophic forgetting.
arXiv Detail & Related papers (2024-08-19T14:38:27Z) - Explorative Imitation Learning: A Path Signature Approach for Continuous Environments [9.416194245966022]
Continuous Imitation Learning from Observation (CILO) is a new method augmenting imitation learning with two important features.
CILO's exploration allows for more diverse state transitions, requiring fewer expert trajectories and resulting in fewer training iterations.
It has the best overall performance of all imitation learning methods in all environments, outperforming the expert in two of them.
arXiv Detail & Related papers (2024-07-05T20:25:39Z) - Offline Imitation Learning with Model-based Reverse Augmentation [48.64791438847236]
We propose a novel model-based framework, called offline Imitation Learning with Self-paced Reverse Augmentation.
Specifically, we build a reverse dynamic model from the offline demonstrations, which can efficiently generate trajectories leading to the expert-observed states.
We use the subsequent reinforcement learning method to learn from the augmented trajectories and transit from expert-unobserved states to expert-observed states.
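The reverse-augmentation idea above can be sketched in a few lines; the integer-chain dynamics and the hand-coded reverse model here are illustrative assumptions (in the paper the reverse model is learned from offline demonstrations):

```python
import random

# Toy sketch of reverse augmentation: integer states, actions -1/+1,
# forward dynamics s' = s + a, so the reverse model is s = s' - a.
GOAL = 10  # an expert-observed state (illustrative)

def reverse_step(s_next, a):
    return s_next - a  # predecessor state under forward dynamics s' = s + a

rng = random.Random(0)
augmented = []
s_next = GOAL
for _ in range(5):
    a = rng.choice([-1, 1])
    s_prev = reverse_step(s_next, a)
    augmented.append((s_prev, a, s_next))  # a valid forward transition
    s_next = s_prev

# Reversing the list yields a trajectory that ends at the expert-observed
# state, usable by a downstream RL method on expert-unobserved states.
trajectory = list(reversed(augmented))
print(trajectory[-1][2])  # -> 10
```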
arXiv Detail & Related papers (2024-06-18T12:27:02Z) - Unsupervised Temporal Action Localization via Self-paced Incremental
Learning [57.55765505856969]
We present a novel self-paced incremental learning model to enhance clustering and localization training simultaneously.
We design two incremental instance learning strategies (constant- and variable-speed) for easy-to-hard model training, thus ensuring the reliability of these video pseudo-labels.
arXiv Detail & Related papers (2023-12-12T16:00:55Z) - Self-Supervised Adversarial Imitation Learning [20.248498544165184]
Behavioural cloning teaches an agent how to behave via expert demonstrations.
Recent approaches use self-supervision of fully-observable unlabelled snapshots of the states to decode state pairs into actions.
Previous work uses goal-aware strategies to keep this iterative scheme from getting trapped in bad local minima.
We address this limitation by incorporating a discriminator into the original framework.
arXiv Detail & Related papers (2023-04-21T12:12:33Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impact of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - Augmented Behavioral Cloning from Observation [14.45796459531414]
Imitation from observation is a technique that teaches an agent how to mimic the behavior of an expert by observing only the sequence of states from the expert demonstrations.
We show empirically that our approach outperforms the state-of-the-art approaches in four different environments by a large margin.
arXiv Detail & Related papers (2020-04-28T13:56:36Z) - State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations.
Our method performs on par with state-action approaches and considerably outperforms RL alone.
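The "train an inverse dynamics model and use it to predict actions" step can be sketched with a one-parameter regression; the dynamics, the linear model, and the expert trajectory below are toy assumptions standing in for the paper's learned networks:

```python
import random

# Toy continuous dynamics: s' = s + 0.5 * a. The constant 0.5 is an
# illustrative assumption; in practice the dynamics are unknown.
def step(s, a):
    return s + 0.5 * a

# Collect exploratory (s, a, s') transitions with random actions.
rng = random.Random(0)
data = []
s = 0.0
for _ in range(1000):
    a = rng.uniform(-1, 1)
    s2 = step(s, a)
    data.append((s, a, s2))
    s = s2

# Fit a one-parameter inverse dynamics model a ~ k * (s' - s) by least
# squares, a stand-in for a learned neural IDM.
num = sum(a * (s2 - s) for s, a, s2 in data)
den = sum((s2 - s) ** 2 for s, a, s2 in data)
k = num / den  # recovers 1 / 0.5 = 2.0

# Label a state-only expert demonstration with predicted actions.
expert_states = [0.0, 0.5, 1.0, 1.5, 2.0]
actions = [k * (s2 - s) for s, s2 in zip(expert_states, expert_states[1:])]
print(actions)  # each predicted action is approximately 1.0
```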
arXiv Detail & Related papers (2020-04-07T17:57:20Z) - Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared.
In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.