Self-Supervised Adversarial Imitation Learning
- URL: http://arxiv.org/abs/2304.10914v1
- Date: Fri, 21 Apr 2023 12:12:33 GMT
- Title: Self-Supervised Adversarial Imitation Learning
- Authors: Juarez Monteiro, Nathan Gavenski, Felipe Meneguzzi and Rodrigo C. Barros
- Abstract summary: Behavioural cloning teaches an agent how to behave via expert demonstrations.
Recent approaches use self-supervision over fully-observable, unlabelled snapshots of states to decode state pairs into actions.
Previous work uses goal-aware strategies to escape the bad local minima this iterative scheme is prone to, but these require manual goal verification.
We address this limitation by incorporating a discriminator into the original framework.
- Score: 20.248498544165184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Behavioural cloning is an imitation learning technique that teaches an agent
how to behave via expert demonstrations. Recent approaches use self-supervision
over fully-observable, unlabelled snapshots of states to decode state pairs
into actions. However, the iterative learning scheme employed by these
techniques is prone to getting trapped in bad local minima. Previous work uses
goal-aware strategies to solve this issue. However, this requires manual
intervention to verify whether an agent has reached its goal. We address this
limitation by incorporating a discriminator into the original framework, which
yields three key benefits. First, it disposes of the manual-intervention
requirement. Second, it aids learning by guiding function approximation based
on the state transitions of the expert's trajectories. Third, it solves a
learning issue common in the policy model, whereby the agent sometimes performs
a `no action' within the environment until it finally halts.
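To make the mechanism concrete, here is a minimal sketch, assuming PyTorch, with placeholder dimensions, random stand-in batches, and a REINFORCE-style surrogate for the policy update; it illustrates the idea, not the authors' implementation. The discriminator learns to separate expert state transitions from the agent's, and its score becomes the learning signal that replaces the manual goal check.
```python
# Hypothetical sketch, not the authors' code: assumed PyTorch, placeholder
# dimensions, random stand-in batches, REINFORCE-style policy surrogate.
import torch
import torch.nn as nn

S, A = 8, 4                                   # state / action dims (placeholders)
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
disc = nn.Sequential(nn.Linear(2 * S, 64), nn.ReLU(), nn.Linear(64, 1))
opt_pi = torch.optim.Adam(policy.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    # Stand-ins for expert and agent (s, s') transition batches.
    exp_s, exp_s2 = torch.randn(32, S), torch.randn(32, S)
    agt_s, agt_s2 = torch.randn(32, S), torch.randn(32, S)

    # 1) Discriminator: expert transitions -> 1, agent transitions -> 0.
    d_loss = bce(disc(torch.cat([exp_s, exp_s2], -1)), torch.ones(32, 1)) \
           + bce(disc(torch.cat([agt_s, agt_s2], -1)), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Policy: transitions the discriminator rates as expert-like earn a
    #    higher reward, replacing the manual "has the agent reached its
    #    goal?" check; idle `no action' transitions score poorly because
    #    they do not resemble expert state changes.
    dist = torch.distributions.Categorical(logits=policy(agt_s))
    actions = dist.sample()
    with torch.no_grad():
        reward = torch.sigmoid(disc(torch.cat([agt_s, agt_s2], -1))).squeeze(-1)
    pi_loss = -(dist.log_prob(actions) * reward).mean()
    opt_pi.zero_grad(); pi_loss.backward(); opt_pi.step()
```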
Related papers
- Efficient Active Imitation Learning with Random Network Distillation [8.517915878774756]
Random Network Distillation DAgger (RND-DAgger) is a new active imitation learning method.
It limits expert querying by using a learned state-based out-of-distribution measure to trigger interventions.
We evaluate RND-DAgger against traditional imitation learning and other active approaches in 3D video games and in a robotic task.
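A minimal sketch of the out-of-distribution trigger, assuming PyTorch with hypothetical names such as `novelty` and `THRESHOLD` (not the paper's code): a frozen random network is distilled into a predictor, and high prediction error flags unfamiliar states that warrant querying the expert.
```python
# Hypothetical RND-style trigger sketch (assumed PyTorch, placeholder data).
import torch
import torch.nn as nn

S = 8                                          # state dim (placeholder)
target = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, 16))
predictor = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, 16))
for p in target.parameters():
    p.requires_grad_(False)                    # the target stays fixed and random
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

def novelty(state):
    # High prediction error => the state is unlike previously seen data.
    return (predictor(state) - target(state)).pow(2).mean(-1)

THRESHOLD = 0.1                                # hypothetical; tuned in practice
state = torch.randn(1, S)
query_expert = novelty(state).item() > THRESHOLD   # trigger an intervention

# Fit the predictor on states the agent has already visited, so familiar
# states stop triggering expert queries over time.
visited = torch.randn(64, S)
loss = novelty(visited).mean()
opt.zero_grad(); loss.backward(); opt.step()
```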
arXiv Detail & Related papers (2024-11-04T08:50:52Z)
- Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning [37.70609910232786]
Action advising endeavors to leverage supplementary guidance from expert teachers to alleviate the issue of sampling inefficiency in Deep Reinforcement Learning (DRL).
Previous agent-specific action advising methods are hindered by imperfections in the agent itself, while agent-agnostic approaches exhibit limited adaptability to the learning agent.
We propose a novel framework called Agent-Aware trAining yet Agent-Agnostic Action Advising (A7) to strike a balance between the two.
arXiv Detail & Related papers (2023-11-28T14:09:43Z)
- A Study of Forward-Forward Algorithm for Self-Supervised Learning [65.268245109828]
We study the performance of forward-forward vs. backpropagation for self-supervised representation learning.
Our main finding is that while the forward-forward algorithm performs comparably to backpropagation during (self-supervised) training, the transfer performance is significantly lagging behind in all the studied settings.
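A minimal sketch of a single forward-forward layer update (after Hinton's formulation; assumed PyTorch, placeholder data, hypothetical threshold): each layer is trained locally to raise its "goodness" on positive examples and lower it on negative ones, with no cross-layer backpropagation.
```python
# Hypothetical forward-forward layer sketch (assumed PyTorch, stand-in data).
import torch
import torch.nn as nn
import torch.nn.functional as F

layer = nn.Linear(784, 256)                    # one layer; each trains locally
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
THETA = 2.0                                    # hypothetical goodness threshold

def goodness(x):
    return F.relu(layer(x)).pow(2).mean(-1)    # mean squared activation

pos = torch.randn(32, 784)                     # stand-in "real" examples
neg = torch.randn(32, 784)                     # stand-in corrupted examples

# Push goodness above THETA on positive data and below it on negative data;
# no gradients flow between layers, unlike backpropagation.
loss = F.softplus(THETA - goodness(pos)).mean() + \
       F.softplus(goodness(neg) - THETA).mean()
opt.zero_grad(); loss.backward(); opt.step()
```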
arXiv Detail & Related papers (2023-09-21T10:14:53Z)
- Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows [58.762959061522736]
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions.
We build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model.
We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms.
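A minimal sketch of the latent-action idea, assuming PyTorch and using a single conditional affine layer as the simplest invertible flow (hypothetical names, not the paper's architecture): the flow is fit to dataset actions by maximum likelihood, and the policy then acts through its latent space.
```python
# Hypothetical latent-action flow sketch (assumed PyTorch, stand-in data).
import torch
import torch.nn as nn

S, A = 8, 4                                     # state / action dims (placeholders)
cond = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, 2 * A))
opt = torch.optim.Adam(cond.parameters(), lr=1e-3)

# Training: fit the flow by maximum likelihood on dataset (s, a) pairs,
# using the inverse map z = (a - mu) / exp(log_sigma) and the
# change-of-variables term (a single conditional affine layer here).
s, a = torch.randn(32, S), torch.randn(32, A)
mu, log_sigma = cond(s).chunk(2, -1)
z = (a - mu) / log_sigma.exp()
nll = (0.5 * z.pow(2) + log_sigma).sum(-1).mean()   # N(0, I) prior + log|det|
opt.zero_grad(); nll.backward(); opt.step()

# Acting: a policy proposes a bounded latent and the flow decodes it, so
# sampled actions stay on the support of the offline dataset.
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A), nn.Tanh())
mu, log_sigma = cond(s).chunk(2, -1)
action = mu + log_sigma.exp() * policy(s)           # z -> a, conditioned on s
```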
arXiv Detail & Related papers (2022-11-20T21:57:10Z)
- Chain of Thought Imitation with Procedure Cloning [129.62135987416164]
We propose procedure cloning, which applies supervised sequence prediction to imitate the series of expert computations.
We show that imitating the intermediate computations of an expert's behavior enables procedure cloning to learn policies exhibiting significant generalization to unseen environment configurations.
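A minimal sketch of the objective, assuming PyTorch with placeholder tokenisation (not the paper's code): a sequence model conditioned on the state is trained with teacher forcing to reproduce the expert's full computation trace, not just the final action.
```python
# Hypothetical procedure-cloning sketch (assumed PyTorch, stand-in tokens).
import torch
import torch.nn as nn

VOCAB, S = 32, 8                 # computation-token vocabulary, state dim
encoder = nn.Linear(S, 64)
decoder = nn.GRU(64, 64, batch_first=True)
head = nn.Linear(64, VOCAB)
embed = nn.Embedding(VOCAB, 64)

state = torch.randn(16, S)
# Expert "procedure": e.g. a search trace ending in the chosen action token.
procedure = torch.randint(0, VOCAB, (16, 10))

h0 = encoder(state).unsqueeze(0)                  # condition on the state
out, _ = decoder(embed(procedure[:, :-1]), h0)    # teacher forcing
logits = head(out)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), procedure[:, 1:].reshape(-1))
# Plain behavioural cloning would supervise only the final action token;
# supervising the whole trace is what drives the reported generalization.
```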
arXiv Detail & Related papers (2022-05-22T13:14:09Z)
- Domain-Robust Visual Imitation Learning with Mutual Information Constraints [0.0]
We introduce a new algorithm called Disentangling Generative Adversarial Imitation Learning (DisentanGAIL).
Our algorithm enables autonomous agents to learn directly from high dimensional observations of an expert performing a task.
arXiv Detail & Related papers (2021-03-08T21:18:58Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
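A minimal sketch of the intervention-based loop, with stand-in helpers for the environment, the teleoperation interface, and the policy (all hypothetical, not the paper's system): whenever the human operator takes over, the corrected (state, action) pairs are added to the dataset the policy is retrained on.
```python
# Hypothetical intervention-based collect-then-retrain loop (stand-ins only).
import random

dataset = []                                   # human-corrected (state, action) pairs

def policy(state):                             # stand-in learned policy
    return random.uniform(-1.0, 1.0)

def human_is_intervening():                    # stand-in for the teleop interface
    return random.random() < 0.1

def human_action(state):                       # stand-in for the operator's command
    return -state * 0.5

for episode in range(5):
    state = 0.0                                # stand-in environment state
    for t in range(100):
        if human_is_intervening():
            action = human_action(state)
            dataset.append((state, action))    # keep only intervention data
        else:
            action = policy(state)
        state += action                        # stand-in dynamics
    # retrain the policy on `dataset` here (behavioural cloning), then repeat
```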
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- SAFARI: Safe and Active Robot Imitation Learning with Imagination [16.967930721746676]
SAFARI is a novel active learning and control algorithm.
It allows the agent to request further human demonstrations when it encounters out-of-distribution situations.
We show how this method enables the agent to autonomously predict failure rapidly and safely.
arXiv Detail & Related papers (2020-11-18T23:43:59Z)
- Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection [80.68446022994492]
In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora.
Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training.
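A minimal sketch of one Noisy Student round, assuming PyTorch with placeholder models and data (not the paper's code): the teacher pseudo-labels unlabelled sequences and a noised student (dropout here) is fit to those labels, becoming the next round's teacher.
```python
# Hypothetical Noisy Student round for sequence labelling (stand-in data).
import torch
import torch.nn as nn

V, C = 100, 2                               # vocab size, label classes
teacher = nn.Sequential(nn.Embedding(V, 32), nn.Linear(32, C))
student = nn.Sequential(nn.Embedding(V, 32), nn.Dropout(0.3), nn.Linear(32, C))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

unlabeled = torch.randint(0, V, (64, 20))   # unlabelled token sequences
with torch.no_grad():
    pseudo = teacher(unlabeled).argmax(-1)  # teacher pseudo-labels

# The student is trained on the pseudo-labels *with noise* (dropout here);
# after convergence it becomes the next round's teacher.
logits = student(unlabeled)
loss = nn.functional.cross_entropy(logits.reshape(-1, C), pseudo.reshape(-1))
opt.zero_grad(); loss.backward(); opt.step()
```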
arXiv Detail & Related papers (2020-10-29T05:29:26Z)
- Imitating Unknown Policies via Exploration [18.78730427200346]
Behavioral cloning is an imitation learning technique that teaches an agent how to behave through expert demonstrations.
Recent approaches use self-supervision of fully-observable unlabeled snapshots of the states to decode state-pairs into actions.
We address these limitations by incorporating a two-phase model into the original framework, which learns from unlabeled observations via exploration.
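A minimal sketch of the two-phase scheme, assuming PyTorch with placeholder data (not the authors' code): an inverse dynamics model fit on the agent's own exploration labels the expert's action-free state pairs, and the policy is behaviourally cloned from those pseudo-labels.
```python
# Hypothetical two-phase sketch: inverse dynamics + behavioural cloning.
import torch
import torch.nn as nn

S, A = 8, 4                                    # state / action dims (placeholders)
idm = nn.Sequential(nn.Linear(2 * S, 64), nn.ReLU(), nn.Linear(64, A))
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
opt = torch.optim.Adam(list(idm.parameters()) + list(policy.parameters()), lr=1e-3)

# Phase 1: fit the inverse dynamics model on the agent's own exploration,
# where the executed actions are known.
s, s2 = torch.randn(32, S), torch.randn(32, S)
acts = torch.randint(0, A, (32,))
idm_loss = nn.functional.cross_entropy(idm(torch.cat([s, s2], -1)), acts)

# Phase 2: pseudo-label the expert's action-free (s, s') snapshots and
# behaviourally clone the policy from them; the two phases then iterate.
es, es2 = torch.randn(32, S), torch.randn(32, S)
with torch.no_grad():
    pseudo = idm(torch.cat([es, es2], -1)).argmax(-1)
bc_loss = nn.functional.cross_entropy(policy(es), pseudo)

opt.zero_grad(); (idm_loss + bc_loss).backward(); opt.step()
```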
arXiv Detail & Related papers (2020-08-13T03:03:35Z)
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
arXiv Detail & Related papers (2020-06-22T10:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.