Self-Supervised Adversarial Imitation Learning
- URL: http://arxiv.org/abs/2304.10914v1
- Date: Fri, 21 Apr 2023 12:12:33 GMT
- Title: Self-Supervised Adversarial Imitation Learning
- Authors: Juarez Monteiro, Nathan Gavenski, Felipe Meneguzzi and Rodrigo C. Barros
- Abstract summary: Behavioural cloning teaches an agent how to behave via expert demonstrations.
Recent approaches use self-supervision over fully-observable, unlabelled snapshots of states to decode state pairs into actions.
Previous work uses goal-aware strategies to escape the bad local minima this iterative scheme is prone to, but these require manual goal verification.
We address this limitation by incorporating a discriminator into the original framework.
- Score: 20.248498544165184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Behavioural cloning is an imitation learning technique that teaches an agent
how to behave via expert demonstrations. Recent approaches use self-supervision
over fully-observable, unlabelled snapshots of states to decode state pairs
into actions. However, the iterative learning scheme employed by these
techniques is prone to getting trapped in bad local minima. Previous work uses
goal-aware strategies to solve this issue. However, this requires manual
intervention to verify whether an agent has reached its goal. We address this
limitation by incorporating a discriminator into the original framework, which
yields three key benefits. First, it disposes of the manual-intervention
requirement. Second, it aids learning by guiding function approximation based
on the state transitions of the expert's trajectories. Third, it solves a
learning issue common in the policy model, whereby the agent sometimes performs
a `no action' within the environment until it finally halts.
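To make the mechanism concrete, here is a minimal sketch, assuming PyTorch, with placeholder dimensions, random stand-in batches, and a REINFORCE-style surrogate for the policy update; it illustrates the idea, not the authors' implementation. The discriminator learns to separate expert state transitions from the agent's, and its score becomes the learning signal that replaces the manual goal check.
```python
# Hypothetical sketch, not the authors' code: assumed PyTorch, placeholder
# dimensions, random stand-in batches, REINFORCE-style policy surrogate.
import torch
import torch.nn as nn

S, A = 8, 4                                   # state / action dims (placeholders)
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
disc = nn.Sequential(nn.Linear(2 * S, 64), nn.ReLU(), nn.Linear(64, 1))
opt_pi = torch.optim.Adam(policy.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    # Stand-ins for expert and agent (s, s') transition batches.
    exp_s, exp_s2 = torch.randn(32, S), torch.randn(32, S)
    agt_s, agt_s2 = torch.randn(32, S), torch.randn(32, S)

    # 1) Discriminator: expert transitions -> 1, agent transitions -> 0.
    d_loss = bce(disc(torch.cat([exp_s, exp_s2], -1)), torch.ones(32, 1)) \
           + bce(disc(torch.cat([agt_s, agt_s2], -1)), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Policy: transitions the discriminator rates as expert-like earn a
    #    higher reward, replacing the manual "has the agent reached its
    #    goal?" check; idle `no action' transitions score poorly because
    #    they do not resemble expert state changes.
    dist = torch.distributions.Categorical(logits=policy(agt_s))
    actions = dist.sample()
    with torch.no_grad():
        reward = torch.sigmoid(disc(torch.cat([agt_s, agt_s2], -1))).squeeze(-1)
    pi_loss = -(dist.log_prob(actions) * reward).mean()
    opt_pi.zero_grad(); pi_loss.backward(); opt_pi.step()
```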
Related papers
- Efficient Active Imitation Learning with Random Network Distillation [8.517915878774756]
Random Network Distillation DAgger (RND-DAgger) is a new active imitation learning method.
It limits expert querying by using a learned state-based out-of-distribution measure to trigger interventions.
We evaluate RND-DAgger against traditional imitation learning and other active approaches in 3D video games and in a robotic task.
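A minimal sketch of the out-of-distribution trigger, assuming PyTorch with hypothetical names such as `novelty` and `THRESHOLD` (not the paper's code): a frozen random network is distilled into a predictor, and high prediction error flags unfamiliar states that warrant querying the expert.
```python
# Hypothetical RND-style trigger sketch (assumed PyTorch, placeholder data).
import torch
import torch.nn as nn

S = 8                                          # state dim (placeholder)
target = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, 16))
predictor = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, 16))
for p in target.parameters():
    p.requires_grad_(False)                    # the target stays fixed and random
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

def novelty(state):
    # High prediction error => the state is unlike previously seen data.
    return (predictor(state) - target(state)).pow(2).mean(-1)

THRESHOLD = 0.1                                # hypothetical; tuned in practice
state = torch.randn(1, S)
query_expert = novelty(state).item() > THRESHOLD   # trigger an intervention

# Fit the predictor on states the agent has already visited, so familiar
# states stop triggering expert queries over time.
visited = torch.randn(64, S)
loss = novelty(visited).mean()
opt.zero_grad(); loss.backward(); opt.step()
```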
arXiv Detail & Related papers (2024-11-04T08:50:52Z)
- Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning [37.70609910232786]
Action advising endeavors to leverage supplementary guidance from expert teachers to alleviate the issue of sampling inefficiency in Deep Reinforcement Learning (DRL).
Previous agent-specific action advising methods are hindered by imperfections in the agent itself, while agent-agnostic approaches exhibit limited adaptability to the learning agent.
We propose a novel framework called Agent-Aware trAining yet Agent-Agnostic Action Advising (A7) to strike a balance between the two.
arXiv Detail & Related papers (2023-11-28T14:09:43Z)
- A Study of Forward-Forward Algorithm for Self-Supervised Learning [65.268245109828]
We study the performance of forward-forward vs. backpropagation for self-supervised representation learning.
Our main finding is that while the forward-forward algorithm performs comparably to backpropagation during (self-supervised) training, the transfer performance is significantly lagging behind in all the studied settings.
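A minimal sketch of a single forward-forward layer update (after Hinton's formulation; assumed PyTorch, placeholder data, hypothetical threshold): each layer is trained locally to raise its "goodness" on positive examples and lower it on negative ones, with no cross-layer backpropagation.
```python
# Hypothetical forward-forward layer sketch (assumed PyTorch, stand-in data).
import torch
import torch.nn as nn
import torch.nn.functional as F

layer = nn.Linear(784, 256)                    # one layer; each trains locally
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
THETA = 2.0                                    # hypothetical goodness threshold

def goodness(x):
    return F.relu(layer(x)).pow(2).mean(-1)    # mean squared activation

pos = torch.randn(32, 784)                     # stand-in "real" examples
neg = torch.randn(32, 784)                     # stand-in corrupted examples

# Push goodness above THETA on positive data and below it on negative data;
# no gradients flow between layers, unlike backpropagation.
loss = F.softplus(THETA - goodness(pos)).mean() + \
       F.softplus(goodness(neg) - THETA).mean()
opt.zero_grad(); loss.backward(); opt.step()
```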
arXiv Detail & Related papers (2023-09-21T10:14:53Z)
- Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows [58.762959061522736]
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions.
We build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model.
We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms.
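A minimal sketch of the latent-action idea, assuming PyTorch and using a single conditional affine layer as the simplest invertible flow (hypothetical names, not the paper's architecture): the flow is fit to dataset actions by maximum likelihood, and the policy then acts through its latent space.
```python
# Hypothetical latent-action flow sketch (assumed PyTorch, stand-in data).
import torch
import torch.nn as nn

S, A = 8, 4                                     # state / action dims (placeholders)
cond = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, 2 * A))
opt = torch.optim.Adam(cond.parameters(), lr=1e-3)

# Training: fit the flow by maximum likelihood on dataset (s, a) pairs,
# using the inverse map z = (a - mu) / exp(log_sigma) and the
# change-of-variables term (a single conditional affine layer here).
s, a = torch.randn(32, S), torch.randn(32, A)
mu, log_sigma = cond(s).chunk(2, -1)
z = (a - mu) / log_sigma.exp()
nll = (0.5 * z.pow(2) + log_sigma).sum(-1).mean()   # N(0, I) prior + log|det|
opt.zero_grad(); nll.backward(); opt.step()

# Acting: a policy proposes a bounded latent and the flow decodes it, so
# sampled actions stay on the support of the offline dataset.
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A), nn.Tanh())
mu, log_sigma = cond(s).chunk(2, -1)
action = mu + log_sigma.exp() * policy(s)           # z -> a, conditioned on s
```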
arXiv Detail & Related papers (2022-11-20T21:57:10Z)
- Chain of Thought Imitation with Procedure Cloning [129.62135987416164]
We propose procedure cloning, which applies supervised sequence prediction to imitate the series of expert computations.
We show that imitating the intermediate computations of an expert's behavior enables procedure cloning to learn policies exhibiting significant generalization to unseen environment configurations.
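A minimal sketch of the objective, assuming PyTorch with placeholder tokenisation (not the paper's code): a sequence model conditioned on the state is trained with teacher forcing to reproduce the expert's full computation trace, not just the final action.
```python
# Hypothetical procedure-cloning sketch (assumed PyTorch, stand-in tokens).
import torch
import torch.nn as nn

VOCAB, S = 32, 8                 # computation-token vocabulary, state dim
encoder = nn.Linear(S, 64)
decoder = nn.GRU(64, 64, batch_first=True)
head = nn.Linear(64, VOCAB)
embed = nn.Embedding(VOCAB, 64)

state = torch.randn(16, S)
# Expert "procedure": e.g. a search trace ending in the chosen action token.
procedure = torch.randint(0, VOCAB, (16, 10))

h0 = encoder(state).unsqueeze(0)                  # condition on the state
out, _ = decoder(embed(procedure[:, :-1]), h0)    # teacher forcing
logits = head(out)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), procedure[:, 1:].reshape(-1))
# Plain behavioural cloning would supervise only the final action token;
# supervising the whole trace is what drives the reported generalization.
```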
arXiv Detail & Related papers (2022-05-22T13:14:09Z)
- Domain-Robust Visual Imitation Learning with Mutual Information Constraints [0.0]
We introduce a new algorithm called Disentangling Generative Adversarial Imitation Learning (DisentanGAIL).
Our algorithm enables autonomous agents to learn directly from high dimensional observations of an expert performing a task.
arXiv Detail & Related papers (2021-03-08T21:18:58Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
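A minimal sketch of the intervention-based loop, with stand-in helpers for the environment, the teleoperation interface, and the policy (all hypothetical, not the paper's system): whenever the human operator takes over, the corrected (state, action) pairs are added to the dataset the policy is retrained on.
```python
# Hypothetical intervention-based collect-then-retrain loop (stand-ins only).
import random

dataset = []                                   # human-corrected (state, action) pairs

def policy(state):                             # stand-in learned policy
    return random.uniform(-1.0, 1.0)

def human_is_intervening():                    # stand-in for the teleop interface
    return random.random() < 0.1

def human_action(state):                       # stand-in for the operator's command
    return -state * 0.5

for episode in range(5):
    state = 0.0                                # stand-in environment state
    for t in range(100):
        if human_is_intervening():
            action = human_action(state)
            dataset.append((state, action))    # keep only intervention data
        else:
            action = policy(state)
        state += action                        # stand-in dynamics
    # retrain the policy on `dataset` here (behavioural cloning), then repeat
```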
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- SAFARI: Safe and Active Robot Imitation Learning with Imagination [16.967930721746676]
SAFARI is a novel active learning and control algorithm.
It allows the agent to request further human demonstrations when it encounters out-of-distribution situations.
We show how this method enables the agent to autonomously predict failure rapidly and safely.
arXiv Detail & Related papers (2020-11-18T23:43:59Z)
- Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection [80.68446022994492]
In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora.
Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training.
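A minimal sketch of one Noisy Student round, assuming PyTorch with placeholder models and data (not the paper's code): the teacher pseudo-labels unlabelled sequences and a noised student (dropout here) is fit to those labels, becoming the next round's teacher.
```python
# Hypothetical Noisy Student round for sequence labelling (stand-in data).
import torch
import torch.nn as nn

V, C = 100, 2                               # vocab size, label classes
teacher = nn.Sequential(nn.Embedding(V, 32), nn.Linear(32, C))
student = nn.Sequential(nn.Embedding(V, 32), nn.Dropout(0.3), nn.Linear(32, C))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

unlabeled = torch.randint(0, V, (64, 20))   # unlabelled token sequences
with torch.no_grad():
    pseudo = teacher(unlabeled).argmax(-1)  # teacher pseudo-labels

# The student is trained on the pseudo-labels *with noise* (dropout here);
# after convergence it becomes the next round's teacher.
logits = student(unlabeled)
loss = nn.functional.cross_entropy(logits.reshape(-1, C), pseudo.reshape(-1))
opt.zero_grad(); loss.backward(); opt.step()
```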
arXiv Detail & Related papers (2020-10-29T05:29:26Z)
- Imitating Unknown Policies via Exploration [18.78730427200346]
Behavioral cloning is an imitation learning technique that teaches an agent how to behave through expert demonstrations.
Recent approaches use self-supervision of fully-observable unlabeled snapshots of the states to decode state-pairs into actions.
We address these limitations by incorporating a two-phase model into the original framework, which learns from unlabeled observations via exploration.
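A minimal sketch of the two-phase scheme, assuming PyTorch with placeholder data (not the authors' code): an inverse dynamics model fit on the agent's own exploration labels the expert's action-free state pairs, and the policy is behaviourally cloned from those pseudo-labels.
```python
# Hypothetical two-phase sketch: inverse dynamics + behavioural cloning.
import torch
import torch.nn as nn

S, A = 8, 4                                    # state / action dims (placeholders)
idm = nn.Sequential(nn.Linear(2 * S, 64), nn.ReLU(), nn.Linear(64, A))
policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
opt = torch.optim.Adam(list(idm.parameters()) + list(policy.parameters()), lr=1e-3)

# Phase 1: fit the inverse dynamics model on the agent's own exploration,
# where the executed actions are known.
s, s2 = torch.randn(32, S), torch.randn(32, S)
acts = torch.randint(0, A, (32,))
idm_loss = nn.functional.cross_entropy(idm(torch.cat([s, s2], -1)), acts)

# Phase 2: pseudo-label the expert's action-free (s, s') snapshots and
# behaviourally clone the policy from them; the two phases then iterate.
es, es2 = torch.randn(32, S), torch.randn(32, S)
with torch.no_grad():
    pseudo = idm(torch.cat([es, es2], -1)).argmax(-1)
bc_loss = nn.functional.cross_entropy(policy(es), pseudo)

opt.zero_grad(); (idm_loss + bc_loss).backward(); opt.step()
```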
arXiv Detail & Related papers (2020-08-13T03:03:35Z)
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
arXiv Detail & Related papers (2020-06-22T10:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.