Adversarial Imitation Learning with Trajectorial Augmentation and
Correction
- URL: http://arxiv.org/abs/2103.13887v2
- Date: Fri, 26 Mar 2021 11:39:16 GMT
- Title: Adversarial Imitation Learning with Trajectorial Augmentation and
Correction
- Authors: Dafni Antotsiou, Carlo Ciliberto and Tae-Kyun Kim
- Abstract summary: We introduce a novel augmentation method which preserves the success of the augmented trajectories.
We develop an adversarial data augmented imitation architecture to train an imitation agent using synthetic experts.
Experiments show that our data augmentation strategy can improve accuracy and convergence time of adversarial imitation.
- Score: 61.924411952657756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Imitation Learning requires a large number of expert demonstrations,
which are not always easy to obtain, especially for complex tasks. A way to
overcome this shortage of labels is through data augmentation. However, this
cannot be easily applied to control tasks due to the sequential nature of the
problem. In this work, we introduce a novel augmentation method which preserves
the success of the augmented trajectories. To achieve this, we introduce a
semi-supervised correction network that aims to correct distorted expert
actions. To adequately test the abilities of the correction network, we develop
an adversarial data augmented imitation architecture to train an imitation
agent using synthetic experts. Additionally, we introduce a metric to measure
diversity in trajectory datasets. Experiments show that our data augmentation
strategy can improve accuracy and convergence time of adversarial imitation
while preserving the diversity between the generated and real trajectories.
Related papers
- Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory [53.37473225728298]
The rapid evolution of deep learning and large language models has led to an exponential growth in the demand for training data.
Matching Training Trajectories (MTT) has been a prominent approach, which replicates the training trajectory of an expert network on real data with a synthetic dataset.
We introduce a novel method called Matching Convexified Trajectory (MCT), which aims to provide better guidance for the student trajectory.
arXiv Detail & Related papers (2024-06-28T11:06:46Z)
- Boosting Model Resilience via Implicit Adversarial Data Augmentation [20.768174896574916]
We propose to augment the deep features of samples by incorporating adversarial and anti-adversarial perturbation distributions.
We then theoretically reveal that our augmentation process approximates the optimization of a surrogate loss function.
We conduct extensive experiments across four common biased learning scenarios.
arXiv Detail & Related papers (2024-04-25T03:22:48Z)
- Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using
Mirror Descent [44.56938629818211]
A fundamental challenge in meta-learning is how to quickly "adapt" the extracted prior in order to train a task-specific model.
Existing approaches deal with this challenge using a preconditioner that enhances convergence of the per-task training process.
The present contribution addresses this limitation by learning a nonlinear mirror map, which induces a versatile distance metric.
arXiv Detail & Related papers (2023-12-20T23:45:06Z)
- Stochastic Vision Transformers with Wasserstein Distance-Aware Attention [8.407731308079025]
Self-supervised learning is one of the most promising approaches to acquiring knowledge from limited labeled data.
We introduce a new vision transformer that integrates uncertainty and distance awareness into self-supervised learning pipelines.
Our proposed method achieves superior accuracy and calibration, surpassing the self-supervised baseline in a wide range of experiments on a variety of datasets.
arXiv Detail & Related papers (2023-11-30T15:53:37Z)
- CONVERT: Contrastive Graph Clustering with Reliable Augmentation [110.46658439733106]
We propose a novel CONtrastiVe Graph ClustEring network with Reliable AugmenTation (CONVERT).
In our method, the data augmentations are processed by the proposed reversible perturb-recover network.
To further guarantee the reliability of semantics, a novel semantic loss is presented to constrain the network.
arXiv Detail & Related papers (2023-08-17T13:07:09Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended
Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Bridging the Imitation Gap by Adaptive Insubordination [88.35564081175642]
We show that when the teaching agent makes decisions with access to privileged information, this information is marginalized during imitation learning.
We propose 'Adaptive Insubordination' (ADVISOR) to address this gap.
ADVISOR dynamically weights imitation and reward-based reinforcement learning losses during training, enabling on-the-fly switching between imitation and exploration.
arXiv Detail & Related papers (2020-07-23T17:59:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.