Combating False Negatives in Adversarial Imitation Learning
- URL: http://arxiv.org/abs/2002.00412v1
- Date: Sun, 2 Feb 2020 14:56:39 GMT
- Title: Combating False Negatives in Adversarial Imitation Learning
- Authors: Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui,
Maxime Chevalier-Boisvert, Dzmitry Bahdanau and Yoshua Bengio
- Abstract summary: In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior.
As the trained policy learns to be more successful, the negative examples become increasingly similar to expert ones.
We propose a method to alleviate the impact of false negatives and test it on the BabyAI environment.
- Score: 67.99941805086154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In adversarial imitation learning, a discriminator is trained to
differentiate agent episodes from expert demonstrations representing the
desired behavior. However, as the trained policy learns to be more successful,
the negative examples (the ones produced by the agent) become increasingly
similar to expert ones. Despite the fact that the task is successfully
accomplished in some of the agent's trajectories, the discriminator is trained
to output low values for them. We hypothesize that this inconsistent training
signal for the discriminator can impede its learning, and consequently leads to
worse overall performance of the agent. We show experimental evidence for this
hypothesis and that the 'False Negatives' (i.e. successful agent episodes)
significantly hinder adversarial imitation learning, which is the first
contribution of this paper. Then, we propose a method to alleviate the impact
of false negatives and test it on the BabyAI environment. This method
consistently improves sample efficiency over the baselines by at least an order
of magnitude.
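The labeling scheme described in the abstract — expert episodes as positives, agent episodes as negatives, with successful agent episodes ("false negatives") relabeled as positives — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the episode format, the `success` flag, and the function name are assumptions.

```python
import numpy as np

def discriminator_targets(episodes, relabel_successful=True):
    """Assign binary discriminator targets: 1.0 for expert episodes,
    0.0 for agent episodes. When relabel_successful is True, agent
    episodes that solved the task (the paper's 'false negatives')
    are treated as positives rather than negatives.

    Episode format and field names are illustrative assumptions,
    not taken from the authors' code.
    """
    targets = []
    for ep in episodes:
        if ep["source"] == "expert":
            targets.append(1.0)
        elif relabel_successful and ep["success"]:
            targets.append(1.0)  # successful agent episode: no longer a negative
        else:
            targets.append(0.0)
    return np.array(targets)

episodes = [
    {"source": "expert", "success": True},
    {"source": "agent", "success": True},   # false negative under naive labeling
    {"source": "agent", "success": False},
]
print(discriminator_targets(episodes))                            # [1. 1. 0.]
print(discriminator_targets(episodes, relabel_successful=False))  # [1. 0. 0.]
```

With `relabel_successful=False` the function reproduces the naive scheme the paper argues against, where the discriminator receives an inconsistent signal for successful agent trajectories.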
Related papers
- Quantile-based Maximum Likelihood Training for Outlier Detection [5.902139925693801]
We introduce a quantile-based maximum likelihood objective for learning the inlier distribution to improve the outlier separation during inference.
Our approach fits a normalizing flow to pre-trained discriminative features and detects the outliers according to the evaluated log-likelihood.
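The detection step described above — score each sample under a density model fit to inlier features and flag low log-likelihood samples as outliers — can be sketched as follows. A diagonal Gaussian stands in for the normalizing flow purely for illustration; the threshold choice and all names are assumptions, not that paper's method.

```python
import numpy as np

# Fit a diagonal Gaussian density to inlier features (a stand-in
# for a normalizing flow, used here only to illustrate the
# likelihood-thresholding step).
def fit_gaussian(features):
    mu = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # avoid division by zero
    return mu, var

def log_likelihood(x, mu, var):
    # Per-sample log-density under the fitted diagonal Gaussian.
    return -0.5 * np.sum((x - mu) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(500, 2))
mu, var = fit_gaussian(inliers)

# Flag samples whose log-likelihood falls below the 1st percentile
# of the inlier scores (an assumed, illustrative threshold).
threshold = np.percentile(log_likelihood(inliers, mu, var), 1)
queries = np.array([[0.0, 0.0], [8.0, 8.0]])
is_outlier = log_likelihood(queries, mu, var) < threshold
print(is_outlier)  # [False  True]
```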
arXiv Detail & Related papers (2023-08-20T22:27:54Z)
- Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning [48.595574101874575]
In the real world, expert demonstrations are more likely to be imperfect.
A positive-unlabeled adversarial imitation learning algorithm is developed.
The agent policy is optimized to fool the discriminator and produce trajectories similar to the optimal expert demonstrations.
arXiv Detail & Related papers (2023-02-13T11:26:44Z)
- Language Model Pre-training on True Negatives [109.73819321246062]
Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones.
Existing PLMs simply treat all corrupted texts as equally negative, without any examination.
We design enhanced pre-training methods to counteract false negative predictions and encourage pre-training language models on true negatives.
arXiv Detail & Related papers (2022-12-01T12:24:19Z)
- Balanced Adversarial Training: Balancing Tradeoffs between Fickleness and Obstinacy in NLP Models [21.06607915149245]
We show that standard adversarial training methods may make a model more vulnerable to fickle adversarial examples.
We introduce Balanced Adversarial Training, which incorporates contrastive learning to increase robustness against both fickle and obstinate adversarial examples.
arXiv Detail & Related papers (2022-10-20T18:02:07Z)
- Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection for self-supervised contrastive learning.
During contrastive learning, we discuss two strategies to explicitly remove the detected false negatives.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks within a limited compute.
arXiv Detail & Related papers (2021-06-07T15:29:14Z)
- AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries [55.059844800514774]
We propose an Adversarial Contrastive (AdCo) model to train representations that are hard to discriminate against positive queries.
Experiment results demonstrate that the proposed Adversarial Contrastive (AdCo) model achieves superior performance.
arXiv Detail & Related papers (2020-11-17T05:45:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.