DiffAIL: Diffusion Adversarial Imitation Learning
- URL: http://arxiv.org/abs/2312.06348v2
- Date: Tue, 12 Dec 2023 03:47:38 GMT
- Title: DiffAIL: Diffusion Adversarial Imitation Learning
- Authors: Bingzheng Wang, Guoqiang Wu, Teng Pang, Yan Zhang, Yilong Yin
- Abstract summary: Imitation learning aims to solve the problem of defining reward functions in real-world decision-making tasks.
We propose a method named diffusion adversarial imitation learning (DiffAIL)
Our method achieves state-of-the-art performance and significantly surpasses expert demonstration on two benchmark tasks.
- Score: 32.90853955228524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning aims to solve the problem of defining reward functions in
real-world decision-making tasks. The current popular approach is the
Adversarial Imitation Learning (AIL) framework, which matches expert
state-action occupancy measures to obtain a surrogate reward for forward
reinforcement learning. However, the traditional discriminator is a simple
binary classifier and doesn't learn an accurate distribution, which may result
in failing to identify expert-level state-action pairs induced by the policy
interacting with the environment. To address this issue, we propose a method
named diffusion adversarial imitation learning (DiffAIL), which introduces the
diffusion model into the AIL framework. Specifically, DiffAIL models the
state-action pairs as unconditional diffusion models and uses diffusion loss as
part of the discriminator's learning objective, which enables the discriminator
to capture better expert demonstrations and improve generalization.
Experimentally, the results show that our method achieves state-of-the-art
performance and significantly surpasses expert demonstration on two benchmark
tasks, including the standard state-action setting and the state-only setting. Our
code is available at https://github.com/ML-Group-SDU/DiffAIL.
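The core idea in the abstract, using an unconditional diffusion model's denoising loss over state-action pairs as the discriminator's signal, can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the toy linear noise schedule and the names `diffusion_loss` and `surrogate_reward` are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffusion_loss(denoiser, x, T=10):
    """Simplified DDPM-style objective: corrupt x at a random diffusion
    step and score how well the denoiser predicts the injected noise.
    Lower loss means x looks more like the data the denoiser was trained
    on (here, expert state-action pairs)."""
    t = int(rng.integers(1, T + 1))
    alpha_bar = 1.0 - t / (T + 1)          # toy noise schedule, not the paper's
    eps = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = denoiser(x_t, t)
    return float(np.mean((eps_hat - eps) ** 2))

def surrogate_reward(denoiser, state, action):
    """Map the diffusion loss to a bounded reward for forward RL:
    expert-like pairs (low loss) receive rewards near 1."""
    x = np.concatenate([state, action])
    return float(np.exp(-diffusion_loss(denoiser, x)))
```

In the actual AIL loop, `denoiser` would be trained adversarially on expert versus policy data, and the resulting reward would drive a standard RL algorithm; the sketch only shows how a generative loss can replace a binary classifier's output.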
Related papers
- Diffusion-Reward Adversarial Imitation Learning [33.81857550294019]
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments.
Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning.
Inspired by the recent dominance of diffusion models in generative modeling, this work proposes Diffusion-Reward Adversarial Imitation Learning (DRAIL).
arXiv Detail & Related papers (2024-05-25T11:53:23Z)
- Model Will Tell: Training Membership Inference for Diffusion Models [15.16244745642374]
Training Membership Inference (TMI) task aims to determine whether a specific sample has been used in the training process of a target model.
In this paper, we explore a novel perspective for the TMI task by leveraging the intrinsic generative priors within the diffusion model.
arXiv Detail & Related papers (2024-03-13T12:52:37Z)
- Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning [51.972577689963714]
Single-demonstration imitation learning (IL) is a practical approach for real-world applications where acquiring multiple expert demonstrations is costly or infeasible.
In contrast to typical IL settings, single-demonstration IL involves an agent having access to only one expert trajectory.
We highlight the issue of sparse reward signals in this setting and propose to mitigate this issue through our proposed Transition Discriminator-based IL (TDIL) method.
arXiv Detail & Related papers (2024-02-01T23:06:19Z)
- Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
- Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset.
We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy.
We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
arXiv Detail & Related papers (2022-08-12T09:54:11Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- Towards Equal Opportunity Fairness through Adversarial Learning [64.45845091719002]
Adversarial training is a common approach for bias mitigation in natural language processing.
We propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features.
arXiv Detail & Related papers (2022-03-12T02:22:58Z)
- Robust Generalization despite Distribution Shift via Minimum Discriminating Information [46.164498176119665]
We introduce a modeling framework where, in addition to training data, we have partial structural knowledge of the shifted test distribution.
We employ the principle of minimum discriminating information to embed the available prior knowledge.
We obtain explicit generalization bounds with respect to the unknown shifted distribution.
arXiv Detail & Related papers (2021-06-08T15:25:35Z)
- Spatial Contrastive Learning for Few-Shot Classification [9.66840768820136]
We propose a novel attention-based spatial contrastive objective to learn locally discriminative and class-agnostic features.
With extensive experiments, we show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-26T23:39:41Z)
- Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification [106.08067870620218]
We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME)
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method is able to achieve superior performances compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-06T12:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.