Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative
Adversarial Nets
- URL: http://arxiv.org/abs/2005.10622v2
- Date: Fri, 22 May 2020 01:05:30 GMT
- Title: Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative
Adversarial Nets
- Authors: Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao,
Hongbo Zhang, Xuewu Ji and Wulong Liu
- Abstract summary: Triple-GAIL is able to learn skill selection and imitation jointly from both expert demonstrations and continuously generated experiences, which serve as data augmentation.
Experiments on real driver trajectories and real-time strategy game datasets demonstrate that Triple-GAIL fits multi-modal behaviors that are closer to the demonstrators'.
- Score: 34.17829944466169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative adversarial imitation learning (GAIL) has shown promising results
by taking advantage of generative adversarial nets, especially in the field of
robot learning. However, the requirement of isolated single-modal
demonstrations limits the scalability of the approach to real-world scenarios
such as autonomous vehicles' demand for a proper understanding of human
drivers' behavior. In this paper, we propose a novel multi-modal GAIL
framework, named Triple-GAIL, that is able to learn skill selection and
imitation jointly from both expert demonstrations and continuously generated
experiences, which serve as data augmentation, by introducing an auxiliary skill
selector. We provide theoretical guarantees on the convergence to optima for
both the generator and the selector. Experiments on real driver
trajectories and real-time strategy game datasets demonstrate that Triple-GAIL
fits multi-modal behaviors that are closer to the demonstrators' and outperforms
state-of-the-art methods.
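The abstract describes Triple-GAIL as GAIL extended with an auxiliary skill selector, so that a skill-conditioned generator, the selector, and a discriminator are trained jointly on expert demonstrations and continuously generated experiences. The sketch below is only a rough illustration of one plausible shape of that three-player update: the network sizes, the random placeholder batches, and the direct differentiable policy update are assumptions made for brevity (the paper's setting would update the policy with a reinforcement-learning step on discriminator-based rewards), not the authors' code.

```python
# Hypothetical sketch of a Triple-GAIL-style three-player update.
# Generator pi(a|s,c) is skill-conditioned, selector q(c|s,a) infers the skill,
# discriminator D(s,a,c) separates expert triples from generated ones.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, N_SKILLS, BATCH = 8, 2, 3, 64  # placeholder sizes

def mlp(n_in, n_out):
    return nn.Sequential(nn.Linear(n_in, 64), nn.Tanh(), nn.Linear(64, n_out))

policy = mlp(STATE_DIM + N_SKILLS, ACTION_DIM)        # generator pi(a|s,c)
selector = mlp(STATE_DIM + ACTION_DIM, N_SKILLS)      # skill selector q(c|s,a)
disc = mlp(STATE_DIM + ACTION_DIM + N_SKILLS, 1)      # discriminator D(s,a,c)

opt_pi = torch.optim.Adam(policy.parameters(), lr=3e-4)
opt_sel = torch.optim.Adam(selector.parameters(), lr=3e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()
eye = torch.eye(N_SKILLS)  # one-hot skill encoding

for step in range(1000):
    # Placeholder batches: expert (s, a, c) triples and generated (s, c) pairs.
    s_e, a_e = torch.randn(BATCH, STATE_DIM), torch.randn(BATCH, ACTION_DIM)
    c_e = torch.randint(0, N_SKILLS, (BATCH,))
    s_g = torch.randn(BATCH, STATE_DIM)
    c_g = torch.randint(0, N_SKILLS, (BATCH,))
    a_g = policy(torch.cat([s_g, eye[c_g]], dim=-1))

    # 1) Discriminator: expert triples are "real", generated triples are "fake".
    d_loss = bce(disc(torch.cat([s_e, a_e, eye[c_e]], -1)), torch.ones(BATCH, 1)) \
           + bce(disc(torch.cat([s_g, a_g.detach(), eye[c_g]], -1)), torch.zeros(BATCH, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Selector: recover the skill label from both expert and generated pairs,
    #    so generated experiences act as data augmentation for skill selection.
    sel_loss = F.cross_entropy(selector(torch.cat([s_e, a_e], -1)), c_e) \
             + F.cross_entropy(selector(torch.cat([s_g, a_g.detach()], -1)), c_g)
    opt_sel.zero_grad(); sel_loss.backward(); opt_sel.step()

    # 3) Generator: fool the discriminator while staying consistent with the
    #    selector's skill prediction on its own rollouts.
    a_g = policy(torch.cat([s_g, eye[c_g]], dim=-1))
    g_loss = bce(disc(torch.cat([s_g, a_g, eye[c_g]], -1)), torch.ones(BATCH, 1)) \
           + F.cross_entropy(selector(torch.cat([s_g, a_g], -1)), c_g)
    opt_pi.zero_grad(); g_loss.backward(); opt_pi.step()
```

The detach() calls keep the discriminator and selector updates from leaking gradients into the policy; the joint objective in which the generator both fools the discriminator and agrees with the selector is the part the paper's convergence guarantees concern.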
Related papers
- Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation [30.33381342502258]
A key challenge is unimodal bias, where multimodal segmentors over-rely on certain modalities, causing performance drops when others are missing.
We develop the first framework for learning a robust segmentor that can handle any combination of visual modalities.
arXiv Detail & Related papers (2024-11-26T06:15:27Z) - DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z) - Drive Anywhere: Generalizable End-to-end Autonomous Driving with
Multi-modal Foundation Models [114.69732301904419]
We present an approach to end-to-end, open-set (any environment/scene) autonomous driving that is capable of providing driving decisions from representations queryable by image and text.
Our approach demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations.
arXiv Detail & Related papers (2023-10-26T17:56:35Z) - Exploiting Modality-Specific Features For Multi-Modal Manipulation
Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z) - Generating Personas for Games with Multimodal Adversarial Imitation
Learning [47.70823327747952]
Reinforcement learning has been widely successful in producing agents capable of playing games at a human level.
Going beyond reinforcement learning is necessary to model a wide range of human playstyles.
This paper presents a novel imitation learning approach to generate multiple persona policies for playtesting.
arXiv Detail & Related papers (2023-08-15T06:58:19Z) - A Two-stage Fine-tuning Strategy for Generalizable Manipulation Skill of
Embodied AI [15.480968464853769]
We propose a novel two-stage fine-tuning strategy to enhance the generalization capability of our model based on the Maniskill2 benchmark.
Our findings highlight the potential of our method to improve the generalization abilities of Embodied AI models and pave the way for their practical applications in real-world scenarios.
arXiv Detail & Related papers (2023-07-21T04:15:36Z) - Generalized Multimodal ELBO [11.602089225841631]
Multiple data types naturally co-occur when describing real-world phenomena, and learning from them is a long-standing goal in machine learning research.
Existing self-supervised generative models approximating an ELBO are not able to fulfill all desired requirements of multimodal models.
We propose a new, generalized ELBO formulation for multimodal data that overcomes these limitations.
arXiv Detail & Related papers (2021-05-06T07:05:00Z) - UPDeT: Universal Multi-agent Reinforcement Learning via Policy
Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the decision process of multi-agent tasks more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.