Quality Diversity Imitation Learning
- URL: http://arxiv.org/abs/2410.06151v1
- Date: Tue, 8 Oct 2024 15:49:33 GMT
- Title: Quality Diversity Imitation Learning
- Authors: Zhenglin Wan, Xingrui Yu, David Mark Bossens, Yueming Lyu, Qing Guo, Flint Xiaofeng Fan, Ivor Tsang
- Abstract summary: We introduce the first generic framework for Quality Diversity Imitation Learning (QD-IL).
Our framework integrates the principles of quality diversity with adversarial imitation learning (AIL) methods, and can potentially improve any inverse reinforcement learning (IRL) method.
Our method even achieves 2x expert performance in the most challenging Humanoid environment.
- Score: 9.627530753815968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning (IL) has shown great potential in various applications, such as robot control. However, traditional IL methods are usually designed to learn only one specific type of behavior since demonstrations typically correspond to a single expert. In this work, we introduce the first generic framework for Quality Diversity Imitation Learning (QD-IL), which enables the agent to learn a broad range of skills from limited demonstrations. Our framework integrates the principles of quality diversity with adversarial imitation learning (AIL) methods, and can potentially improve any inverse reinforcement learning (IRL) method. Empirically, our framework significantly improves the QD performance of GAIL and VAIL on the challenging continuous control tasks derived from Mujoco environments. Moreover, our method even achieves 2x expert performance in the most challenging Humanoid environment.
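The abstract names the recipe (a quality diversity outer loop around an AIL reward) but no concrete mechanism, so the minimal Python sketch below only illustrates that general combination: a GAIL-style discriminator reward plus a MAP-Elites-style archive. All function names, the cell discretization, and the archive layout are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def gail_reward(discriminator, state, action):
    """GAIL-style imitation reward: higher when the discriminator
    believes (state, action) came from the expert."""
    d = discriminator(state, action)  # scalar in (0, 1)
    return -np.log(1.0 - d + 1e-8)

def archive_insert(archive, solution, measure, fitness, resolution=10):
    """MAP-Elites-style update: discretize the behavior measure into a
    cell and keep only the highest-fitness solution per cell."""
    cell = tuple((np.asarray(measure) * resolution).astype(int))
    if cell not in archive or archive[cell][1] < fitness:
        archive[cell] = (solution, fitness)
```

A QD-IL loop in this spirit would train policies against `gail_reward` while inserting them into the archive keyed by their behavior measures, so quality and diversity are optimized together.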
Related papers
- Is Diversity All You Need for Scalable Robotic Manipulation? [50.747150672933316]
We investigate the nuanced role of data diversity in robot learning by examining three critical dimensions: task (what to do), embodiment (which robot to use), and expert (who demonstrates), challenging the conventional intuition that "more diverse is better".
We show that task diversity proves more critical than per-task demonstration quantity, benefiting transfer from diverse pre-training tasks to novel downstream scenarios.
We propose a distribution debiasing method to mitigate velocity ambiguity; the resulting GO-1-Pro achieves substantial performance gains of 15%, equivalent to using 2.5 times the pre-training data.
arXiv Detail & Related papers (2025-07-08T17:52:44Z) - SkillBlender: Towards Versatile Humanoid Whole-Body Loco-Manipulation via Skill Blending [79.83865372778273]
We introduce SkillBlender, a novel hierarchical reinforcement learning framework for versatile humanoid loco-manipulation.
SkillBlender first pretrains goal-conditioned task-agnostic primitive skills, and then dynamically blends these skills to accomplish complex loco-manipulation tasks.
We also introduce SkillBench, a parallel, cross-embodiment, and diverse simulated benchmark containing three embodiments, four primitive skills, and eight challenging loco-manipulation tasks.
arXiv Detail & Related papers (2025-06-11T03:24:26Z) - Learning Adaptive Dexterous Grasping from Single Demonstrations [27.806856958659054]
This work tackles two key challenges: efficient skill acquisition from limited human demonstrations and context-driven skill selection.
AdaDexGrasp learns a library of grasping skills from a single human demonstration per skill and selects the most suitable one using a vision-language model (VLM).
We evaluate AdaDexGrasp in both simulation and real-world settings, showing that our approach significantly improves RL efficiency and enables learning human-like grasp strategies across varied object configurations.
arXiv Detail & Related papers (2025-03-26T04:05:50Z) - MoE-Loco: Mixture of Experts for Multitask Locomotion [52.04025933292957]
We present MoE-Loco, a framework for multitask locomotion for legged robots.
Our method enables a single policy to handle diverse terrains, while supporting quadrupedal and bipedal gaits (see the generic sketch below).
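The abstract does not detail the architecture; the sketch shows only the textbook mixture-of-experts pattern such a policy could follow, with the gate and expert interfaces being assumptions rather than MoE-Loco's exact design.

```python
import numpy as np

def moe_action(gate, experts, obs):
    """Generic mixture-of-experts policy: a gating network weights the
    expert sub-policies and the action is their weighted combination."""
    weights = gate(obs)                            # softmax weights, shape (n_experts,)
    actions = np.stack([e(obs) for e in experts])  # shape (n_experts, action_dim)
    return weights @ actions                       # weighted sum over experts
```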
arXiv Detail & Related papers (2025-03-11T15:53:54Z) - Imitation from Diverse Behaviors: Wasserstein Quality Diversity Imitation Learning with Single-Step Archive Exploration [37.836675202590406]
This work introduces Wasserstein Quality Diversity Imitation Learning (WQDIL).
WQDIL improves the stability of imitation learning in the quality diversity setting with latent adversarial training based on a Wasserstein Auto-Encoder (WAE).
It also mitigates a behavior-overfitting issue using a measure-conditioned reward function with a single-step archive exploration bonus.
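As a hedged reading of that reward, the sketch below combines a measure-conditioning term with a count-based single-step exploration bonus; the functional forms and coefficients are assumptions, not the paper's.

```python
import numpy as np

def wqdil_style_reward(base_reward, measure, target_measure,
                       visit_counts, cell, alpha=1.0, beta=0.1):
    """Illustrative measure-conditioned reward with an archive bonus:
    conditioning pulls behavior toward a target measure; the bonus
    pays more for rarely visited archive cells."""
    conditioning = -alpha * np.linalg.norm(np.asarray(measure) - np.asarray(target_measure))
    bonus = beta / np.sqrt(visit_counts.get(cell, 0) + 1.0)
    return base_reward + conditioning + bonus
```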
arXiv Detail & Related papers (2024-11-11T13:11:18Z) - Latent-Predictive Empowerment: Measuring Empowerment without a Simulator [56.53777237504011]
We present Latent-Predictive Empowerment (LPE), an algorithm that can compute empowerment in a more practical manner.
LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states.
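For context, empowerment is conventionally defined as the mutual information between a skill variable and the states it reaches; the standard objective reads:

```latex
% Standard skill-state mutual information objective (for context only;
% LPE maximizes a principled replacement for this quantity):
\max_{\pi}\; I(Z; S') \;=\; \max_{\pi}\; \mathcal{H}(Z) - \mathcal{H}(Z \mid S')
```

LPE's latent-predictive surrogate for this objective is what removes the need for a simulator; see the paper for its exact form.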
arXiv Detail & Related papers (2024-10-15T00:41:18Z) - Explorative Imitation Learning: A Path Signature Approach for Continuous Environments [9.416194245966022]
Continuous Imitation Learning from Observation (CILO) is a new method augmenting imitation learning with two important features.
CILO's exploration allows for more diverse state transitions, requiring fewer expert trajectories and resulting in fewer training iterations.
It has the best overall performance of all imitation learning methods in all environments, outperforming the expert in two of them.
arXiv Detail & Related papers (2024-07-05T20:25:39Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts [58.220879689376744]
Reinforcement learning (RL) is a powerful approach for acquiring a well-performing policy.
We propose Diverse Skill Learning (Di-SkilL) for learning diverse skills.
We show on challenging robot simulation tasks that Di-SkilL can learn diverse and performant skills.
arXiv Detail & Related papers (2024-03-11T17:49:18Z) - Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL) [1.0057319866872687]
This paper presents a method called Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL).
CMZ-DRIL uses reinforcement learning to minimize uncertainty among an ensemble of agents trained to model the expert demonstrations.
As demonstrated in a waypoint-navigation environment and in two MuJoCo environments, CMZ-DRIL can generate performant agents that behave more similarly to the expert.
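The abstract spells out the signal but not the formula. A minimal sketch, assuming ensemble action variance as the disagreement measure and batch centering as the mean-zero step:

```python
import numpy as np

def disagreement_costs(ensemble, states):
    """Uncertainty among ensemble members: variance of the actions the
    demonstration-trained policies propose for each state."""
    return np.array([
        np.stack([policy(s) for policy in ensemble]).var(axis=0).mean()
        for s in states
    ])

def mean_zero_rewards(costs):
    """Continuous, mean-zero reward: agreeing with the ensemble (low
    disagreement) is rewarded; centering makes the batch mean zero."""
    rewards = -costs
    return rewards - rewards.mean()
```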
arXiv Detail & Related papers (2024-03-02T01:40:37Z) - Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control [106.32794844077534]
This paper presents a study on using deep reinforcement learning to create dynamic locomotion controllers for bipedal robots.
We develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.
This work pushes the limits of agility for bipedal robots through extensive real-world experiments.
arXiv Detail & Related papers (2024-01-30T10:48:43Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
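Read literally, the intervention signal can be encoded as a per-step penalty; the constants below are an assumption consistent with the abstract, not a quote of the paper's exact reward.

```python
def rlif_style_reward(intervened: bool) -> float:
    """Intervention-as-reward: the step on which the human takes over is
    penalized, every other step is neutral (illustrative constants)."""
    return -1.0 if intervened else 0.0
```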
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - A Central Motor System Inspired Pre-training Reinforcement Learning for Robotic Control [7.227887302864789]
We propose CMS-PRL, a pre-training reinforcement learning method inspired by the Central Motor System.
First, we introduce a fusion reward mechanism that combines the basic motor reward with mutual information reward.
Second, we design a skill encoding method inspired by the motor program of the basal ganglia, providing rich and continuous skill instructions.
Third, we propose a skill activity function to regulate motor skill activity, enabling the generation of skills with different activity levels.
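The fusion reward described in the first point can be sketched as a weighted sum; the weighting scheme and both reward interfaces are assumptions for illustration, not CMS-PRL's definition.

```python
def fusion_reward(motor_reward: float, mi_reward: float, w: float = 0.5) -> float:
    """Hypothetical fusion of the basic motor reward with a mutual
    information (skill discovery) reward, as the abstract describes."""
    return (1.0 - w) * motor_reward + w * mi_reward
```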
arXiv Detail & Related papers (2023-11-14T00:49:12Z) - CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture [20.627616015484648]
Existing imitation learning approaches for robots still grapple with sub-optimal performance in complex tasks.
Heuristically, we extend the usual notion of action to a dual Cognition (high-level)-Action (low-level) architecture.
We propose a novel skill IL framework through human-robot interaction, called Cognition-Action-based Skill Imitation Learning (CasIL).
arXiv Detail & Related papers (2023-09-28T09:53:05Z) - Skill Disentanglement for Imitation Learning from Suboptimal
Demonstrations [60.241144377865716]
We consider the imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set.
We propose a method that evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills.
arXiv Detail & Related papers (2023-06-13T17:24:37Z) - Versatile Skill Control via Self-supervised Adversarial Imitation of
Unlabeled Mixed Motions [19.626042478612572]
We propose a cooperative adversarial method for obtaining versatile policies with controllable skill sets from unlabeled datasets.
We show that by utilizing unsupervised skill discovery in the generative imitation learning framework, novel and useful skills emerge with successful task fulfillment.
Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8, faithfully replicating the diverse skills encoded in the demonstrations.
arXiv Detail & Related papers (2022-09-16T12:49:04Z) - Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z) - Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion [19.285099263193622]
We develop a learning framework that can learn sophisticated locomotion behavior for a wide spectrum of legged robots.
Our learning framework relies on a data-efficient, off-policy multi-task RL algorithm and a small set of reward functions that are semantically identical across robots.
For nine different types of robots, including a real-world quadruped robot, we demonstrate that the same algorithm can rapidly learn diverse and reusable locomotion skills.
arXiv Detail & Related papers (2020-08-06T08:23:55Z) - Soft Expert Reward Learning for Vision-and-Language Navigation [94.86954695912125]
Vision-and-Language Navigation (VLN) requires an agent to find a specified spot in an unseen environment by following natural language instructions.
We introduce a Soft Expert Reward Learning (SERL) model to overcome the reward engineering and generalisation problems of the VLN task.
arXiv Detail & Related papers (2020-07-21T14:17:36Z) - Learning Agile Robotic Locomotion Skills by Imitating Animals [72.36395376558984]
Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics.
We present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals.
arXiv Detail & Related papers (2020-04-02T02:56:16Z)