Transferring Hierarchical Structure with Dual Meta Imitation Learning
- URL: http://arxiv.org/abs/2201.11981v1
- Date: Fri, 28 Jan 2022 08:22:38 GMT
- Title: Transferring Hierarchical Structure with Dual Meta Imitation Learning
- Authors: Chongkai Gao, Yizhou Jiang, Feng Chen
- Abstract summary: We propose a hierarchical meta imitation learning method where the high-level network and sub-skills are iteratively meta-learned with model-agnostic meta-learning.
We achieve state-of-the-art few-shot imitation learning performance on the Meta-world benchmark and competitive results on long-horizon tasks in Kitchen environments.
- Score: 4.868214177205893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical Imitation Learning (HIL) is an effective way for robots to learn
sub-skills from long-horizon unsegmented demonstrations. However, the learned
hierarchical structure lacks a mechanism for transferring across multiple tasks or to
new tasks, so it must be learned from scratch in each new situation.
Transferring and reorganizing modular sub-skills requires fast
adaptation of the whole hierarchical structure. In this work, we
propose Dual Meta Imitation Learning (DMIL), a hierarchical meta imitation
learning method where the high-level network and sub-skills are iteratively
meta-learned with model-agnostic meta-learning. DMIL uses the likelihood of
state-action pairs from each sub-skill as the supervision for the high-level
network adaptation, and uses the adapted high-level network to determine
a distinct data set for each sub-skill's adaptation. We theoretically prove the
convergence of the iterative training process of DMIL and establish the
connection between DMIL and the Expectation-Maximization (EM) algorithm. Empirically, we
achieve state-of-the-art few-shot imitation learning performance on the
Meta-world benchmark and competitive results on long-horizon
tasks of Kitchen environments.
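The iterative scheme in the abstract can be made concrete. Below is a minimal sketch of one DMIL-style inner-loop iteration, assuming unit-variance Gaussian sub-skill policies, hard argmax skill assignment, and single SGD adaptation steps; the class names and hyperparameters are illustrative, and the MAML outer (meta) update over tasks is omitted.

```python
# Minimal sketch of one DMIL-style inner-loop iteration (illustrative,
# not the paper's code). Assumptions: unit-variance Gaussian sub-skills,
# hard argmax skill assignment, single SGD adaptation steps.
import torch
import torch.nn as nn
import torch.nn.functional as F

K, STATE_DIM, ACT_DIM = 4, 10, 3   # sub-skills, state size, action size

class SubSkill(nn.Module):
    """Low-level policy; predicts the mean of a unit-variance Gaussian."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM))
    def log_prob(self, s, a):
        # log N(a | mu(s), I), up to an additive constant
        return -0.5 * ((a - self.net(s)) ** 2).sum(-1)

class HighLevel(nn.Module):
    """Selector over the K sub-skills, conditioned on the state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(STATE_DIM, K)
    def forward(self, s):
        return F.log_softmax(self.net(s), dim=-1)   # log p(k | s)

skills = [SubSkill() for _ in range(K)]
high = HighLevel()
opt_high = torch.optim.SGD(high.parameters(), lr=0.1)
opt_skills = [torch.optim.SGD(sk.parameters(), lr=0.1) for sk in skills]

states = torch.randn(32, STATE_DIM)    # one unsegmented demonstration batch
actions = torch.randn(32, ACT_DIM)

# Step 1: every sub-skill scores each state-action pair; these likelihoods
# supervise the high-level network (target = the best-explaining skill).
with torch.no_grad():
    ll = torch.stack([sk.log_prob(states, actions) for sk in skills], -1)
opt_high.zero_grad()
F.nll_loss(high(states), ll.argmax(-1)).backward()
opt_high.step()                        # high-level adaptation step

# Step 2: the adapted high-level network partitions the batch; each
# sub-skill adapts by behavior cloning on the pairs routed to it.
with torch.no_grad():
    assign = high(states).argmax(-1)
for k, (sk, opt) in enumerate(zip(skills, opt_skills)):
    mask = assign == k
    if mask.any():
        opt.zero_grad()
        (-sk.log_prob(states[mask], actions[mask]).mean()).backward()
        opt.step()                     # sub-skill adaptation step
```

Step 1 plays the role of an E-step (assigning responsibility for each state-action pair) and Step 2 an M-step (fitting each component to its assigned data), which is the Expectation-Maximization connection the abstract refers to.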
Related papers
- Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models [81.74999702045339]
Multi-Level Optimal Transport (MultiLevelOT) is a novel approach that advances optimal transport for universal cross-tokenizer knowledge distillation.
Our method aligns the logit distributions of the teacher and the student at both token and sequence levels.
At the token level, MultiLevelOT integrates both global and local information by jointly optimizing all tokens within a sequence to enhance robustness.
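As a rough illustration of the token-level alignment idea (the general shape only; MultiLevelOT's actual cost design and sequence-level term differ), an entropy-regularized transport plan between sorted teacher and student token distributions can be computed with Sinkhorn iterations:

```python
# Generic Sinkhorn-based alignment of two logit distributions from
# different-sized vocabularies. Sorting the probabilities to make them
# comparable by rank is a simplification; the cost choice is an assumption.
import torch

def sinkhorn(cost, a, b, eps=0.1, iters=100):
    """Entropy-regularized OT plan with marginals a (n,) and b (m,)."""
    K = torch.exp(-cost / eps)
    u, v = torch.ones_like(a), torch.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

teacher = torch.softmax(torch.randn(100), 0)   # teacher vocab of 100 "tokens"
student = torch.softmax(torch.randn(80), 0)    # student vocab of 80 "tokens"
p = teacher.sort(descending=True).values
q = student.sort(descending=True).values
cost = (p[:, None] - q[None, :]) ** 2          # simple squared-difference cost
plan = sinkhorn(cost, p, q)
ot_loss = (plan * cost).sum()                  # transport cost as a KD loss
```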
arXiv Detail & Related papers (2024-12-19T04:51:06Z)
- ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning [49.447777286862994]
ConML is a universal meta-learning framework that can be applied to various meta-learning algorithms.
We demonstrate that ConML integrates seamlessly with optimization-based, metric-based, and amortization-based meta-learning algorithms.
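The task-level contrastive idea can be sketched in a few lines: embeddings of models adapted on two different views (e.g., support-set splits) of the same task form positive pairs, and adapted models from other tasks serve as negatives. How ConML actually embeds adapted models is paper-specific; this is only the standard InfoNCE form.

```python
# Task-level contrastive loss in standard InfoNCE form; the step that
# embeds adapted models into z1/z2 is a ConML-specific detail assumed here.
import torch
import torch.nn.functional as F

def task_contrastive_loss(z1, z2, tau=0.1):
    """z1, z2: (num_tasks, dim) embeddings of models adapted on two views
    of the same task; row i of z1 and row i of z2 form the positive pair."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / tau                  # (T, T) cosine similarities
    labels = torch.arange(z1.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, labels)

loss = task_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```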
arXiv Detail & Related papers (2024-10-08T12:22:10Z)
- Meta-Learning via Classifier(-free) Guidance [5.812784742024491]
State-of-the-art meta-learning techniques do not optimize for zero-shot adaptation to unseen tasks.
We propose meta-learning techniques that use natural language guidance to achieve higher zero-shot performance.
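The guidance rule the title alludes to is the standard classifier-free guidance interpolation between a conditioned and an unconditioned prediction; how the paper applies it to producing task-adapted networks is specific to the paper, and the model interface below is an assumption.

```python
# Generic classifier-free guidance: blend unconditional and text-conditioned
# predictions at inference time. DummyModel's interface is illustrative.
import torch
import torch.nn as nn

class DummyModel(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.null = nn.Parameter(torch.zeros(dim))   # learned "no condition"
        self.proj = nn.Linear(2 * dim, dim)
    def forward(self, x, cond=None):
        c = self.null.expand_as(x) if cond is None else cond
        return self.proj(torch.cat([x, c], dim=-1))

def cfg_predict(model, x, text_emb, guidance_scale=3.0):
    uncond = model(x, cond=None)          # unconditional prediction
    cond = model(x, cond=text_emb)        # text-conditioned prediction
    return uncond + guidance_scale * (cond - uncond)

model = DummyModel()
out = cfg_predict(model, torch.randn(4, 16), torch.randn(4, 16))
```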
arXiv Detail & Related papers (2022-10-17T11:09:35Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
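The "temporal ensemble" ingredient is an exponential moving average (EMA) copy of the meta-learner whose predictions are distilled back into it. The sketch below shows only that ingredient; SiMT's target adaptation step and loss weighting are omitted.

```python
# EMA momentum target plus a distillation term, as a simplified rendering
# of the momentum-target idea (SiMT-specific details omitted).
import copy
import torch
import torch.nn.functional as F

learner = torch.nn.Linear(8, 4)
target = copy.deepcopy(learner)            # temporal ensemble of the learner

@torch.no_grad()
def ema_update(target, learner, m=0.99):
    for pt, pl in zip(target.parameters(), learner.parameters()):
        pt.mul_(m).add_(pl, alpha=1 - m)

x = torch.randn(16, 8)
with torch.no_grad():
    soft = F.softmax(target(x), dim=-1)    # momentum target's predictions
distill = F.kl_div(F.log_softmax(learner(x), dim=-1), soft,
                   reduction="batchmean")
distill.backward()                         # add to the usual meta-objective
ema_update(target, learner)                # refresh the target
```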
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
Our approach, Meta-Learning with Task Interpolation (MLTI), effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
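The sampling-and-interpolation step is essentially mixup applied across tasks, as below (a minimal sketch; MLTI also interpolates hidden representations when labels are not directly mixable).

```python
# Mixup-style task interpolation: sample two tasks and mix their features
# and one-hot labels to synthesize an extra training task.
import torch
import torch.nn.functional as F

def interpolate_tasks(x1, y1, x2, y2, num_classes, alpha=0.5):
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x = lam * x1 + (1 - lam) * x2
    y = lam * F.one_hot(y1, num_classes).float() \
        + (1 - lam) * F.one_hot(y2, num_classes).float()
    return x, y     # a synthesized task with soft labels

x_new, y_new = interpolate_tasks(
    torch.randn(10, 32), torch.randint(0, 5, (10,)),
    torch.randn(10, 32), torch.randint(0, 5, (10,)), num_classes=5)
```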
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
- MetaGater: Fast Learning of Conditional Channel Gated Networks via Federated Meta-Learning [46.79356071007187]
We propose a holistic approach to jointly train the backbone network and the channel gating.
We develop a federated meta-learning approach to jointly learn good meta-initializations for both backbone networks and gating modules.
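A conditional channel-gating block of the kind being jointly trained looks like the following (a standard squeeze-and-gate construction; the federated meta-learning of the initializations, which is the paper's contribution, is not shown).

```python
# A standard conditional channel-gating block: a small gate network scales
# (or effectively switches off) the channels of a conv layer, and backbone
# and gate are trained jointly.
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(c_in, c_out), nn.Sigmoid())
    def forward(self, x):
        g = self.gate(x)                        # per-channel gates in (0, 1)
        return self.conv(x) * g[:, :, None, None]

y = GatedConv(16, 32)(torch.randn(2, 16, 8, 8))  # -> (2, 32, 8, 8)
```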
arXiv Detail & Related papers (2020-11-25T04:26:23Z)
- Meta-Learning of Structured Task Distributions in Humans and Machines [15.34209852089588]
We show that evaluating meta-learning remains a challenge: an evaluation can miss whether the meta-learner actually uses the structure embedded within the tasks.
We train a standard meta-learning agent, a recurrent network trained with model-free reinforcement learning, and compare it with human performance.
We find a double dissociation in which humans do better in the structured task distribution whereas agents do better in the null task distribution.
arXiv Detail & Related papers (2020-10-05T20:18:10Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
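One way to see how a single perturbation module can apply across layers and architectures is to have it act on per-channel statistics, making it agnostic to channel count; this is only an illustrative shape, not MetaPerturb's actual set-function.

```python
# Illustrative layer-agnostic feature perturbation: one small module reused
# at every layer, operating on per-channel statistics so it works for any
# channel count (MetaPerturb's real set-function and meta-training differ).
import torch
import torch.nn as nn

class SharedPerturb(nn.Module):
    def __init__(self, hidden=8):
        super().__init__()
        # maps each channel's (mean, std) to a multiplicative noise scale
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Softplus())
    def forward(self, x):                      # x: (N, C, H, W), any C
        stats = torch.stack([x.mean((2, 3)), x.std((2, 3))], -1)  # (N, C, 2)
        scale = self.net(stats).squeeze(-1)    # (N, C)
        noise = 1 + scale[:, :, None, None] * torch.randn_like(x)
        return x * noise

perturb = SharedPerturb()                      # shared across all layers
out = perturb(torch.randn(2, 16, 8, 8))
```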
arXiv Detail & Related papers (2020-06-13T02:54:59Z)
- Automated Relational Meta-learning [95.02216511235191]
We propose an automated relational meta-learning framework that automatically extracts the cross-task relations and constructs the meta-knowledge graph.
We conduct extensive experiments on 2D toy regression and few-shot image classification and the results demonstrate the superiority of ARML over state-of-the-art baselines.
arXiv Detail & Related papers (2020-01-03T07:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.