Information Maximizing Curriculum: A Curriculum-Based Approach for
Imitating Diverse Skills
- URL: http://arxiv.org/abs/2303.15349v2
- Date: Tue, 31 Oct 2023 14:21:19 GMT
- Title: Information Maximizing Curriculum: A Curriculum-Based Approach for
Imitating Diverse Skills
- Authors: Denis Blessing, Onur Celik, Xiaogang Jia, Moritz Reuss, Maximilian
Xiling Li, Rudolf Lioutikov, Gerhard Neumann
- Score: 14.685043874797742
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning uses data for training policies to solve complex tasks.
However, when the training data is collected from human demonstrators, it often
leads to multimodal distributions because of the variability in human actions.
Most imitation learning methods rely on a maximum likelihood (ML) objective to
learn a parameterized policy, but this can result in suboptimal or unsafe
behavior due to the mode-averaging property of the ML objective. In this work,
we propose Information Maximizing Curriculum, a curriculum-based approach that
assigns a weight to each data point and encourages the model to specialize in
the data it can represent, effectively mitigating the mode-averaging problem by
allowing the model to ignore data from modes it cannot represent. To cover all
modes and thus enable diverse behavior, we extend our approach to a mixture of
experts (MoE) policy, where each mixture component selects its own subset of
the training data for learning. A novel, maximum entropy-based objective is
proposed to achieve full coverage of the dataset, thereby enabling the policy
to encompass all modes within the data distribution. We demonstrate the
effectiveness of our approach on complex simulated control tasks using diverse
human demonstrations, achieving superior performance compared to
state-of-the-art methods.
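The mode-averaging problem and the specialization idea can be illustrated with a toy sketch. The weighting rule and update below are illustrative stand-ins (a soft responsibility-style assignment over two Gaussian experts), not the authors' exact curriculum or entropy objective: a single maximum-likelihood model averages two action modes into an action no demonstrator took, while experts that reweight data by how well they explain it specialize, one per mode.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy bimodal "demonstrations": two action modes around -2 and +2.
data = np.concatenate([rng.normal(-2.0, 0.3, 200), rng.normal(2.0, 0.3, 200)])

# Plain maximum likelihood with a single Gaussian averages the modes:
ml_mean = data.mean()  # close to 0, an action no demonstrator took

# Illustrative specialization loop (not the paper's exact algorithm):
# each expert weights data points by how well it explains them, then
# refits only on its weighted subset, so experts specialize per mode.
means, stds = np.array([-0.5, 0.5]), np.array([1.0, 1.0])
for _ in range(50):
    # Per-expert log-likelihood of every data point.
    ll = -0.5 * ((data[None, :] - means[:, None]) / stds[:, None]) ** 2 \
         - np.log(stds[:, None])
    # Softmax over experts: soft curriculum weights per data point.
    w = np.exp(ll - ll.max(axis=0))
    w /= w.sum(axis=0)
    # Each expert refits on the data it claims.
    means = (w * data).sum(axis=1) / w.sum(axis=1)
    stds = np.sqrt((w * (data - means[:, None]) ** 2).sum(axis=1)
                   / w.sum(axis=1))

print(ml_mean, sorted(means))
```

The single-model mean lands between the modes, whereas the two experts converge near -2 and +2 respectively.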
Related papers
- PoCo: Policy Composition from and for Heterogeneous Robot Learning [44.1315170137613]
Current methods usually collect and pool all data from one domain to train a single policy.
We present a flexible approach, dubbed Policy Composition, to combine information across diverse modalities and domains.
Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time.
arXiv Detail & Related papers (2024-02-04T14:51:49Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
In the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
In the instance level, we measure the difficulty of all instances per task, then divide them into the easy-to-difficult mini-batches for training.
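The instance-level step above can be sketched in a few lines. The difficulty measure and batching scheme here are illustrative assumptions, not Data-CUBE's actual procedure: score each instance, sort ascending, then cut into mini-batches so training sees easy examples first.

```python
def curriculum_batches(instances, difficulty, batch_size):
    """Order instances easy-to-difficult, then split into mini-batches."""
    ranked = sorted(instances, key=difficulty)
    return [ranked[i:i + batch_size]
            for i in range(0, len(ranked), batch_size)]

# Example: treat token count as a crude difficulty proxy (an assumption
# for illustration only).
sents = ["a b c d e", "a", "a b c", "a b", "a b c d"]
batches = curriculum_batches(sents, lambda s: len(s.split()), batch_size=2)
print(batches)  # [['a', 'a b'], ['a b c', 'a b c d'], ['a b c d e']]
```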
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Learning Unseen Modality Interaction [54.23533023883659]
Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences.
We pose the problem of unseen modality interaction and introduce a first solution.
It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved.
arXiv Detail & Related papers (2023-06-22T10:53:10Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - Curriculum-Based Imitation of Versatile Skills [15.97723808124603]
Learning skills by imitation is a promising concept for the intuitive teaching of robots.
A common way to learn such skills is to learn a parametric model by maximizing the likelihood given the demonstrations.
Yet, human demonstrations are often multi-modal, i.e., the same task is solved in multiple ways.
arXiv Detail & Related papers (2023-04-11T12:10:41Z) - Multi-model Ensemble Learning Method for Human Expression Recognition [31.76775306959038]
We propose our solution based on the ensemble learning method to capture large amounts of real-life data.
We conduct many experiments on the AffWild2 dataset of the ABAW2022 Challenge, and the results demonstrate the effectiveness of our solution.
arXiv Detail & Related papers (2022-03-28T03:15:06Z) - Evaluating model-based planning and planner amortization for continuous
control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
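Distilling a planner into a policy can be sketched as supervised regression. The "planner" below is a toy stand-in (a known control law), not the paper's MPC setup: label states with planner actions offline, then fit a cheap policy to those labels so inference no longer needs planning.

```python
import numpy as np

def planner(state):
    # Stand-in for an expensive MPC solve; here simply a linear control law.
    return -0.5 * state

rng = np.random.default_rng(1)
states = rng.uniform(-1, 1, size=(500, 1))
actions = np.array([planner(s) for s in states])

# Distilled policy: least-squares fit of action = w * state.
w, *_ = np.linalg.lstsq(states, actions, rcond=None)
print(float(w[0, 0]))  # recovers the planner's gain of -0.5
```

Because the toy data is noiseless and linear, the regression recovers the planner exactly; in practice the distilled policy only amortizes the planner as well as its function class and data coverage allow.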
arXiv Detail & Related papers (2021-10-07T12:00:40Z) - Maximum Likelihood Estimation for Multimodal Learning with Missing
Modality [10.91899856969822]
We propose an efficient approach based on maximum likelihood estimation to incorporate the knowledge in the modality-missing data.
Our results demonstrate the effectiveness of the proposed approach, even when 95% of the training data has missing modality.
arXiv Detail & Related papers (2021-08-24T03:50:54Z) - Double Meta-Learning for Data Efficient Policy Optimization in
Non-Stationary Environments [12.45281856559346]
We are interested in learning models of non-stationary environments, which can be framed as a multi-task learning problem.
Model-free reinforcement learning algorithms can achieve good performance in multi-task learning at a cost of extensive sampling.
While model-based approaches are among the most data efficient learning algorithms, they still struggle with complex tasks and model uncertainties.
arXiv Detail & Related papers (2020-11-21T03:19:35Z) - Meta-Reinforcement Learning Robust to Distributional Shift via Model
Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
arXiv Detail & Related papers (2020-06-12T13:34:46Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.