Learning from demonstration using products of experts: applications to
manipulation and task prioritization
- URL: http://arxiv.org/abs/2010.03505v1
- Date: Wed, 7 Oct 2020 16:24:41 GMT
- Title: Learning from demonstration using products of experts: applications to
manipulation and task prioritization
- Authors: Emmanuel Pignat, João Silvério and Sylvain Calinon
- Abstract summary: We show that the fusion of models in different task spaces can be expressed as a product of experts (PoE).
Multiple experiments are presented to show that learning the different models jointly in the PoE framework significantly improves the quality of the model.
- Score: 12.378784643460474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probability distributions are key components of many learning from
demonstration (LfD) approaches. While the configuration of a manipulator is
defined by its joint angles, poses are often best explained within several task
spaces. In many approaches, distributions within relevant task spaces are
learned independently and only combined at the control level. This
simplification implies several problems that are addressed in this work. We
show that the fusion of models in different task spaces can be expressed as a
product of experts (PoE), where the probabilities of the models are multiplied
and renormalized so that it becomes a proper distribution of joint angles.
Multiple experiments are presented to show that learning the different models
jointly in the PoE framework significantly improves the quality of the model.
The proposed approach particularly stands out when the robot has to learn
competitive or hierarchical objectives. Training the model jointly usually
relies on contrastive divergence, which requires costly approximations that can
affect performance. We propose an alternative strategy using variational
inference and mixture model approximations. In particular, we show that the
proposed approach can be extended to PoE with a nullspace structure (PoENS),
where the model is able to recover tasks that are masked by the resolution of
higher-level objectives.
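As an illustration of the PoE idea described in the abstract, the sketch below (not the authors' implementation) builds an unnormalized PoE log-density over the joint angles of a hypothetical 2-link planar arm, with Gaussian experts defined in two task spaces (end-effector and elbow positions). The link lengths, expert parameters, function names, and the crude sampling-based maximization are all illustrative assumptions.

```python
import numpy as np

# Link lengths of a hypothetical 2-link planar arm (illustrative values).
L1, L2 = 1.0, 0.8

def fkin_ee(q):
    """Forward kinematics of the end-effector position (task space 1)."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def fkin_elbow(q):
    """Forward kinematics of the elbow position (task space 2)."""
    return np.array([L1 * np.cos(q[0]), L1 * np.sin(q[0])])

def gaussian_logpdf(x, mu, cov):
    """Gaussian log-density, up to an additive normalizing constant."""
    d = x - mu
    return -0.5 * d @ np.linalg.solve(cov, d)

# Each expert is a Gaussian in its own task space (hand-set here; learned in LfD).
experts = [
    (fkin_ee,    np.array([1.2, 0.6]), 0.05 * np.eye(2)),
    (fkin_elbow, np.array([0.7, 0.7]), 0.20 * np.eye(2)),
]

def poe_log_density(q):
    """Unnormalized PoE log-density over joint angles:
    log p(q) = sum_k log p_k(f_k(q)) + const."""
    return sum(gaussian_logpdf(f(q), mu, cov) for f, mu, cov in experts)

# Crude random-search maximization, standing in for the gradient-based and
# variational procedures used in the paper.
rng = np.random.default_rng(0)
samples = rng.uniform(-np.pi, np.pi, size=(20000, 2))
best = max(samples, key=poe_log_density)
print("q* =", best, " end-effector:", fkin_ee(best))
```

In the paper, the experts are learned jointly rather than independently, using variational inference with mixture model approximations in place of contrastive divergence, and the PoENS variant adds a nullspace structure so that lower-priority experts only shape the distribution in the directions left free by higher-priority ones.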
Related papers
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated to be a cheap and scalable strategy for constructing a multi-task model that performs well across diverse tasks.
We propose the CONtinuous relaxation of disCRETE (Concrete) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to track the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models [22.472167814814448]
We propose a new model-based imitation learning algorithm named Separated Model-based Adversarial Imitation Learning (SeMAIL).
Our method achieves near-expert performance on various visual control tasks with complex observations, as well as on more challenging tasks whose backgrounds differ from those of the expert observations.
arXiv Detail & Related papers (2023-06-19T04:33:44Z)
- Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC).
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z)
- Ensemble Making Few-Shot Learning Stronger [4.17701749612924]
This paper explores an ensemble approach to reduce the variance and introduces fine-tuning and feature attention strategies to calibrate relation-level features.
Results on several few-shot relation learning tasks show that our model significantly outperforms the previous state-of-the-art models.
arXiv Detail & Related papers (2021-05-12T17:11:10Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.