From Demonstrations to Task-Space Specifications: Using Causal Analysis
to Extract Rule Parameterization from Demonstrations
- URL: http://arxiv.org/abs/2006.11300v1
- Date: Mon, 8 Jun 2020 00:21:13 GMT
- Title: From Demonstrations to Task-Space Specifications: Using Causal Analysis
to Extract Rule Parameterization from Demonstrations
- Authors: Daniel Angelov, Yordan Hristov, Subramanian Ramamoorthy
- Abstract summary: We show that it is possible to learn generative models for distinct user behavioural types extracted from human demonstrations.
We use these models to differentiate between user types and to find cases with overlapping solutions.
Our method successfully identifies the correct type, within the specified time, in 99% [97.8 - 99.8] of the cases, which outperforms an IRL baseline.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning models of user behaviour is an important problem that is broadly
applicable across many application domains requiring human-robot interaction.
In this work, we show that it is possible to learn generative models for
distinct user behavioural types, extracted from human demonstrations, by
enforcing clustering of preferred task solutions within the latent space. We
use these models to differentiate between user types and to find cases with
overlapping solutions. Moreover, we can alter an initially guessed solution to
satisfy the preferences that constitute a particular user type by
backpropagating through the learned differentiable models. An advantage of
structuring generative models in this way is that we can extract causal
relationships between symbols that might form part of the user's specification
of the task, as manifested in the demonstrations. We further parameterize these
specifications through constraint optimization in order to find a safety
envelope under which motion planning can be performed. We show that the
proposed method is capable of correctly distinguishing between three user
types, who differ in degrees of cautiousness in their motion, while performing
the task of moving objects with a kinesthetically driven robot in a tabletop
environment. Our method successfully identifies the correct type, within the
specified time, in 99% [97.8 - 99.8] of the cases, which outperforms an IRL
baseline. We also show that our proposed method correctly changes a default
trajectory to one satisfying a particular user specification even with unseen
objects. The resulting trajectory is shown to be directly implementable on a
PR2 humanoid robot completing the same task.
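The abstract's idea of altering an initially guessed trajectory by backpropagating through a learned differentiable model, subject to a safety envelope, can be illustrated with a toy sketch. This is not the authors' code: the learned user-type model is replaced by a hand-written "cautiousness" loss, the obstacle, margin, and all parameters are invented, and autodiff is approximated by finite differences.

```python
import numpy as np

def cautiousness_loss(traj, obstacle, margin):
    """Penalty for waypoints inside the safety margin around an obstacle,
    plus a small smoothness term. A hand-written stand-in for a learned
    differentiable user-type model."""
    dist = np.linalg.norm(traj - obstacle, axis=1)
    clearance_penalty = np.sum(np.maximum(0.0, margin - dist) ** 2)
    smoothness = np.sum(np.diff(traj, axis=0) ** 2)
    return clearance_penalty + 0.1 * smoothness

def num_grad(f, x, eps=1e-5):
    """Finite-difference gradient, standing in for backpropagation."""
    base = f(x)
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        xp = x.copy()
        xp[idx] += eps
        g[idx] = (f(xp) - base) / eps
    return g

obstacle = np.array([0.5, 0.0])
margin = 0.3  # the "safety envelope" radius in this toy setup
# Initial guess: a straight line that passes through the obstacle.
traj = np.linspace([0.0, 0.0], [1.0, 0.0], 11)

for _ in range(300):
    g = num_grad(lambda t: cautiousness_loss(t, obstacle, margin), traj)
    traj[1:-1] -= 0.05 * g[1:-1]  # gradient step; endpoints stay fixed

min_clearance = np.linalg.norm(traj - obstacle, axis=1).min()
print(round(min_clearance, 2))  # well above the initial clearance of 0.0
```

Descending the loss bends the waypoints around the obstacle until they roughly respect the margin, which is the same mechanism, in miniature, as pushing a default trajectory toward a user type's preferred region.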
Related papers
- Instruction-Following Pruning for Large Language Models [58.329978053711024]
We move beyond the traditional static pruning approach of determining a fixed pruning mask for a model.
In our method, the pruning mask is input-dependent and adapts dynamically based on the information described in a user instruction.
Our approach, termed "instruction-following pruning", introduces a sparse mask predictor that takes the user instruction as input and dynamically selects the most relevant model parameters for the given task.
arXiv Detail & Related papers (2025-01-03T20:19:14Z)
- Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
- NegMerge: Consensual Weight Negation for Strong Machine Unlearning [21.081262106431506]
Machine unlearning aims to selectively remove specific knowledge from a model.
Current methods rely on fine-tuning models on the forget set, generating a task vector, and subtracting it from the original model.
We propose a novel method that leverages all given fine-tuned models rather than selecting a single one.
arXiv Detail & Related papers (2024-10-08T00:50:54Z)
- Merging Multi-Task Models via Weight-Ensembling Mixture of Experts [64.94129594112557]
Merging Transformer-based models trained on different tasks yields a single unified model that can execute all the tasks concurrently.
Previous methods, exemplified by task arithmetic, have been proven to be both effective and scalable.
We propose to merge most of the parameters while upscaling the Transformer layers to a weight-ensembling mixture of experts (MoE) module.
arXiv Detail & Related papers (2024-02-01T08:58:57Z)
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models that were fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the Concrete (CONtinuous relaxation of disCRETE) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to tackle the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
- Integrating LLMs and Decision Transformers for Language Grounded Generative Quality-Diversity [0.0]
Quality-Diversity is a branch of optimization that is often applied to problems from the Reinforcement Learning and control domains.
We propose using a Large Language Model to augment the repertoire with natural language descriptions of trajectories.
We also propose an LLM-based approach to evaluating the performance of such generative agents.
arXiv Detail & Related papers (2023-08-25T10:00:06Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Learning Models as Functionals of Signed-Distance Fields for Manipulation Planning [51.74463056899926]
This work proposes an optimization-based manipulation planning framework where the objectives are learned functionals of signed-distance fields that represent objects in the scene.
We show that representing objects as signed-distance fields enables learning and representing a variety of models with higher accuracy than point-cloud and occupancy measure representations.
arXiv Detail & Related papers (2021-10-02T12:36:58Z)
- Active Preference Learning using Maximum Regret [10.317601896290467]
We study active preference learning as a framework for intuitively specifying the behaviour of autonomous robots.
In active preference learning, a user chooses the preferred behaviour from a set of alternatives, from which the robot learns the user's preferences.
arXiv Detail & Related papers (2020-05-08T14:31:31Z)
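The preference-learning loop described in the last entry, where a user repeatedly picks the preferred behaviour and the robot infers the underlying preferences, can be sketched with a minimal example. This is not the paper's maximum-regret method: a simulated user deterministically picks the higher-reward option, the robot fits a Bradley-Terry choice model by stochastic gradient ascent, and all names and parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])  # hidden user preference weights (reward = w . features)
w = np.zeros(2)                 # robot's running estimate

def user_choice(fa, fb):
    """Simulated user: picks the behaviour with the higher true reward."""
    return 0 if true_w @ fa > true_w @ fb else 1

for _ in range(500):
    # Two candidate behaviours, summarized by random feature vectors.
    fa, fb = rng.normal(size=2), rng.normal(size=2)
    pick = user_choice(fa, fb)
    # Bradley-Terry model: P(a preferred) = sigmoid(w . (fa - fb)).
    diff = fa - fb
    p_a = 1.0 / (1.0 + np.exp(-w @ diff))
    y = 1.0 if pick == 0 else 0.0
    w += 0.1 * (y - p_a) * diff  # gradient ascent on the log-likelihood

cosine = (w @ true_w) / (np.linalg.norm(w) * np.linalg.norm(true_w))
print(round(cosine, 2))  # estimate's direction aligns with the hidden weights
```

After a few hundred queries the estimated weight direction closely matches the hidden preference, which is the core mechanism active preference learning builds on; the maximum-regret paper additionally chooses *which* pair to show the user so that each query is maximally informative.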
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.