Generalized Inverse Planning: Learning Lifted non-Markovian Utility for Generalizable Task Representation
- URL: http://arxiv.org/abs/2011.09854v1
- Date: Thu, 12 Nov 2020 21:06:26 GMT
- Title: Generalized Inverse Planning: Learning Lifted non-Markovian Utility for Generalizable Task Representation
- Authors: Sirui Xie and Feng Gao and Song-Chun Zhu
- Abstract summary: In this work, we study learning such utility from human demonstrations.
We propose a new quest, Generalized Inverse Planning, for utility learning in this domain.
We outline a computational framework, Maximum Entropy Inverse Planning (MEIP), that learns non-Markovian utility and associated concepts in a generative manner.
- Score: 83.55414555337154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In searching for a generalizable representation of temporally extended tasks, we identify two necessary constituents: the utility needs to be non-Markovian, to transfer temporal relations invariant to probability shift; and it needs to be lifted, to abstract out specific grounding objects. In this work, we study learning such utility from human demonstrations. While inverse reinforcement learning (IRL) has been accepted as a general framework for utility learning, its fundamental formulation is a single concrete Markov Decision Process, so the learned reward function does not specify the task independently of the environment. Going beyond that, we define a domain of generalization that spans a set of planning problems following a schema. We
hence propose a new quest, Generalized Inverse Planning, for utility learning
in this domain. We further outline a computational framework, Maximum Entropy
Inverse Planning (MEIP), that learns non-Markovian utility and associated
concepts in a generative manner. The learned utility and concepts form a task
representation that generalizes regardless of probability shift or structural
change. Seeing that the proposed generalization problem has not been widely
studied yet, we carefully define an evaluation protocol, with which we
illustrate the effectiveness of MEIP on two proof-of-concept domains and one
challenging task: learning to fold from demonstrations.
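The abstract names the maximum-entropy family but does not reproduce the objective itself. As a hedged reading only, in our own notation rather than the paper's, a MaxEnt model with non-Markovian, lifted utility scores whole trajectories rather than per-step transitions:

```latex
% Our notation, not the paper's: \phi_k are lifted, non-Markovian concepts
% evaluated on a whole trajectory \tau; w_k are their utility weights.
U_w(\tau) = \sum_k w_k\,\phi_k(\tau), \qquad
p_w(\tau) \propto \exp\!\big(U_w(\tau)\big), \qquad
\hat{w} = \arg\max_w \sum_{\tau \in \mathcal{D}} \log p_w(\tau)
```

Because each concept is evaluated on the full sequence (non-Markovian) and over abstract object roles rather than specific groundings (lifted), a representation of this form can survive the probability shifts and structural changes the abstract describes.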
Related papers
- Disentangling Representations through Multi-task Learning [0.0]
We provide experimental and theoretical results guaranteeing the emergence of disentangled representations in agents that optimally solve classification tasks.
We experimentally validate these predictions in RNNs trained on multi-task classification.
We find that transformers are particularly suited for disentangling representations, which might explain their unique world-understanding abilities.
arXiv Detail & Related papers (2024-07-15T21:32:58Z) - Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework.
It automatically decomposes the given task into smaller, more manageable subtasks.
It enables sparse decision-making and focused abstractions on the relevant parts of the environment (a toy decomposition sketch follows this entry).
arXiv Detail & Related papers (2023-09-30T02:25:18Z) - Leveraging sparse and shared feature activations for disentangled
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real-world distribution-shift benchmarks and different data modalities (a generic sparsity sketch follows this entry).
arXiv Detail & Related papers (2023-04-17T01:33:24Z) - Synergies between Disentanglement and Sparsity: Generalization and
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem (a generic template is sketched after this entry).
arXiv Detail & Related papers (2022-11-26T21:02:09Z) - Discovering Generalizable Spatial Goal Representations via Graph-based
- Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning [17.58129740811116]
We propose a reward learning approach, Graph-based Equivalence Mappings (GEM); a data-structure sketch follows this entry.
GEM represents a spatial goal specification by a reward function conditioned on i) a graph indicating important spatial relationships between objects and ii) state equivalence mappings for each edge in the graph.
We show that GEM can drastically improve the generalizability of the learned goal representations over strong baselines.
arXiv Detail & Related papers (2022-11-24T18:59:06Z) - Inferring Versatile Behavior from Demonstrations by Matching Geometric
- Inferring Versatile Behavior from Demonstrations by Matching Geometric Descriptors [72.62423312645953]
Humans intuitively solve tasks in versatile ways, varying their behavior both in trajectory-level planning and in individual steps.
Current Imitation Learning algorithms often only consider unimodal expert demonstrations and act in a state-action-based setting.
Instead, we combine a mixture of movement primitives with a distribution matching objective to learn versatile behaviors that match the expert's behavior and versatility.
arXiv Detail & Related papers (2022-10-17T16:42:59Z) - Evolving Domain Generalization [14.072505551647813]
We formulate and study the evolving domain generalization (EDG) scenario, which exploits not only the source data but also their evolving pattern to generate a model for the unseen task.
Our theoretical result reveals the benefits of modeling the relation between two consecutive tasks by learning a globally consistent directional mapping function.
In practice, our analysis also suggests solving the EDG problem in a meta-learning manner, which leads to the directional network, the first method for the EDG problem (a toy rendering of the directional-mapping idea follows this entry).
arXiv Detail & Related papers (2022-05-31T18:28:15Z) - Provably Efficient Causal Model-Based Reinforcement Learning for
- Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization [30.456180468318305]
In the sequential decision making setting, an agent aims to achieve systematic generalization over a large, possibly infinite, set of environments.
In this paper, we provide a tractable formulation of systematic generalization by employing a causal viewpoint.
Under specific structural assumptions, we provide a simple learning algorithm that guarantees any desired planning error up to an unavoidable sub-optimality term.
arXiv Detail & Related papers (2022-02-14T08:34:51Z) - DisCo RL: Distribution-Conditioned Reinforcement Learning for
General-Purpose Policies [116.12670064963625]
We develop an off-policy algorithm called distribution-conditioned reinforcement learning (DisCo RL) to efficiently learn contextual policies (the conditioning interface is sketched after this entry).
We evaluate DisCo RL on a variety of robot manipulation tasks and find that it significantly outperforms prior methods on tasks that require generalization to new goal distributions.
arXiv Detail & Related papers (2021-04-23T16:51:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.