Options of Interest: Temporal Abstraction with Interest Functions
- URL: http://arxiv.org/abs/2001.00271v1
- Date: Wed, 1 Jan 2020 21:24:39 GMT
- Title: Options of Interest: Temporal Abstraction with Interest Functions
- Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert,
Pierre-Luc Bacon, Doina Precup
- Abstract summary: We provide a generalization of initiation sets suitable for general function approximation, by defining an interest function associated with an option.
We derive a gradient-based learning algorithm for interest functions, leading to a new interest-option-critic architecture.
- Score: 58.30081828754683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal abstraction refers to the ability of an agent to use behaviours of
controllers which act for a limited, variable amount of time. The options
framework describes such behaviours as consisting of a subset of states in
which they can initiate, an internal policy and a stochastic termination
condition. However, much of the subsequent work on option discovery has ignored
the initiation set because of the difficulty of learning it from data. We provide
a generalization of initiation sets suitable for general function
approximation, by defining an interest function associated with an option. We
derive a gradient-based learning algorithm for interest functions, leading to a
new interest-option-critic architecture. We investigate how interest functions
can be leveraged to learn interpretable and reusable temporal abstractions. We
demonstrate the efficacy of the proposed approach through quantitative and
qualitative results, in both discrete and continuous environments.
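The abstract describes interest functions as a soft generalization of initiation sets: instead of an option being strictly available or unavailable in a state, each option carries a state-dependent interest value that reweights the policy over options. The sketch below illustrates that reweighting idea in NumPy; the function name and the uniform base policy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def interest_weighted_option_probs(pi_omega, interest):
    """Reweight a policy over options by per-option interest values.

    pi_omega: base probabilities over options in the current state, shape (n,)
    interest: interest-function values in [0, 1] for that state, shape (n,)
    Returns the normalized interest-weighted distribution over options.
    """
    weighted = pi_omega * interest
    return weighted / weighted.sum()

# A classical initiation set is recovered as the special case of a
# binary interest function: options with interest 0 cannot initiate here.
pi_omega = np.array([0.25, 0.25, 0.25, 0.25])
interest = np.array([1.0, 0.0, 1.0, 0.0])
probs = interest_weighted_option_probs(pi_omega, interest)
# probs → [0.5, 0.0, 0.5, 0.0]
```

Because the interest values can vary continuously between 0 and 1 and depend on learnable parameters, they admit the gradient-based learning the abstract refers to, unlike hard set membership.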
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Leveraging Prior Knowledge in Reinforcement Learning via Double-Sided Bounds on the Value Function [4.48890356952206]
We show how an arbitrary approximation for the value function can be used to derive double-sided bounds on the optimal value function of interest.
We extend the framework with error analysis for continuous state and action spaces.
arXiv Detail & Related papers (2023-02-19T21:47:24Z)
- Sequential Decision Making on Unmatched Data using Bayesian Kernel Embeddings [10.75801980090826]
We propose a novel algorithm for maximizing the expectation of a function.
We take into consideration the uncertainty derived from the estimation of both the conditional distribution of the features and the unknown function.
Our algorithm empirically outperforms the current state-of-the-art algorithm in the experiments conducted.
arXiv Detail & Related papers (2022-10-25T01:27:29Z)
- Attention Option-Critic [56.50123642237106]
We propose an attention-based extension to the option-critic framework.
We show that this leads to behaviorally diverse options which are also capable of state abstraction.
We also demonstrate the more efficient, interpretable, and reusable nature of the learned options in comparison with option-critic.
arXiv Detail & Related papers (2022-01-07T18:44:28Z)
- Diversity-Enriched Option-Critic [47.82697599507171]
We show that our proposed method is capable of learning options end-to-end on several discrete and continuous control tasks.
Our approach generates robust, reusable, reliable and interpretable options, in contrast to option-critic.
arXiv Detail & Related papers (2020-11-04T22:12:54Z)
- Deep Inverse Q-learning with Constraints [15.582910645906145]
We introduce a novel class of algorithms that only needs to solve the MDP underlying the demonstrated behavior once to recover the expert policy.
We show how to extend this class of algorithms to continuous state-spaces via function approximation and how to estimate a corresponding action-value function.
We evaluate the resulting algorithms called Inverse Action-value Iteration, Inverse Q-learning and Deep Inverse Q-learning on the Objectworld benchmark.
arXiv Detail & Related papers (2020-08-04T17:21:51Z)
- Inferring Temporal Compositions of Actions Using Probabilistic Automata [61.09176771931052]
We propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata.
Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences.
arXiv Detail & Related papers (2020-04-28T00:15:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.