Options of Interest: Temporal Abstraction with Interest Functions
- URL: http://arxiv.org/abs/2001.00271v1
- Date: Wed, 1 Jan 2020 21:24:39 GMT
- Title: Options of Interest: Temporal Abstraction with Interest Functions
- Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert,
Pierre-Luc Bacon, Doina Precup
- Abstract summary: We provide a generalization of initiation sets suitable for general function approximation, by defining an interest function associated with an option.
We derive a gradient-based learning algorithm for interest functions, leading to a new interest-option-critic architecture.
- Score: 58.30081828754683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal abstraction refers to the ability of an agent to use behaviours of
controllers which act for a limited, variable amount of time. The options
framework describes such behaviours as consisting of a subset of states in
which they can initiate, an internal policy and a stochastic termination
condition. However, much of the subsequent work on option discovery has ignored
the initiation set, because of difficulty in learning it from data. We provide
a generalization of initiation sets suitable for general function
approximation, by defining an interest function associated with an option. We
derive a gradient-based learning algorithm for interest functions, leading to a
new interest-option-critic architecture. We investigate how interest functions
can be leveraged to learn interpretable and reusable temporal abstractions. We
demonstrate the efficacy of the proposed approach through quantitative and
qualitative results, in both discrete and continuous environments.
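As a rough illustration of the idea in the abstract, the sketch below shows one way an interest function could reweight a policy over options, with a hard initiation set recovered as the special case of 0/1 interest. The class and parameter names (InterestOptionSelector, policy_params, interest_params) are illustrative assumptions, and the reweighting form is a plausible reading of the abstract, not the paper's implementation.

```python
import numpy as np

# Hypothetical sketch: an interest function I_w(s) in (0, 1) generalizes an
# option's initiation set. A hard initiation set corresponds to I_w(s) being
# exactly 0 or 1; a learned, differentiable interest function instead softly
# reweights the policy over options.

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

class InterestOptionSelector:
    """Toy option selection with interest reweighting (illustrative names)."""

    def __init__(self, n_options, state_dim, rng=None):
        self.rng = rng or np.random.default_rng(0)
        # Linear parameters for the policy over options and the interest functions.
        self.policy_params = self.rng.normal(size=(n_options, state_dim))
        self.interest_params = self.rng.normal(size=(n_options, state_dim))

    def interest(self, state):
        # Sigmoid keeps each option's interest in (0, 1).
        return 1.0 / (1.0 + np.exp(-self.interest_params @ state))

    def option_probs(self, state):
        pi_omega = softmax(self.policy_params @ state)  # policy over options
        weighted = self.interest(state) * pi_omega      # reweight by interest
        return weighted / weighted.sum()                # renormalize

    def sample_option(self, state):
        return self.rng.choice(len(self.policy_params), p=self.option_probs(state))


selector = InterestOptionSelector(n_options=4, state_dim=8)
state = 0.1 * np.ones(8)
print(selector.option_probs(state), selector.sample_option(state))
```

Because the interest parameters are differentiable, a gradient-based update of the kind the abstract describes could in principle be applied to interest_params alongside the usual option-critic updates; the actual interest-option-critic gradient is derived in the paper itself.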
Related papers
- Autonomous Option Invention for Continual Hierarchical Reinforcement Learning and Planning [21.737035951695887]
This paper presents a novel approach for inventing, representing, and utilizing options.
Our approach addresses problems characterized by long horizons, sparse rewards, and unknown transition and reward functions.
Our main contributions are approaches for continually learning transferable, generalizable options with symbolic representations.
arXiv Detail & Related papers (2024-12-20T23:04:52Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Sequential Decision Making on Unmatched Data using Bayesian Kernel Embeddings [10.75801980090826]
We propose a novel algorithm for maximizing the expectation of a function.
We take into consideration the uncertainty derived from the estimation of both the conditional distribution of the features and the unknown function.
Our algorithm empirically outperforms the current state of the art in the experiments conducted.
arXiv Detail & Related papers (2022-10-25T01:27:29Z)
- Attention Option-Critic [56.50123642237106]
We propose an attention-based extension to the option-critic framework.
We show that this leads to behaviorally diverse options which are also capable of state abstraction.
We also demonstrate that the learned options are more efficient, interpretable, and reusable than those of option-critic.
arXiv Detail & Related papers (2022-01-07T18:44:28Z)
- Diversity-Enriched Option-Critic [47.82697599507171]
We show that our proposed method is capable of learning options end-to-end on several discrete and continuous control tasks.
Our approach generates robust, reusable, reliable and interpretable options, in contrast to option-critic.
arXiv Detail & Related papers (2020-11-04T22:12:54Z)
- Deep Inverse Q-learning with Constraints [15.582910645906145]
We introduce a novel class of algorithms that only needs to solve the MDP underlying the demonstrated behavior once to recover the expert policy.
We show how to extend this class of algorithms to continuous state-spaces via function approximation and how to estimate a corresponding action-value function.
We evaluate the resulting algorithms called Inverse Action-value Iteration, Inverse Q-learning and Deep Inverse Q-learning on the Objectworld benchmark.
arXiv Detail & Related papers (2020-08-04T17:21:51Z)
- Inferring Temporal Compositions of Actions Using Probabilistic Automata [61.09176771931052]
We propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata.
Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences.
arXiv Detail & Related papers (2020-04-28T00:15:26Z)
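As a rough illustration of the last entry above, the sketch below scores an action sequence against a temporal composition ("reach, then grasp, then lift") encoded as a small probabilistic automaton. The states, actions, and transition probabilities are invented for the example and are not taken from the paper.

```python
# Toy probabilistic automaton for a composition like "reach then grasp then lift".
# States, actions, and probabilities are illustrative only.

TRANSITIONS = {
    # (state, action) -> list of (next_state, probability)
    ("start", "reach"): [("reached", 0.9), ("start", 0.1)],
    ("reached", "grasp"): [("grasped", 0.8), ("reached", 0.2)],
    ("grasped", "lift"): [("done", 0.95), ("grasped", 0.05)],
}
ACCEPTING = {"done"}


def sequence_probability(actions, start="start"):
    """Forward pass: total probability mass that reaches an accepting state."""
    beliefs = {start: 1.0}
    for action in actions:
        next_beliefs = {}
        for state, mass in beliefs.items():
            for next_state, p in TRANSITIONS.get((state, action), []):
                next_beliefs[next_state] = next_beliefs.get(next_state, 0.0) + mass * p
        beliefs = next_beliefs
    return sum(mass for state, mass in beliefs.items() if state in ACCEPTING)


print(sequence_probability(["reach", "grasp", "lift"]))  # high probability
print(sequence_probability(["grasp", "lift", "reach"]))  # zero: wrong order
```

In the paper's video setting the inputs would presumably be per-frame action probabilities rather than the clean action labels used here.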