Temporally Abstract Partial Models
- URL: http://arxiv.org/abs/2108.03213v1
- Date: Fri, 6 Aug 2021 17:26:21 GMT
- Title: Temporally Abstract Partial Models
- Authors: Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, Doina Precup
- Abstract summary: We develop temporally abstract partial option models that take into account the fact that an option might be affordable only in certain situations.
We analyze the trade-offs between estimation and approximation error in planning and learning when using such models.
- Score: 62.12485855601448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans and animals have the ability to reason and make predictions about
different courses of action at many time scales. In reinforcement learning,
option models (Sutton, Precup & Singh, 1999; Precup, 2000) provide the
framework for this kind of temporally abstract prediction and reasoning.
Natural intelligent agents are also able to focus their attention on courses of
action that are relevant or feasible in a given situation, sometimes termed
affordable actions. In this paper, we define a notion of affordances for
options and develop temporally abstract partial option models that take into
account the fact that an option might be affordable only in certain situations.
We analyze the trade-offs between estimation and approximation error in
planning and learning when using such models, and identify some interesting
special cases. Additionally, we demonstrate empirically the potential impact of
partial option models on the efficiency of planning.
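For concreteness, here is a minimal Python sketch (not the authors' code; the names `PartialOptionModel`, `affordable`, and `backup` are illustrative) of planning with partial option models: each option's model predicts the accumulated reward, termination state, and effective discount of running the option to completion, but is consulted only in states where the option is affordable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable

State = Hashable

@dataclass
class PartialOptionModel:
    affordable: Callable[[State], bool]   # affordance: where this option applies
    reward: Dict[State, float]            # predicted cumulative reward on completion
    next_state: Dict[State, State]        # predicted termination state
    discount: Dict[State, float]          # gamma^k for the option's expected duration k

def backup(state: State, values: Dict[State, float],
           models: Dict[str, PartialOptionModel]) -> float:
    """One temporally abstract value backup restricted to affordable options."""
    candidates = [
        m.reward[state] + m.discount[state] * values[m.next_state[state]]
        for m in models.values() if m.affordable(state)
    ]
    # If no option is affordable in this state, keep the current estimate.
    return max(candidates, default=values[state])
```

The restriction to affordable options is where the estimation/approximation trade-off from the abstract shows up: a partial model has fewer (state, option) pairs to estimate, at the price of ignoring options outside their affordances during planning.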
Related papers
- On the Efficient Marginalization of Probabilistic Sequence Models [3.5897534810405403]
This dissertation focuses on using autoregressive models to answer complex probabilistic queries.
We develop a class of novel, model-agnostic approximation techniques for efficient marginalization in sequential models; a toy example of the exact marginalization these techniques approximate is sketched after this list.
arXiv Detail & Related papers (2024-03-06T19:29:08Z)
- Limits of Model Selection under Transfer Learning [18.53111473571927]
We study the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class.
In particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those with no distributional information, can be arbitrarily slower than oracle rates.
arXiv Detail & Related papers (2023-04-29T02:27:42Z)
- Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that capture every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- A Tale Of Two Long Tails [4.970364068620608]
We identify examples the model is uncertain about and characterize the source of that uncertainty.
We investigate whether the rate of learning in the presence of additional information differs between atypical and noisy examples.
Our results show that well-designed interventions over the course of training can be an effective way to characterize and distinguish between different sources of uncertainty.
arXiv Detail & Related papers (2021-07-27T22:49:59Z)
- Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack [13.28881502612207]
In some scenarios, AI models are trained proprietarily, where neither pre-trained models nor sufficient in-distribution data is publicly available.
We find that the effectiveness of existing techniques is significantly affected by the absence of pre-trained models.
We formulate model extraction attacks into an adaptive framework that captures these factors with deep reinforcement learning.
arXiv Detail & Related papers (2021-04-13T03:46:59Z)
- Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has raised unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to craft a plausible attack against the audited model.
Its utility is showcased on a human face classification task, demonstrating the potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
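As a rough, hypothetical illustration of the mechanism in the entry above (the tiny generator `G`, classifier `f`, dimensions, and loss weights below are stand-ins, not the paper's models): a counterfactual is searched for in the GAN's latent space by trading off the misclassification objective against closeness to the audited input.

```python
import torch

# Hypothetical stand-ins for a pretrained GAN generator and the audited classifier.
G = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.Tanh(),
                        torch.nn.Linear(16, 4))
f = torch.nn.Linear(4, 2)

x0 = torch.randn(4)                # input being audited
target = torch.tensor([1])         # class the counterfactual should receive
z = torch.zeros(8, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    x_cf = G(z)                    # candidate stays on the GAN's (plausible) manifold
    # Multi-objective trade-off: flip the classifier while staying near x0.
    loss = (torch.nn.functional.cross_entropy(f(x_cf).unsqueeze(0), target)
            + 0.1 * torch.norm(x_cf - x0))
    loss.backward()
    opt.step()

print(f(G(z)).argmax().item())     # predicted class of the counterfactual
```

Keeping the search inside the generator's range is what separates such plausible counterfactuals from unconstrained adversarial perturbations.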
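Referring back to the marginalization entry at the top of this list: the toy example below (all distributions are made up) shows the exact computation that such techniques approximate, namely summing out intermediate tokens of an autoregressive factorization p(x1) p(x2|x1) p(x3|x1,x2); done exactly, the cost grows exponentially with the number of marginalized positions.

```python
import numpy as np

# A toy autoregressive model over a binary alphabet (illustrative numbers).
p_x1 = np.array([0.6, 0.4])
p_x2_given_x1 = np.array([[0.7, 0.3],
                          [0.2, 0.8]])                  # indexed [x1][x2]
p_x3_given_x1x2 = np.array([[[0.9, 0.1], [0.5, 0.5]],
                            [[0.3, 0.7], [0.6, 0.4]]])  # indexed [x1][x2][x3]

def p_x3_given_x1(x1):
    """Exact marginal p(x3 | x1): sum out the intermediate token x2."""
    return sum(p_x2_given_x1[x1, x2] * p_x3_given_x1x2[x1, x2]
               for x2 in range(2))

print(p_x3_given_x1(0))  # a length-2 distribution over x3
```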
This list is automatically generated from the titles and abstracts of the papers on this site.