Forethought and Hindsight in Credit Assignment
- URL: http://arxiv.org/abs/2010.13685v1
- Date: Mon, 26 Oct 2020 16:00:47 GMT
- Title: Forethought and Hindsight in Credit Assignment
- Authors: Veronica Chelu, Doina Precup, Hado van Hasselt
- Abstract summary: We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)evaluated.
- Score: 62.05690959741223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of credit assignment in reinforcement learning and
explore fundamental questions regarding the way in which an agent can best use
additional computation to propagate new information, by planning with internal
models of the world to improve its predictions. Particularly, we work to
understand the gains and peculiarities of planning employed as forethought via
forward models or as hindsight operating with backward models. We establish the
relative merits, limitations and complementary properties of both planning
mechanisms in carefully constructed scenarios. Further, we investigate the best
use of models in planning, primarily focusing on the selection of states in
which predictions should be (re)evaluated. Lastly, we discuss the issue of
model estimation and highlight a spectrum of methods that stretch from explicit
environment-dynamics predictors to more abstract planner-aware models.
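To make the contrast concrete, here is a minimal tabular sketch of the two planning mechanisms, assuming hypothetical `forward_model` and `backward_model` interfaces (an illustration, not the authors' implementation):

```python
GAMMA, ALPHA = 0.9, 0.1  # assumed discount factor and step size

def forethought_update(V, s, forward_model):
    """Forward planning: imagine where s leads, then back up V(s).
    `forward_model(s)` is a hypothetical interface returning (s_next, r)."""
    s_next, r = forward_model(s)
    V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])

def hindsight_update(V, s_next, backward_model):
    """Backward planning: when new information arrives at s_next, assign
    credit to predicted predecessors. `backward_model(s_next)` is a
    hypothetical interface yielding (s_prev, r) pairs."""
    for s_prev, r in backward_model(s_next):
        V[s_prev] += ALPHA * (r + GAMMA * V[s_next] - V[s_prev])
```

The asymmetry is the point: forethought refreshes a state's value from its predicted future, while hindsight lets one surprising observation update many candidate predecessors at once.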
Related papers
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
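As a rough illustration, the empirical analogue of churn between two near-optimal models can be computed as follows (a sketch with assumed inputs, not the paper's code):

```python
import numpy as np

def churn(preds_a, preds_b):
    """Fraction of inputs on which two (near-optimal) models disagree."""
    return float(np.mean(np.asarray(preds_a) != np.asarray(preds_b)))
```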
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - Model Complexity of Program Phases [0.5439020425818999]
In resource limited computing systems, sequence prediction models must operate under tight constraints.
Various models cater to prediction under these conditions, each in some way focused on reducing the cost of implementation.
In practice, these resource-constrained sequence prediction models exhibit a fundamental tradeoff between the cost of implementation and the quality of their predictions.
arXiv Detail & Related papers (2023-10-05T19:50:15Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
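A Monte Carlo sketch of an EPIG-style score, assuming posterior samples come from an ensemble; the tensor layout is illustrative, not the paper's reference implementation:

```python
import numpy as np

def epig(probs_pool, probs_targ):
    """EPIG estimate for one candidate pool point.
    probs_pool: (K, C) class probabilities at the pool point x, one row
                per posterior sample (e.g. ensemble member).
    probs_targ: (K, M, C) class probabilities at M target points x*."""
    K = probs_pool.shape[0]
    # Joint predictive p(y, y* | x, x*) = E_theta[p(y|x,theta) p(y*|x*,theta)]
    joint = np.einsum('kc,kmd->mcd', probs_pool, probs_targ) / K  # (M, C, C)
    marg_y = probs_pool.mean(0)                                   # (C,)
    marg_t = probs_targ.mean(0)                                   # (M, C)
    indep = marg_y[None, :, None] * marg_t[:, None, :]            # (M, C, C)
    # Mutual information between y and y*, averaged over target points
    mi = (joint * (np.log(joint + 1e-12) - np.log(indep + 1e-12))).sum((1, 2))
    return mi.mean()
```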
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Minimal Value-Equivalent Partial Models for Scalable and Robust Planning
in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that capture every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that model only the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z) - A review of predictive uncertainty estimation with machine learning [0.0]
We review the topic of predictive uncertainty estimation with machine learning algorithms.
We discuss the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions.
The review advances our understanding of how to develop new algorithms tailored to users' needs.
arXiv Detail & Related papers (2022-09-17T10:36:30Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
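A simplified sketch of that sampling loop (generic DDPM-style ancestral sampling over whole trajectories; `denoiser` and the noise schedule are assumptions, not the paper's code):

```python
import torch

def plan_by_denoising(denoiser, horizon, dim, n_steps=50):
    """Sample a plan by reversing a diffusion process over a trajectory
    tensor of shape (horizon, state_dim + action_dim)."""
    betas = torch.linspace(1e-4, 2e-2, n_steps)       # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    traj = torch.randn(horizon, dim)                  # start from pure noise
    for t in reversed(range(n_steps)):
        eps = denoiser(traj, t)                       # predict the injected noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        traj = (traj - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:                                     # add noise except at the last step
            traj = traj + torch.sqrt(betas[t]) * torch.randn_like(traj)
    return traj                                       # the denoised plan
```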
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - Counterfactual Plans under Distributional Ambiguity [12.139222986297263]
We study counterfactual plans under model uncertainty, in which the distribution of the model parameters is partially prescribed.
First, we propose an uncertainty quantification tool to compute the lower and upper bounds of the probability of validity for any given counterfactual plan.
We then provide corrective methods to adjust the counterfactual plan to improve the validity measure.
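To make the validity measure concrete, here is a Monte Carlo sketch of the probability the paper's bounds bracket (`sample_theta` and `predict` are hypothetical helpers; the paper itself derives analytic lower and upper bounds):

```python
def validity_probability(x_cf, sample_theta, predict, n_samples=1000):
    """Estimate P(counterfactual plan x_cf stays valid) when model
    parameters are drawn from an assumed distribution."""
    hits = sum(predict(x_cf, sample_theta()) == 1 for _ in range(n_samples))
    return hits / n_samples
```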
arXiv Detail & Related papers (2022-01-29T03:41:47Z) - A Consciousness-Inspired Planning Agent for Model-Based Reinforcement
Learning [104.3643447579578]
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state.
The design allows agents to learn to plan effectively by attending to the relevant objects, leading to better out-of-distribution generalization.
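A rough sketch of the attention-bottleneck idea (an assumed architecture in the spirit of the abstract, not the authors' agent):

```python
import torch
import torch.nn as nn

class SetAttentionBottleneck(nn.Module):
    """Attend to a small set of state objects; only the attended summary
    is passed on to planning."""
    def __init__(self, obj_dim, n_slots=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_slots, obj_dim))

    def forward(self, objects):                        # objects: (N, obj_dim)
        attn = torch.softmax(self.queries @ objects.T, dim=-1)  # (n_slots, N)
        return attn @ objects                          # (n_slots, obj_dim)
```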
arXiv Detail & Related papers (2021-06-03T19:35:19Z) - Goal-Directed Planning for Habituated Agents by Active Inference Using a
Variational Recurrent Neural Network [5.000272778136268]
This study shows that the predictive coding (PC) and active inference (AIF) frameworks can achieve better generalization by learning a prior distribution in a low-dimensional latent state space.
In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound.
Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data.
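A generic sketch of the evidence lower bound being maximized (a plain Gaussian latent-variable model, not the paper's variational RNN; `decoder` is an assumed network mapping z to a reconstruction of x):

```python
import torch

def elbo(x, q_mu, q_logvar, decoder):
    """ELBO = reconstruction term - KL(q(z|x) || N(0, I))."""
    std = torch.exp(0.5 * q_logvar)
    z = q_mu + std * torch.randn_like(std)   # reparameterized sample from q(z|x)
    recon = decoder(z)
    log_lik = -((x - recon) ** 2).sum()      # Gaussian log-likelihood, up to constants and scale
    kl = 0.5 * (q_mu ** 2 + std ** 2 - q_logvar - 1).sum()
    return log_lik - kl                      # maximize w.r.t. latents and weights
```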
arXiv Detail & Related papers (2020-05-27T06:43:59Z) - Bootstrapped model learning and error correction for planning with
uncertainty in model-based RL [1.370633147306388]
A natural aim is to learn a model that accurately reflects the dynamics of the environment.
This paper explores the problem of model misspecification through uncertainty-aware reinforcement learning agents.
We propose a bootstrapped multi-headed neural network that learns the distribution of future states and rewards.
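A compact sketch of the multi-headed idea (assumed architecture, not the paper's network): several heads share a torso, each head is trained on its own bootstrap resample, and head disagreement acts as the uncertainty signal:

```python
import torch
import torch.nn as nn

class BootstrappedDynamicsModel(nn.Module):
    """K heads over a shared torso; each head predicts next state and reward."""
    def __init__(self, state_dim, n_heads=5, hidden=64):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, state_dim + 1) for _ in range(n_heads)]
        )

    def forward(self, s):
        h = self.torso(s)
        preds = torch.stack([head(h) for head in self.heads])  # (K, ..., state_dim + 1)
        return preds.mean(0), preds.std(0)   # point prediction and epistemic spread
```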
arXiv Detail & Related papers (2020-04-15T15:41:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.