Finding Useful Predictions by Meta-gradient Descent to Improve
Decision-making
- URL: http://arxiv.org/abs/2111.11212v1
- Date: Thu, 18 Nov 2021 20:17:07 GMT
- Title: Finding Useful Predictions by Meta-gradient Descent to Improve
Decision-making
- Authors: Alex Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski
- Abstract summary: We focus on predictions expressed as General Value Functions: temporally extended estimates of the accumulation of a future signal.
One challenge is determining, from the infinitely many predictions that the agent could possibly make, which might support decision-making.
By learning, rather than manually specifying these predictions, we enable the agent to identify useful predictions in a self-supervised manner.
- Score: 1.384055225262046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In computational reinforcement learning, a growing body of work seeks to
express an agent's model of the world through predictions about future
sensations. In this manuscript we focus on predictions expressed as General
Value Functions: temporally extended estimates of the accumulation of a future
signal. One challenge is determining, from the infinitely many predictions that the agent could possibly make, which might support decision-making. In this
work, we contribute a meta-gradient descent method by which an agent can
directly specify what predictions it learns, independent of designer
instruction. To that end, we introduce a partially observable domain suited to
this investigation. We then demonstrate that through interaction with the
environment an agent can independently select predictions that resolve the
partial-observability, resulting in performance similar to expertly chosen
value functions. By learning, rather than manually specifying these
predictions, we enable the agent to identify useful predictions in a
self-supervised manner, taking a step towards truly autonomous systems.
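To make the General Value Function formulation concrete, below is a minimal sketch (not the authors' implementation) of a single GVF learned with linear TD(0): the prediction target is the expected discounted accumulation of a cumulant signal. In the paper's method, the choice of which cumulants and continuation functions to predict would itself be adapted by meta-gradient descent from the control learner's performance rather than being fixed by the designer; here that choice is hard-coded. The class name, feature construction, and step size are illustrative assumptions.

```python
import numpy as np

class GVF:
    """Minimal General Value Function sketch: a linear TD(0) learner that
    estimates the expected discounted accumulation of a cumulant signal.
    (Illustrative only; not the implementation from the paper.)"""

    def __init__(self, num_features, step_size=0.1):
        self.w = np.zeros(num_features)  # linear weights over state features
        self.alpha = step_size

    def predict(self, x):
        # Prediction = w^T x: estimated discounted sum of future cumulants.
        return float(self.w @ x)

    def update(self, x, cumulant, gamma, x_next):
        # One-step TD(0) update toward cumulant + gamma * prediction(next state).
        td_error = cumulant + gamma * self.predict(x_next) - self.predict(x)
        self.w += self.alpha * td_error * x
        return td_error

# Example: a GVF predicting the discounted accumulation of one sensor reading.
rng = np.random.default_rng(0)
gvf = GVF(num_features=8)
x = rng.random(8)
for _ in range(1000):
    x_next = rng.random(8)   # stand-in for the next observation's features
    cumulant = x_next[0]     # the signal whose future accumulation we predict
    gvf.update(x, cumulant, gamma=0.9, x_next=x_next)
    x = x_next
```

In the paper's setting, the agent would additionally learn which cumulant (hard-coded above as the first feature) is worth predicting, so that the learned predictions help resolve the partial observability of the task.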
Related papers
- Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy.
As predictions shape collective outcomes, social welfare arises naturally as a metric of concern.
We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z) - What Should I Know? Using Meta-gradient Descent for Predictive Feature
Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining, from the infinitely many predictions that the agent could possibly make, which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z) - Why Did This Model Forecast This Future? Closed-Form Temporal Saliency
Towards Causal Explanations of Probabilistic Forecasts [20.442850522575213]
We build upon a general definition of information-theoretic saliency grounded in human perception.
We propose to express the saliency of an observed window in terms of the differential entropy of the resulting predicted future distribution.
We empirically demonstrate how our framework can recover salient observed windows from head pose features for the sample task of speaking-turn forecasting.
arXiv Detail & Related papers (2022-06-01T18:00:04Z) - You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory
Prediction [52.442129609979794]
Recent deep learning approaches for trajectory prediction show promising performance.
It remains unclear which features such black-box models actually learn to use for making predictions.
This paper proposes a procedure that quantifies the contributions of different cues to model performance.
arXiv Detail & Related papers (2021-10-11T14:24:15Z) - Deceptive Decision-Making Under Uncertainty [25.197098169762356]
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks.
We model the agent's behavior as a Markov decision process and consider a setting in which the agent aims to reach one of multiple potential goals.
We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies.
arXiv Detail & Related papers (2021-09-14T14:56:23Z) - Test-time Collective Prediction [73.74982509510961]
In machine learning, multiple parties may want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z) - Heterogeneous-Agent Trajectory Forecasting Incorporating Class
Uncertainty [54.88405167739227]
We present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities.
We additionally present PUP, a new challenging real-world autonomous driving dataset.
We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty.
arXiv Detail & Related papers (2021-04-26T10:28:34Z) - On complementing end-to-end human motion predictors with planning [31.025766804649464]
High-capacity end-to-end approaches for human motion prediction can represent subtle nuances in human behavior, but struggle with robustness to out-of-distribution inputs and tail events.
Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions.
arXiv Detail & Related papers (2021-03-09T19:02:45Z) - When Does Uncertainty Matter?: Understanding the Impact of Predictive
Uncertainty in ML Assisted Decision Making [68.19284302320146]
We carry out user studies to assess how people with differing levels of expertise respond to different types of predictive uncertainty.
We found that showing posterior predictive distributions led to smaller disagreements with the ML model's predictions.
This suggests that posterior predictive distributions can serve as useful decision aids, but they should be used with caution, taking into account the type of distribution and the expertise of the human.
arXiv Detail & Related papers (2020-11-12T02:23:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.