Explaining an Agent's Future Beliefs through Temporally Decomposing Future Reward Estimators
- URL: http://arxiv.org/abs/2408.08230v1
- Date: Thu, 15 Aug 2024 15:56:15 GMT
- Title: Explaining an Agent's Future Beliefs through Temporally Decomposing Future Reward Estimators
- Authors: Mark Towers, Yali Du, Christopher Freeman, Timothy J. Norman
- Abstract summary: We modify an agent's future reward estimator to predict its next N expected rewards, referred to as Temporal Reward Decomposition (TRD).
We can: estimate when an agent may expect to receive a reward, the value of the reward and the agent's confidence in receiving it; measure an input feature's temporal importance to the agent's action decisions; and predict the influence of different actions on future rewards.
We show that DQN agents trained on Atari environments can be efficiently retrained to incorporate TRD with minimal impact on performance.
- Score: 5.642469620531317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Future reward estimation is a core component of reinforcement learning agents; i.e., Q-value and state-value functions, predicting an agent's sum of future rewards. Their scalar output, however, obfuscates when or what individual future rewards an agent may expect to receive. We address this by modifying an agent's future reward estimator to predict their next N expected rewards, referred to as Temporal Reward Decomposition (TRD). This unlocks novel explanations of agent behaviour. Through TRD we can: estimate when an agent may expect to receive a reward, the value of the reward and the agent's confidence in receiving it; measure an input feature's temporal importance to the agent's action decisions; and predict the influence of different actions on future rewards. Furthermore, we show that DQN agents trained on Atari environments can be efficiently retrained to incorporate TRD with minimal impact on performance.
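The core idea of TRD, recombining per-step expected rewards into the usual scalar value estimate, can be sketched as follows. This is a minimal illustration of the decomposition described in the abstract, not the authors' implementation: the network architecture is elided, the `decomposed` vector (N per-step expected rewards plus a discounted tail term) and the recombination formula are assumptions based on the standard discounted-return definition.

```python
import numpy as np

GAMMA = 0.99  # discount factor
N = 4         # number of individually predicted future rewards

def q_from_decomposition(decomposed, gamma=GAMMA):
    """Recombine per-step expected rewards into a scalar Q-value.

    `decomposed` holds the next N expected rewards followed by a tail
    term summarizing all value beyond the N-step horizon:
        Q = sum_{i=0}^{N-1} gamma^i E[r_{t+i}] + gamma^N * tail
    """
    rewards, tail = decomposed[:-1], decomposed[-1]
    discounts = gamma ** np.arange(len(rewards))
    return float(np.dot(discounts, rewards) + gamma ** len(rewards) * tail)

# Example: the agent expects a reward of 1.0 two steps from now,
# nothing otherwise, and no value beyond the horizon.
decomposed = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # N rewards + tail
print(q_from_decomposition(decomposed))  # gamma^2 * 1.0
```

Because the scalar Q-value is recovered by a fixed discounted sum, the decomposed head exposes *when* reward is expected (here, two steps ahead) without changing what the agent optimizes, which is what makes the retraining of existing DQN agents cheap.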
Related papers
- The Value of Reward Lookahead in Reinforcement Learning [26.319716324907198]
We analyze the value of future reward information through the lens of competitive analysis.
We characterize the worst-case reward distribution and derive exact ratios for the worst-case reward expectations.
Our results cover the full spectrum between observing the immediate rewards before acting to observing all the rewards before the interaction starts.
arXiv Detail & Related papers (2024-03-18T10:19:52Z) - Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [57.627352949446625]
We consider a variant of the multi-armed bandit problem.
Specifically, the arms are strategic agents who can improve their rewards or absorb them.
We identify a class of MAB algorithms which satisfy a collection of properties and show that they lead to mechanisms that incentivize top level performance at equilibrium.
arXiv Detail & Related papers (2023-12-13T06:54:49Z) - Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards [4.742123770879715]
In practice, incentive providers often cannot observe the reward realizations of incentivized agents.
This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal.
We introduce an estimator whose only input is the history of principal's incentives and agent's choices.
arXiv Detail & Related papers (2023-08-13T08:12:01Z) - Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark [61.43264961005614]
We develop a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse scenarios.
We evaluate agents' tendencies to be power-seeking, cause disutility, and commit ethical violations.
Our results show that agents can act both competently and morally, so concrete progress can be made in machine ethics.
arXiv Detail & Related papers (2023-04-06T17:59:03Z) - A Neural Active Inference Model of Perceptual-Motor Learning [62.39667564455059]
The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience.
In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans.
We present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy.
arXiv Detail & Related papers (2022-11-16T20:00:38Z) - Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning [19.788336796981685]
We propose a novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL).
Our main idea is to design the multi-action-branch reward estimation and policy-weighted reward aggregation for stabilized training.
The superiority of DRE-MARL is demonstrated on benchmark multi-agent scenarios, where it outperforms SOTA baselines in both effectiveness and robustness.
arXiv Detail & Related papers (2022-10-14T08:31:45Z) - What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining from the infinitely many predictions that the agent could possibly make which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z) - Experimental Evidence that Empowerment May Drive Exploration in Sparse-Reward Environments [0.0]
An intrinsic reward function based on the principle of empowerment assigns rewards proportional to the amount of control the agent has over its own sensors.
We implement a variation on a recently proposed intrinsically motivated agent, which we refer to as the 'curious' agent, and an empowerment-inspired agent.
We compare the performance of both agents to that of an advantage actor-critic baseline in four sparse reward grid worlds.
arXiv Detail & Related papers (2021-07-14T22:52:38Z) - Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty [54.88405167739227]
We present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities.
We additionally present PUP, a new challenging real-world autonomous driving dataset.
We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty.
arXiv Detail & Related papers (2021-04-26T10:28:34Z) - Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.