Informativeness of Reward Functions in Reinforcement Learning
- URL: http://arxiv.org/abs/2402.07019v1
- Date: Sat, 10 Feb 2024 18:36:42 GMT
- Title: Informativeness of Reward Functions in Reinforcement Learning
- Authors: Rati Devidze, Parameswaran Kamalaruban, Adish Singla
- Abstract summary: We study the problem of designing informative reward functions so that the designed rewards speed up the agent's convergence.
Existing works have considered several different reward design formulations.
We propose a reward informativeness criterion that adapts w.r.t. the agent's current policy and can be optimized under specified structural constraints.
- Score: 34.40155383189179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reward functions are central in specifying the task we want a reinforcement
learning agent to perform. Given a task and desired optimal behavior, we study
the problem of designing informative reward functions so that the designed
rewards speed up the agent's convergence. In particular, we consider
expert-driven reward design settings where an expert or teacher seeks to
provide informative and interpretable rewards to a learning agent. Existing
works have considered several different reward design formulations; however,
the key challenge is formulating a reward informativeness criterion that adapts
w.r.t. the agent's current policy and can be optimized under specified
structural constraints to obtain interpretable rewards. In this paper, we
propose a novel reward informativeness criterion, a quantitative measure that
captures how the agent's current policy will improve if it receives rewards
from a specific reward function. We theoretically showcase the utility of the
proposed informativeness criterion for adaptively designing rewards for an
agent. Experimental results on two navigation tasks demonstrate the
effectiveness of our adaptive reward informativeness criterion.
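To make the criterion concrete, below is a minimal tabular sketch of one plausible instantiation: score a candidate reward by how much one step of greedy policy improvement under it raises the agent's value under the true reward. The function names, the one-step-improvement choice, and the uniform state averaging are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def informativeness(R_hat, pi, P, R_true, gamma=0.95):
    """Hypothetical informativeness score for a candidate reward R_hat
    (shape S x A): the gain in true value from one greedy improvement
    step taken under R_hat. pi is a stochastic policy (S x A) and P the
    transition tensor (S x A x S). A sketch, not the paper's criterion."""
    nS, nA = R_hat.shape

    def q_values(R, pi):
        # Policy evaluation: V = (I - gamma * P_pi)^-1 r_pi, then Q from V.
        r_pi = np.einsum("sa,sa->s", pi, R)
        P_pi = np.einsum("sa,san->sn", pi, P)
        V = np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)
        return R + gamma * np.einsum("san,n->sa", P, V)

    def true_value(pi):
        r_pi = np.einsum("sa,sa->s", pi, R_true)
        P_pi = np.einsum("sa,san->sn", pi, P)
        return np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)

    # One greedy policy-improvement step with respect to the candidate reward.
    pi_new = np.zeros_like(pi)
    pi_new[np.arange(nS), q_values(R_hat, pi).argmax(axis=1)] = 1.0

    # Informativeness: average gain in true value across states.
    return float(np.mean(true_value(pi_new) - true_value(pi)))
```

A designer could evaluate this score over a constrained family of candidates (e.g., sparse rewards) and give the agent the maximizer, re-scoring as the agent's policy changes.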
Related papers
- Behavior Alignment via Reward Function Optimization [23.92721220310242]
We introduce a new framework that integrates auxiliary rewards reflecting a designer's domain knowledge with the environment's primary rewards.
We evaluate our method's efficacy on a diverse set of tasks, from small-scale experiments to high-dimensional control challenges.
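A toy outer loop in this spirit, as a hedged sketch: train with a blended reward r_env + w * r_aux, but select the weight by the primary return alone, so a misleading auxiliary reward cannot dominate. Both callables and the grid search are hypothetical stand-ins; the paper optimizes the combination directly rather than searching by hand.

```python
import numpy as np

def tune_aux_weight(train_agent, primary_return, weights=(0.0, 0.01, 0.1, 1.0)):
    """Pick the auxiliary-reward weight whose trained agent scores best
    on the environment's primary reward. train_agent and primary_return
    are hypothetical hooks into an existing RL pipeline."""
    best_w, best_ret = None, -np.inf
    for w in weights:
        agent = train_agent(aux_weight=w)  # inner run on r_env + w * r_aux
        ret = primary_return(agent)        # judged by primary reward only
        if ret > best_ret:
            best_w, best_ret = w, ret
    return best_w, best_ret
```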
arXiv Detail & Related papers (2023-10-29T13:45:07Z)
- Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity [114.88145406445483]
Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications.
In practice the choice of reward function can be crucial for good results.
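The standard form of such reward engineering is potential-based shaping (Ng et al., 1999), which preserves optimal policies while often reducing sample complexity; a minimal sketch, with the potential function left to the designer:

```python
def shaped_reward(r, s, s_next, potential, gamma=0.99, done=False):
    """Potential-based shaping: r'(s, a, s') = r + gamma * phi(s') - phi(s).
    With a zero terminal potential, any phi leaves the set of optimal
    policies unchanged; a well-chosen phi can sharply speed up learning."""
    phi_next = 0.0 if done else potential(s_next)
    return r + gamma * phi_next - potential(s)
```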
arXiv Detail & Related papers (2022-10-18T04:21:25Z)
- Automatic Reward Design via Learning Motivation-Consistent Intrinsic Rewards [46.068337522093096]
We introduce the concept of motivation which captures the underlying goal of maximizing certain rewards.
Our method performs better than the state-of-the-art methods in handling problems of delayed reward, exploration, and credit assignment.
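As a concrete stand-in for a learned intrinsic reward, the sketch below uses a simple count-based bonus; the paper instead learns a motivation-consistent intrinsic reward rather than hand-coding one.

```python
from collections import defaultdict

class CountBonus:
    """Count-based exploration bonus: beta / sqrt(visit count). Purely
    illustrative; states must be hashable."""
    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)

    def __call__(self, state):
        self.counts[state] += 1
        return self.beta / self.counts[state] ** 0.5

def total_reward(r_extrinsic, bonus, state):
    # The agent trains on the extrinsic reward plus the intrinsic bonus,
    # which can mitigate delayed-reward and exploration problems.
    return r_extrinsic + bonus(state)
```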
arXiv Detail & Related papers (2022-07-29T14:52:02Z)
- Admissible Policy Teaching through Reward Design [32.39785256112934]
We study reward design strategies for incentivizing a reinforcement learning agent to adopt a policy from a set of admissible policies.
The goal of the reward designer is to modify the underlying reward function cost-efficiently while ensuring that any approximately optimal deterministic policy under the new reward function is admissible.
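A hedged sketch of the two quantities such a designer trades off, in a tabular MDP: the cost of the reward modification and the optimality margin of a target admissible policy under the new reward. The L2 cost and all names are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def cost_and_margin(R, R_new, pi_target, P, gamma=0.95):
    """R, R_new: rewards (S x A); pi_target: one action per state;
    P: transitions (S x A x S). Returns the modification cost and the
    margin by which pi_target's actions dominate all alternatives under
    R_new (a positive margin makes pi_target optimal)."""
    nS = R.shape[0]
    idx = np.arange(nS)
    # Evaluate the target policy under the modified reward.
    V = np.linalg.solve(np.eye(nS) - gamma * P[idx, pi_target],
                        R_new[idx, pi_target])
    Q = R_new + gamma * (P @ V)
    q_target = Q[idx, pi_target]
    Q_alt = Q.copy()
    Q_alt[idx, pi_target] = -np.inf
    margin = float((q_target - Q_alt.max(axis=1)).min())
    cost = float(np.linalg.norm(R_new - R))
    return cost, margin
```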
arXiv Detail & Related papers (2022-01-06T18:49:57Z)
- On the Expressivity of Markov Reward [89.96685777114456]
This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform.
We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories.
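Task notion (1) can be checked by brute force in tiny MDPs, as sketched below: a Markov reward realizes a set of acceptable policies exactly when the deterministic policies optimal under it coincide with that set. Purely illustrative; names and tolerances are assumptions.

```python
import numpy as np
from itertools import product

def realizes_acceptable_set(R, P, acceptable, gamma=0.95):
    """R: reward (S x A); P: transitions (S x A x S); acceptable: a
    collection of action tuples, one action per state. Enumerates all
    deterministic policies, so only viable for very small MDPs."""
    nS, nA = R.shape

    # Optimal values via value iteration.
    V = np.zeros(nS)
    for _ in range(5000):
        V = (R + gamma * (P @ V)).max(axis=1)

    def policy_value(pi):
        idx = np.arange(nS)
        return np.linalg.solve(np.eye(nS) - gamma * P[idx, pi], R[idx, pi])

    optimal = {pi for pi in product(range(nA), repeat=nS)
               if np.all(policy_value(np.array(pi)) >= V - 1e-6)}
    return optimal == {tuple(pi) for pi in acceptable}
```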
arXiv Detail & Related papers (2021-11-01T12:12:16Z)
- Information Directed Reward Learning for Reinforcement Learning [64.33774245655401]
Information Directed Reward Learning (IDRL) learns a model of the reward function that allows standard RL algorithms to achieve high expected return with as few expert queries as possible.
In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types.
We support our findings with extensive evaluations in multiple environments and with different types of queries.
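A loose sketch of the idea behind such query selection: ask the expert the query on which an ensemble of reward models disagrees most, using variance as a crude proxy for the information gain that IDRL actually computes about return differences between plausible optimal policies. The ensemble_predict hook is hypothetical.

```python
import numpy as np

def pick_query(queries, ensemble_predict):
    """ensemble_predict(q) returns one predicted answer per reward model;
    the query with the highest disagreement (variance) is asked next."""
    scores = [np.var(ensemble_predict(q)) for q in queries]
    return queries[int(np.argmax(scores))]
```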
arXiv Detail & Related papers (2021-02-24T18:46:42Z)
- Understanding Learned Reward Functions [6.714172005695389]
We investigate techniques for interpreting learned reward functions.
In particular, we apply saliency methods to identify failure modes and predict the robustness of reward functions.
We find that learned reward functions often implement surprising algorithms that rely on contingent aspects of the environment.
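One saliency method in this spirit is input-gradient saliency on a learned reward model; a minimal sketch, assuming a differentiable PyTorch model mapping observations to scalar rewards:

```python
import torch

def reward_saliency(reward_model, obs):
    """Absolute input gradients of the predicted reward: which features
    of the observation most influence the reward model's output."""
    obs = obs.clone().requires_grad_(True)
    reward_model(obs).sum().backward()
    return obs.grad.abs()
```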
arXiv Detail & Related papers (2020-12-10T18:19:48Z)
- Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping [71.214923471669]
Reward shaping is an effective technique for incorporating domain knowledge into reinforcement learning (RL).
In this paper, we consider the problem of adaptively utilizing a given shaping reward function.
Experiments in sparse-reward cartpole and MuJoCo environments show that our algorithms can fully exploit beneficial shaping rewards.
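In schematic form, the agent trains on the environment reward plus a weighted shaping term, and the utilization weight is adapted by whether shaping actually helps the true return. In the paper the weight is state-action-dependent and learned by bi-level optimization; the scalar update below is only a sketch.

```python
def utilized_reward(r_env, f_shaping, z):
    # Training signal: environment reward plus a weighted shaping term.
    return r_env + z * f_shaping

def update_weight(z, gain, lr=0.05):
    """Crude outer step: raise z when shaped training improved the true
    return (gain > 0), lower it otherwise; clipping at zero lets a
    harmful shaping reward be ignored entirely."""
    return max(0.0, z + lr * (1.0 if gain > 0 else -1.0))
```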
arXiv Detail & Related papers (2020-11-05T05:34:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.