Multi Task Inverse Reinforcement Learning for Common Sense Reward
- URL: http://arxiv.org/abs/2402.11367v1
- Date: Sat, 17 Feb 2024 19:49:00 GMT
- Title: Multi Task Inverse Reinforcement Learning for Common Sense Reward
- Authors: Neta Glazer, Aviv Navon, Aviv Shamsian, Ethan Fetaya
- Abstract summary: We show that inverse reinforcement learning, even when it succeeds in training an agent, does not learn a useful reward function.
That is, training a new agent with the learned reward does not impart the desired behaviors.
We then show that multi-task inverse reinforcement learning can be applied to learn a useful reward function.
- Score: 21.145179791929337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the challenges in applying reinforcement learning in a complex
real-world environment lies in providing the agent with a sufficiently detailed
reward function. Any misalignment between the reward and the desired behavior
can result in unwanted outcomes. This may lead to issues like "reward hacking"
where the agent maximizes rewards by unintended behavior. In this work, we
propose to disentangle the reward into two distinct parts: a simple
task-specific reward, outlining the particulars of the task at hand, and an
unknown common-sense reward, indicating the expected behavior of the agent
within the environment. We then explore how this common-sense reward can be
learned from expert demonstrations. We first show that inverse reinforcement
learning, even when it succeeds in training an agent, does not learn a useful
reward function. That is, training a new agent with the learned reward does not
impart the desired behaviors. We then demonstrate that this problem can be
solved by training simultaneously on multiple tasks. That is, multi-task
inverse reinforcement learning can be applied to learn a useful reward
function.
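The decomposition described in the abstract can be made concrete with a short sketch. The code below is illustrative only and is not the authors' implementation: it assumes a shared, learned common-sense reward network trained discriminatively (GAIL/AIRL-style) against expert demonstrations pooled from several tasks, which is then added to a known task-specific reward when training a new agent. All names, dimensions, and the random placeholder batches are assumptions made for the example.
```python
# Minimal sketch of the task-reward / common-sense-reward decomposition.
# Hypothetical shapes and data; real usage would draw batches from expert
# demonstrations and from the current policy's rollouts on each task.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, N_TASKS = 8, 2, 3

class CommonSenseReward(nn.Module):
    """Shared learned reward r_cs(s, a), reused across every task."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.Tanh(), nn.Linear(64, 1)
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def task_reward(task_id, state):
    """Stand-in for the simple, known task-specific reward of task `task_id`."""
    return -(state[:, task_id] ** 2)  # e.g. "drive feature task_id to zero"

r_cs = CommonSenseReward()
opt = torch.optim.Adam(r_cs.parameters(), lr=1e-3)

for step in range(200):
    loss = 0.0
    for task_id in range(N_TASKS):
        # Placeholder batches standing in for per-task expert and policy data.
        expert_s, expert_a = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM)
        policy_s, policy_a = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM)
        # Discriminator-style objective: the shared common-sense reward should
        # score expert behaviour above policy behaviour on *every* task.
        loss = loss + (
            nn.functional.softplus(-r_cs(expert_s, expert_a)).mean()
            + nn.functional.softplus(r_cs(policy_s, policy_a)).mean()
        )
    opt.zero_grad()
    loss.backward()
    opt.step()

# Training a new agent on task k then optimises the sum of the known task
# reward and the learned common-sense term:
s, a = torch.randn(4, STATE_DIM), torch.randn(4, ACTION_DIM)
total_reward = task_reward(0, s) + r_cs(s, a).detach()
```
Because the common-sense term must explain expert behaviour across all tasks simultaneously, it cannot absorb any single task's objective, which is the intuition behind why the multi-task setting yields a transferable reward.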
Related papers
- Reward Shaping for Happier Autonomous Cyber Security Agents [0.276240219662896]
One of the most promising directions uses deep reinforcement learning to train autonomous agents in computer network defense tasks.
This work studies the impact of the reward signal that is provided to the agents when training for this task.
arXiv Detail & Related papers (2023-10-20T15:04:42Z)
- Tiered Reward: Designing Rewards for Specification and Fast Learning of Desired Behavior [13.409265335314169]
Tiered Reward is a class of environment-independent reward functions.
We show it is guaranteed to induce policies that are optimal according to our preference relation.
arXiv Detail & Related papers (2022-12-07T15:55:00Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions [124.11520774395748]
Reinforcement learning practitioners often utilize complex reward functions that encourage physically plausible behaviors.
We propose substituting complex reward functions with "style rewards" learned from a dataset of motion capture demonstrations.
A learned style reward can be combined with an arbitrary task reward to train policies that perform tasks using naturalistic strategies.
arXiv Detail & Related papers (2022-03-28T21:17:36Z)
- On the Expressivity of Markov Reward [89.96685777114456]
This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform.
We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories.
arXiv Detail & Related papers (2021-11-01T12:12:16Z)
- Curious Exploration and Return-based Memory Restoration for Deep Reinforcement Learning [2.3226893628361682]
In this paper, we focus on training a single agent to score goals with binary success/failure reward function.
The proposed method can be utilized to train agents in environments with fairly complex state and action spaces.
arXiv Detail & Related papers (2021-05-02T16:01:34Z)
- Mutual Information State Intrinsic Control [91.38627985733068]
Intrinsically motivated RL attempts to remove this constraint by defining an intrinsic reward function.
Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself.
We mathematically formalize this reward as the mutual information between the agent state and the surrounding state.
arXiv Detail & Related papers (2021-03-15T03:03:36Z)
- Deceptive Reinforcement Learning for Privacy-Preserving Planning [8.950168559003991]
Reinforcement learning is the problem of finding a behaviour policy based on rewards received from exploratory behaviour.
A key ingredient in reinforcement learning is a reward function, which determines how much reward (negative or positive) is given and when.
We present two models for solving the problem of privacy-preserving reinforcement learning.
arXiv Detail & Related papers (2021-02-05T06:50:04Z)
- Semi-supervised reward learning for offline reinforcement learning [71.6909757718301]
Training agents usually requires reward functions, but rewards are seldom available in practice and their engineering is challenging and laborious.
We propose semi-supervised learning algorithms that learn from limited annotations and incorporate unlabelled data.
In our experiments with a simulated robotic arm, we greatly improve upon behavioural cloning and closely approach the performance achieved with ground truth rewards.
arXiv Detail & Related papers (2020-12-12T20:06:15Z)
- Pitfalls of learning a reward function online [28.2272248328398]
We consider a continual ("one life") learning approach where the agent both learns the reward function and optimises for it at the same time.
This comes with a number of pitfalls, such as the agent deliberately manipulating the learning process in one direction.
We show that an uninfluenceable process is automatically unriggable, and if the set of possible environments is sufficiently rich, the converse is true too.
arXiv Detail & Related papers (2020-04-28T16:58:58Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)