Pitfalls of learning a reward function online
- URL: http://arxiv.org/abs/2004.13654v1
- Date: Tue, 28 Apr 2020 16:58:58 GMT
- Title: Pitfalls of learning a reward function online
- Authors: Stuart Armstrong and Jan Leike and Laurent Orseau and Shane Legg
- Abstract summary: We consider a continual ("one life") learning approach where the agent both learns the reward function and optimises for it at the same time.
This comes with a number of pitfalls, such as the agent deliberately manipulating the learning process towards reward functions that are easier to optimise.
We show that an uninfluenceable process is automatically unriggable, and if the set of possible environments is sufficiently rich, the converse is true too.
- Score: 28.2272248328398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In some agent designs like inverse reinforcement learning an agent needs to
learn its own reward function. Learning the reward function and optimising for
it are typically two different processes, usually performed at different
stages. We consider a continual ("one life") learning approach where the
agent both learns the reward function and optimises for it at the same time. We
show that this comes with a number of pitfalls, such as deliberately
manipulating the learning process in one direction, refusing to learn,
"learning" facts already known to the agent, and making decisions that are
strictly dominated (for all relevant reward functions). We formally introduce
two desirable properties: the first is 'unriggability', which prevents the
agent from steering the learning process in the direction of a reward function
that is easier to optimise. The second is 'uninfluenceability', whereby the
reward-function learning process operates by learning facts about the
environment. We show that an uninfluenceable process is automatically
unriggable, and if the set of possible environments is sufficiently rich, the
converse is true too.
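To make the manipulation pitfall concrete, here is a toy sketch. It is an invented illustration, not the paper's formal construction: the candidate rewards, the belief update, and the numbers are all hypothetical. A myopic agent whose actions influence its own reward-learning process picks whichever action steers the learned reward toward the candidate that is easier to optimise.

```python
# Toy sketch of the "rigging" pitfall: every name and number below is
# invented for illustration and is not taken from the paper.

ACHIEVABLE_VALUE = {"R_hard": 1.0, "R_easy": 10.0}  # value the agent can reach under each candidate reward

def update_belief(p_easy, action):
    """Reward-learning process: probability assigned to R_easy after one step.

    The evidence the learner sees depends on the agent's own action, so the
    process is riggable: the agent can push the belief toward R_easy.
    """
    if action == "ask_leading_question":
        return min(1.0, p_easy + 0.2)  # evidence manufactured by the agent
    return p_easy                      # neutral action leaves the belief unchanged

def myopic_value(p_easy, action):
    """Expected achievable value under the post-update belief."""
    p = update_belief(p_easy, action)
    return p * ACHIEVABLE_VALUE["R_easy"] + (1 - p) * ACHIEVABLE_VALUE["R_hard"]

belief = 0.5
for step in range(3):
    # The agent picks whichever action makes its future reward look best,
    # which here means steering the learning process toward the easy reward.
    action = max(["ask_leading_question", "do_nothing"],
                 key=lambda a: myopic_value(belief, a))
    belief = update_belief(belief, action)
    print(step, action, round(belief, 2))

# An unriggable process makes the expected learned reward independent of the
# agent's policy, so "ask_leading_question" would no longer look attractive.
```

In this sketch the incentive to manipulate comes entirely from the belief update depending on the agent's action; decoupling the two (learning only facts about the environment) is what the paper's uninfluenceability property captures.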
Related papers
- Multi Task Inverse Reinforcement Learning for Common Sense Reward [21.145179791929337]
We show that inverse reinforcement learning, even when it succeeds in training an agent, does not learn a useful reward function.
That is, training a new agent with the learned reward does not reproduce the desired behaviors.
In contrast, multi-task inverse reinforcement learning can be applied to learn a useful reward function.
arXiv Detail & Related papers (2024-02-17T19:49:00Z) - Basis for Intentions: Efficient Inverse Reinforcement Learning using
Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via
Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z) - Curious Exploration and Return-based Memory Restoration for Deep
Reinforcement Learning [2.3226893628361682]
In this paper, we focus on training a single agent to score goals with a binary success/failure reward function.
The proposed method can be utilized to train agents in environments with fairly complex state and action spaces.
arXiv Detail & Related papers (2021-05-02T16:01:34Z) - Replacing Rewards with Examples: Example-Based Policy Search via
Recursive Classification [133.20816939521941]
In the standard Markov decision process formalism, users specify tasks by writing down a reward function.
In many scenarios, the user is unable to describe the task in words or numbers, but can readily provide examples of what the world would look like if the task were solved.
Motivated by this observation, we derive a control algorithm that aims to visit states that have a high probability of leading to successful outcomes, given only examples of successful outcome states.
arXiv Detail & Related papers (2021-03-23T16:19:55Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using
Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Understanding Learned Reward Functions [6.714172005695389]
We investigate techniques for interpreting learned reward functions.
In particular, we apply saliency methods to identify failure modes and predict the robustness of reward functions (see the saliency sketch after this list).
We find that learned reward functions often implement surprising algorithms that rely on contingent aspects of the environment.
arXiv Detail & Related papers (2020-12-10T18:19:48Z) - Off-Policy Adversarial Inverse Reinforcement Learning [0.0]
Adversarial Imitation Learning (AIL) is a class of algorithms in reinforcement learning (RL).
We propose an Off-Policy Adversarial Inverse Reinforcement Learning (Off-policy-AIRL) algorithm that is sample-efficient and achieves good imitation performance.
arXiv Detail & Related papers (2020-05-03T16:51:40Z) - Emergent Real-World Robotic Skills via Unsupervised Off-Policy
Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z) - Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
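As referenced in the "Understanding Learned Reward Functions" entry above, here is a minimal gradient-saliency sketch applied to a learned reward model: which observation features most affect the predicted reward? The model architecture, input size, and scoring below are assumptions for illustration, not the cited paper's setup.

```python
# Minimal gradient-saliency sketch for a learned reward model.
# The reward model and its input dimensionality are hypothetical.

import torch
import torch.nn as nn

# Hypothetical learned reward model: maps an observation vector to a scalar reward.
reward_model = nn.Sequential(
    nn.Linear(8, 32), nn.ReLU(),
    nn.Linear(32, 1),
)

def reward_saliency(obs: torch.Tensor) -> torch.Tensor:
    """Return |d reward / d obs|: a per-feature saliency score."""
    obs = obs.clone().requires_grad_(True)
    reward = reward_model(obs).sum()
    (grad,) = torch.autograd.grad(reward, obs)
    return grad.abs()

obs = torch.randn(8)
print(reward_saliency(obs))

# Features with large saliency dominate the learned reward; if they are
# contingent, task-irrelevant aspects of the environment, that flags a
# likely failure mode of the learned reward function.
```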
This list is automatically generated from the titles and abstracts of the papers on this site.