Universal Successor Features for Transfer Reinforcement Learning
- URL: http://arxiv.org/abs/2001.04025v1
- Date: Sun, 5 Jan 2020 03:41:06 GMT
- Title: Universal Successor Features for Transfer Reinforcement Learning
- Authors: Chen Ma, Dylan R. Ashley, Junfeng Wen, Yoshua Bengio
- Abstract summary: We propose Universal Successor Features (USFs) to capture the underlying dynamics of the environment.
We show that learning USFs is compatible with any RL algorithm that learns state values using a temporal difference method.
- Score: 77.27304854836645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer in Reinforcement Learning (RL) refers to the idea of applying
knowledge gained from previous tasks to solving related tasks. Learning a
universal value function (Schaul et al., 2015), which generalizes over goals
and states, has previously been shown to be useful for transfer. However,
successor features are believed to be more suitable than values for transfer
(Dayan, 1993; Barreto et al., 2017), even though they cannot directly generalize
to new goals. In this paper, we propose (1) Universal Successor Features (USFs)
to capture the underlying dynamics of the environment while allowing
generalization to unseen goals and (2) a flexible end-to-end model of USFs that
can be trained by interacting with the environment. We show that learning USFs
is compatible with any RL algorithm that learns state values using a temporal
difference method. Our experiments in a simple gridworld and with two MuJoCo
environments show that USFs can greatly accelerate training when learning
multiple tasks and can effectively transfer knowledge to new tasks.
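To make the decomposition behind USFs concrete, below is a minimal tabular sketch of the idea: successor features psi(s, g) accumulate expected discounted state features under a goal-conditioned policy and are learned with a TD(0) update, while the value is recovered as V(s, g) = psi(s, g) . w(g). The one-hot features, the fixed chain-MDP policy, the tabular storage, and all names (td_update, value) and hyperparameters are illustrative assumptions; the paper's actual model is a neural network over states and goals trained end-to-end.

```python
import numpy as np

# Minimal tabular sketch of Universal Successor Features (USFs).
# Assumptions (not from the paper): one-hot state features, a fixed policy,
# and tabular psi; the paper uses a neural network trained end-to-end.

n_states, n_goals = 5, 2
gamma, alpha = 0.9, 0.1
phi = np.eye(n_states)                        # state features phi(s): one-hot

# psi[s, g] approximates the expected discounted sum of future features
# from state s when pursuing goal g.
psi = np.zeros((n_states, n_goals, n_states))
w = np.random.randn(n_goals, n_states) * 0.1  # goal-specific reward weights

def td_update(s, s_next, g):
    """One TD(0) update of the successor features for goal g."""
    target = phi[s] + gamma * psi[s_next, g]
    psi[s, g] += alpha * (target - psi[s, g])

def value(s, g):
    """Recover the state value via the decomposition V(s,g) = psi(s,g) . w(g)."""
    return psi[s, g] @ w[g]

# Toy rollouts on a chain MDP (s -> s+1) while pursuing goal 0.
for _ in range(500):
    for s in range(n_states - 1):
        td_update(s, s + 1, g=0)

print(value(0, 0))  # value of state 0 for goal 0 under the learned psi and w
```

Because psi is learned independently of the reward weights w, transferring to a new goal only requires new weights (or, with function approximation, evaluating psi at the new goal); this separation is what lets USFs accelerate multi-task training, as the abstract reports.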
Related papers
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- World Value Functions: Knowledge Representation for Learning and Planning [14.731788603429774]
We propose world value functions (WVFs), a type of goal-oriented general value function.
WVFs represent how to solve not just a given task, but any other goal-reaching task in an agent's environment.
We show that WVFs can be learned faster than regular value functions, while their ability to infer the environment's dynamics can be used to integrate learning and planning methods.
arXiv Detail & Related papers (2022-06-23T18:49:54Z)
- World Value Functions: Knowledge Representation for Multitask Reinforcement Learning [14.731788603429774]
We propose world value functions (WVFs), which are a type of general value function with mastery of the world.
We equip the agent with an internal goal space defined as all the world states where it experiences a terminal transition.
We show that for tasks in the same world, a pretrained agent that has learned any WVF can then infer the policy and value function for any new task directly from its rewards.
arXiv Detail & Related papers (2022-05-18T09:45:14Z)
- A Framework of Meta Functional Learning for Regularising Knowledge Transfer [89.74127682599898]
This work proposes a novel framework of Meta Functional Learning (MFL) by meta-learning a generalisable functional model from data-rich tasks.
The MFL framework computes meta-knowledge on functional regularisation that generalises to different learning tasks; with this regularisation, functional training on limited labelled data promotes the learning of more discriminative functions.
arXiv Detail & Related papers (2022-03-28T15:24:09Z)
- Omni-Training for Data-Efficient Deep Learning [80.28715182095975]
Recent advances reveal that a properly pre-trained model confers an important property: transferability.
However, a tight combination of pre-training and meta-training cannot achieve both kinds of transferability.
This motivates the proposed Omni-Training framework towards data-efficient deep learning.
arXiv Detail & Related papers (2021-10-14T16:30:36Z)
- Fractional Transfer Learning for Deep Model-Based Reinforcement Learning [0.966840768820136]
Reinforcement learning (RL) is well known to require large amounts of data for agents to learn to perform complex tasks.
Recent progress in model-based RL allows agents to be much more data-efficient.
We present a simple alternative approach: fractional transfer learning.
arXiv Detail & Related papers (2021-08-14T12:44:42Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Meta-learning Transferable Representations with a Single Target Domain [46.83481356352768]
Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
arXiv Detail & Related papers (2020-11-03T01:57:37Z)
- Pre-trained Word Embeddings for Goal-conditional Transfer Learning in Reinforcement Learning [0.0]
We show how a pre-trained task-independent language model can make a goal-conditional RL agent more sample efficient.
We do this by facilitating transfer learning between different related tasks.
arXiv Detail & Related papers (2020-07-10T06:42:00Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)