Probing Transfer in Deep Reinforcement Learning without Task Engineering
- URL: http://arxiv.org/abs/2210.12448v1
- Date: Sat, 22 Oct 2022 13:40:12 GMT
- Title: Probing Transfer in Deep Reinforcement Learning without Task Engineering
- Authors: Andrei A. Rusu, Sebastian Flennerhag, Dushyant Rao, Razvan Pascanu,
Raia Hadsell
- Score: 26.637254541454773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We evaluate the use of original game curricula supported by the Atari 2600
console as a heterogeneous transfer benchmark for deep reinforcement learning
agents. Game designers created curricula using combinations of several discrete
modifications to the basic versions of games such as Space Invaders, Breakout
and Freeway, making them progressively more challenging for human players. By
formally organising these modifications into several factors of variation, we
are able to show that Analyses of Variance (ANOVA) are a potent tool for
studying the effects of human-relevant domain changes on the learning and
transfer performance of a deep reinforcement learning agent. Since no manual
task engineering is needed on our part, leveraging the original multi-factorial
design avoids the pitfalls of unintentionally biasing the experimental setup.
We find that game design factors have a large and statistically significant
impact on an agent's ability to learn, and so do their combinatorial
interactions. Furthermore, we show that zero-shot transfer from the basic games
to their respective variations is possible, but the variance in performance is
also largely explained by interactions between factors. As such, we argue that
Atari game curricula offer a challenging benchmark for transfer learning in RL,
that can help the community better understand the generalisation capabilities
of RL agents along dimensions which meaningfully impact human generalisation
performance. As a start, we report that value-function finetuning of regularly
trained agents achieves positive transfer in a majority of cases, but
significant headroom for algorithmic innovation remains. We conclude with the
observation that selective transfer from multiple variants could further
improve performance.
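The ANOVA methodology mentioned in the abstract can be illustrated with a toy example. The sketch below computes a one-way ANOVA F-statistic in pure Python over hypothetical per-episode agent scores grouped by a single game-design factor; all numbers and level names are invented for illustration, and the paper itself analyses multiple interacting factors rather than one:

```python
# Minimal one-way ANOVA sketch over hypothetical agent scores, grouped by a
# single game-design factor. The paper's analysis is multi-factorial; this
# only shows the core F-statistic computation for one factor.

def one_way_anova_f(groups):
    """Return the F-statistic for a one-way ANOVA over a list of score groups."""
    k = len(groups)                          # number of factor levels
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: variation explained by the factor.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: residual variation inside each level.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)        # mean square between, df = k - 1
    ms_within = ss_within / (n - k)          # mean square within, df = n - k
    return ms_between / ms_within

# Hypothetical per-episode scores for three levels of one design factor.
scores_by_level = [
    [1.0, 2.0, 3.0],   # level 1: basic game
    [4.0, 5.0, 6.0],   # level 2: moderate variant
    [7.0, 8.0, 9.0],   # level 3: hard variant
]
print(one_way_anova_f(scores_by_level))  # prints 27.0: a large F-statistic
```

A large F-statistic relative to the F-distribution's critical value indicates the factor has a statistically significant effect on scores; capturing interaction effects, as the paper does, requires a multi-way ANOVA instead.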
Related papers
- Understanding the Role of Invariance in Transfer Learning [9.220104991339104]
Transfer learning is a powerful technique for knowledge-sharing between different tasks.
Recent work has found that the representations of models with certain invariances, such as to adversarial input perturbations, achieve higher performance on downstream tasks.
arXiv Detail & Related papers (2024-07-05T07:53:52Z)
- Pixel to policy: DQN Encoders for within & cross-game reinforcement learning [0.0]
Reinforcement Learning can be applied to various tasks and environments.
Many environments have a similar structure, which can be exploited to improve RL performance on other tasks.
This work explores and compares the performance of RL models trained from scratch against models trained with different transfer learning approaches.
arXiv Detail & Related papers (2023-08-01T06:29:33Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is fundamentally different from supervised learning, and in practice learned optimizers do not work well even in simple RL tasks.
The agent-gradient distribution is not independent and identically distributed (non-i.i.d.), leading to inefficient meta-training.
We show that, although trained only on toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z)
- Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z)
- Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z)
- UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
- Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as Roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.