Diagnosing and exploiting the computational demands of video games for deep reinforcement learning
- URL: http://arxiv.org/abs/2309.13181v1
- Date: Fri, 22 Sep 2023 21:03:33 GMT
- Title: Diagnosing and exploiting the computational demands of video games for deep reinforcement learning
- Authors: Lakshmi Narasimhan Govindarajan, Rex G Liu, Drew Linsley, Alekh Karkada Ashok, Max Reuter, Michael J Frank, Thomas Serre
- Abstract summary: We introduce the Learning Challenge Diagnosticator (LCD), a tool that measures the perceptual and reinforcement learning demands of a task.
We use LCD to discover a novel taxonomy of challenges in the Procgen benchmark, and demonstrate that these predictions are highly reliable and can instruct algorithmic development.
- Score: 13.98405611352641
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans learn by interacting with their environments and perceiving the
outcomes of their actions. A landmark in artificial intelligence has been the
development of deep reinforcement learning (dRL) algorithms capable of doing
the same in video games, on par with or better than humans. However, it remains
unclear whether the successes of dRL models reflect advances in visual
representation learning, the effectiveness of reinforcement learning algorithms
at discovering better policies, or both. To address this question, we introduce
the Learning Challenge Diagnosticator (LCD), a tool that separately measures
the perceptual and reinforcement learning demands of a task. We use LCD to
discover a novel taxonomy of challenges in the Procgen benchmark, and
demonstrate that these predictions are highly reliable and can instruct
algorithmic development. More broadly, the LCD reveals multiple failure cases
that can occur when optimizing dRL algorithms over entire video game benchmarks
like Procgen, and provides a pathway towards more efficient progress.
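The abstract describes the LCD only at a high level; as a hedged illustration of what "separately measuring" perceptual and reinforcement learning demands could look like, the sketch below scores each demand by how much the same agent improves when that demand is ablated. All function names, variable names, and the scoring rule are hypothetical, not the paper's released tool.

```python
import numpy as np

def demand_scores(r_full, r_oracle_vision, r_dense_reward, r_max):
    """Hypothetical LCD-style diagnostic (not the paper's actual tool).

    Compare mean returns of the same agent trained on three task variants:
      r_full          -- the unmodified game
      r_oracle_vision -- perception made trivial (e.g. ground-truth state)
      r_dense_reward  -- credit assignment made trivial (e.g. shaped reward)
    The share of the remaining performance gap closed by each ablation is
    read as that demand's severity.
    """
    gap = max(r_max - np.mean(r_full), 1e-8)
    perceptual = (np.mean(r_oracle_vision) - np.mean(r_full)) / gap
    reinforcement = (np.mean(r_dense_reward) - np.mean(r_full)) / gap
    return {"perceptual": float(np.clip(perceptual, 0.0, 1.0)),
            "rl": float(np.clip(reinforcement, 0.0, 1.0))}

# Example: a game whose difficulty is mostly perceptual.
print(demand_scores(r_full=[3.0, 4.0], r_oracle_vision=[8.0, 9.0],
                    r_dense_reward=[4.0, 5.0], r_max=10.0))
```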
Related papers
- Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning [47.785786984974855]
We present a human-in-the-loop vision-based RL system that demonstrates impressive performance on a diverse set of dexterous manipulation tasks.
Our approach integrates demonstrations and human corrections, efficient RL algorithms, and other system-level design choices to learn policies.
We show that our method significantly outperforms imitation learning baselines and prior RL approaches, with an average 2x improvement in success rate and 1.8x faster execution.
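The paper's contribution is system-level; as a minimal sketch of one ingredient the summary names, integrating demonstrations and human corrections with an efficient off-policy learner, the buffer below mixes human-provided and autonomous transitions in every batch. The class, its sampling ratio, and the transition format are assumptions for illustration.

```python
import random

class MixedReplayBuffer:
    """Hypothetical buffer blending autonomous rollouts with human
    demonstrations/corrections, in the spirit of human-in-the-loop RL
    (details differ from the paper's system)."""
    def __init__(self, demo_fraction=0.5):
        self.agent_data, self.human_data = [], []
        self.demo_fraction = demo_fraction  # share of each batch from humans

    def add(self, transition, from_human=False):
        (self.human_data if from_human else self.agent_data).append(transition)

    def sample(self, batch_size):
        n_human = min(int(batch_size * self.demo_fraction), len(self.human_data))
        batch = random.sample(self.human_data, n_human)
        batch += random.sample(self.agent_data,
                               min(batch_size - n_human, len(self.agent_data)))
        return batch

buf = MixedReplayBuffer(demo_fraction=0.5)
buf.add(("s0", "a0", 0.0, "s1"), from_human=True)  # human correction
buf.add(("s1", "a1", 0.0, "s2"))                   # autonomous rollout
print(buf.sample(batch_size=2))
```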
arXiv Detail & Related papers (2024-10-29T08:12:20Z)
- M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL).
Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms.
We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
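The exact objective is in the paper; below is only a minimal sketch of a multimodal contrastive (InfoNCE-style) loss, assuming paired visual and tactile embeddings from the same timestep are positives and all other batch pairings are negatives. The symmetric two-direction form and the temperature value are assumptions.

```python
import torch
import torch.nn.functional as F

def multimodal_infonce(z_vision, z_tactile, temperature=0.1):
    """Sketch of a multimodal InfoNCE loss (not M2CURL's exact objective):
    pull together embeddings of the two modalities at the same timestep,
    push apart all other pairings in the batch."""
    z_v = F.normalize(z_vision, dim=1)   # (B, D)
    z_t = F.normalize(z_tactile, dim=1)  # (B, D)
    logits = z_v @ z_t.T / temperature   # (B, B) similarity matrix
    labels = torch.arange(z_v.size(0))   # positives lie on the diagonal
    # Symmetric loss: vision->tactile and tactile->vision retrieval.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

loss = multimodal_infonce(torch.randn(32, 128), torch.randn(32, 128))
```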
arXiv Detail & Related papers (2024-01-30T14:09:35Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal, and it enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
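The summary's central idea, using the intervention signal itself as the reward, can be sketched as a relabeling step applied before any standard off-policy RL update; the -1/0 reward constants and dictionary layout below are illustrative assumptions, not the paper's exact formulation.

```python
def relabel_with_interventions(trajectory):
    """Replace task rewards with intervention-based rewards (RLIF-style
    idea; the -1/0 constants are illustrative assumptions).
    trajectory: list of dicts with keys 'obs', 'action', 'intervened'."""
    return [
        {**step, "reward": -1.0 if step["intervened"] else 0.0}
        for step in trajectory
    ]

# Relabeled transitions can then feed a standard off-policy RL learner.
demo = [{"obs": 0, "action": 1, "intervened": False},
        {"obs": 1, "action": 0, "intervened": True}]
print(relabel_with_interventions(demo))
```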
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
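As a hedged sketch of what an implicit, return-aware policy could look like: an energy model scores (state, return-to-go, action) tuples, and acting means searching for the lowest-energy action. The architecture, the sampling-based argmin, and the return-conditioning detail below are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class ImplicitPolicy(nn.Module):
    """Sketch of an implicit policy: instead of outputting an action, the
    network scores (state, return, action) tuples, and acting is a search
    over candidate actions (assumed design, not the paper's spec)."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.act_dim = act_dim
        self.energy = nn.Sequential(
            nn.Linear(obs_dim + 1 + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def act(self, obs, return_to_go, n_candidates=256):
        # Sample candidate actions in [-1, 1] and keep the lowest-energy one.
        cand = torch.rand(n_candidates, self.act_dim) * 2 - 1
        ctx = torch.cat([obs, return_to_go]).expand(n_candidates, -1)
        scores = self.energy(torch.cat([ctx, cand], dim=1)).squeeze(-1)
        return cand[scores.argmin()]

policy = ImplicitPolicy(obs_dim=17, act_dim=6)
action = policy.act(torch.randn(17), torch.tensor([500.0]))  # ask for high return
```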
arXiv Detail & Related papers (2022-10-21T21:59:42Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
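A minimal sketch of a UCB-type bonus over learned features follows: the standard elliptical bonus used in low-rank MDP analyses, with the feature map assumed to come from the contrastive representation. This is a generic construction for intuition, not the paper's precise algorithm or constants.

```python
import numpy as np

class EllipticalBonus:
    """UCB-style exploration bonus over learned features, a standard
    construction for low-rank MDPs (sketch; not the paper's algorithm).
    phi(s, a) is assumed to come from the contrastive representation."""
    def __init__(self, dim, beta=1.0, reg=1.0):
        self.cov = reg * np.eye(dim)  # regularized feature covariance
        self.beta = beta

    def update(self, phi):
        self.cov += np.outer(phi, phi)

    def bonus(self, phi):
        # beta * sqrt(phi^T Sigma^{-1} phi): large for rarely-seen directions.
        return self.beta * np.sqrt(phi @ np.linalg.solve(self.cov, phi))

b = EllipticalBonus(dim=8)
phi = np.random.randn(8)
print(b.bonus(phi))  # large before visiting this direction
b.update(phi)
print(b.bonus(phi))  # shrinks after the visit
```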
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- Deep Apprenticeship Learning for Playing Games [0.0]
We explore the feasibility of designing a learning model based on expert behaviour for complex, multidimensional tasks.
We propose a novel method for apprenticeship learning based on the previous research on supervised learning techniques in reinforcement learning.
Our method is applied to video frames from Atari games in order to teach an artificial agent to play those games.
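A hedged sketch of the supervised core, a convolutional network trained to predict the expert's action directly from stacked frames, follows; the Atari-style architecture, input shape, and action count are assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class FramePolicy(nn.Module):
    """Supervised apprenticeship-style learner: a CNN trained to predict
    the expert's action from raw frames (generic sketch; the paper's
    architecture may differ)."""
    def __init__(self, n_actions, in_channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, frames):            # frames: (B, 4, 84, 84)
        return self.net(frames)           # logits over expert actions

policy = FramePolicy(n_actions=6)
logits = policy(torch.randn(2, 4, 84, 84))
loss = nn.functional.cross_entropy(logits, torch.tensor([1, 3]))
```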
arXiv Detail & Related papers (2022-05-16T19:52:45Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which generalize well to other classical control tasks, gridworld-type tasks, and Atari games.
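The paper's search machinery is elaborate; below is only a toy evolutionary outer loop in the same spirit, tournament selection plus mutation over a population of candidate loss functions. The Candidate class, its mutate() operator, and the train_and_eval stand-in are all hypothetical.

```python
import random

class Candidate:
    """Stand-in for a symbolic loss program (hypothetical)."""
    def __init__(self, value):
        self.value = value
    def mutate(self):
        return Candidate(self.value + random.gauss(0, 0.1))

def train_and_eval(cand):
    # Stand-in for "train a small agent with this loss, return its score".
    return -abs(cand.value - 1.0)

def meta_search(candidates, train_and_eval, generations=20):
    """Toy regularized-evolution loop in the spirit of meta-learning RL
    algorithms: tournament-select a parent, mutate it, cull the worst.
    Assumes at least three initial candidates."""
    population = [(c, train_and_eval(c)) for c in candidates]
    for _ in range(generations):
        parent, _ = max(random.sample(population, 3), key=lambda p: p[1])
        child = parent.mutate()
        population.append((child, train_and_eval(child)))
        population.remove(min(population, key=lambda p: p[1]))
    return max(population, key=lambda p: p[1])[0]

best = meta_search([Candidate(random.uniform(-2, 2)) for _ in range(5)],
                   train_and_eval)
print(best.value)  # drifts toward the optimum of the toy score
```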
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Machine versus Human Attention in Deep Reinforcement Learning Tasks [38.80270891345248]
We shed light on the inner workings of such trained models by analyzing the pixels that they attend to during task execution.
We compare the saliency maps of RL agents against visual attention models of human experts when learning to play Atari games.
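One standard way to obtain such agent-side saliency maps is gradient saliency with respect to the chosen action's logit, sketched below; the paper's exact attention-extraction method may differ.

```python
import torch

def saliency_map(policy, frame):
    """Gradient saliency for a trained agent: how strongly each pixel
    affects the chosen action's logit. A standard recipe, not necessarily
    the paper's exact method."""
    frame = frame.clone().requires_grad_(True)     # (1, C, H, W)
    logits = policy(frame)                         # (1, n_actions)
    logits[0, logits.argmax()].backward()          # top action's score
    return frame.grad.abs().sum(dim=1).squeeze(0)  # (H, W) pixel importance

# Example with a toy linear "policy" over flattened 2x84x84 frames:
toy = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(2 * 84 * 84, 6))
print(saliency_map(toy, torch.randn(1, 2, 84, 84)).shape)  # torch.Size([84, 84])
```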
arXiv Detail & Related papers (2020-10-29T20:58:45Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all other solutions in the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
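A hedged sketch of the goal-oriented replay structuring named in the summary follows: slicing a demonstration into segments whose final states serve as sub-goals, with rewards relabeled accordingly. The segment length, exact-match goal test, and reward values are illustrative assumptions; the full algorithm involves more than this sketch.

```python
def extract_subgoals(demo, stride=50):
    """Sketch of goal-oriented replay structuring (assumed design, not the
    paper's algorithm): cut a possibly low-quality demonstration into
    fixed-length segments, treat each segment's final state as a sub-goal,
    and relabel rewards so the agent learns to reach sub-goals in order.
    demo: list of dicts with at least a 'state' key."""
    subgoals = [demo[i]["state"] for i in range(stride - 1, len(demo), stride)]
    subgoals = subgoals or [demo[-1]["state"]]   # short demos: one goal
    relabeled = []
    for i, step in enumerate(demo):
        goal = subgoals[min(i // stride, len(subgoals) - 1)]
        reward = 1.0 if step["state"] == goal else 0.0
        relabeled.append({**step, "goal": goal, "reward": reward})
    return relabeled

print(extract_subgoals([{"state": s} for s in range(120)], stride=50))
```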
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory [2.5147566619221515]
Deep reinforcement learning has achieved outstanding results in recent years.
Recent works are exploring learning beyond single-agent scenarios and considering multi-agent scenarios.
Traditional game-theoretic algorithms, in turn, show strong application promise when combined with modern algorithms and growing computing power.
arXiv Detail & Related papers (2020-01-17T15:08:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.