Octax: Accelerated CHIP-8 Arcade Environments for Reinforcement Learning in JAX
- URL: http://arxiv.org/abs/2510.01764v2
- Date: Fri, 03 Oct 2025 06:51:40 GMT
- Title: Octax: Accelerated CHIP-8 Arcade Environments for Reinforcement Learning in JAX
- Authors: Waris Radji, Thomas Michel, Hector Piteau
- Abstract summary: Reinforcement learning (RL) research requires diverse, challenging environments that are both tractable and scalable. We introduce Octax, a high-performance suite of classic arcade game environments implemented in JAX.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) research requires diverse, challenging environments that are both tractable and scalable. While modern video games may offer rich dynamics, they are computationally expensive and poorly suited for large-scale experimentation due to their CPU-bound execution. We introduce Octax, a high-performance suite of classic arcade game environments implemented in JAX, based on emulation of CHIP-8, a predecessor of the Atari 2600 that is widely adopted as a benchmark in RL research. Octax provides the JAX community with a long-awaited end-to-end GPU alternative to the Atari benchmark, offering image-based environments spanning puzzle, action, and strategy genres, all executable at massive scale on modern GPUs. Our JAX-based implementation achieves orders-of-magnitude speedups over traditional CPU emulators while maintaining perfect fidelity to the original game mechanics. We demonstrate Octax's capabilities by training RL agents across multiple games, showing significant improvements in training speed and scalability compared to existing solutions. The environment's modular design enables researchers to easily extend the suite with new games or generate novel environments using large language models, making it an ideal platform for large-scale RL experimentation.
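The end-to-end GPU execution the abstract describes rests on a standard JAX pattern: environment `reset` and `step` are written as pure functions, then batched with `jax.vmap` and compiled with `jax.jit` so thousands of environments advance in lockstep on the accelerator. The sketch below illustrates that pattern only; the environment functions and state layout are hypothetical placeholders, not the actual Octax API, and a real CHIP-8 emulator state would carry registers, memory, and a framebuffer rather than a small array.

```python
import jax
import jax.numpy as jnp

def reset(key):
    # Hypothetical initial state: a real emulator state would hold
    # registers, RAM, and the display buffer as JAX arrays.
    return jnp.zeros(4)

def step(state, action):
    # A pure function (state, action) -> (next_state, reward).
    # Purity (no side effects, no Python control flow on traced
    # values) is what lets JAX vmap and jit it onto the GPU.
    next_state = state + action
    reward = -jnp.abs(next_state).sum()
    return next_state, reward

# Batch the single-env functions across a leading axis of environments,
# then compile the batched step once for the whole fleet.
batch_reset = jax.vmap(reset)
batch_step = jax.jit(jax.vmap(step))

num_envs = 1024
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)
states = batch_reset(keys)            # shape (1024, 4): 1024 envs at once
actions = jnp.ones((num_envs,))       # one scalar action per environment
states, rewards = batch_step(states, actions)
```

Because `batch_step` is a single compiled kernel over the whole batch, adding environments mostly increases arithmetic intensity rather than Python overhead, which is where the orders-of-magnitude speedups over CPU emulators come from.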
Related papers
- Ludax: A GPU-Accelerated Domain Specific Language for Board Games [44.45953630612019]
Ludax is a domain-specific language for board games which automatically compiles into hardware-accelerated code. We envision Ludax as a tool to help accelerate games research generally, from RL to cognitive science.
arXiv Detail & Related papers (2025-06-27T20:15:53Z)
- NAVIX: Scaling MiniGrid Environments with JAX [17.944645332888335]
We introduce NAVIX, a re-implementation of MiniGrid in JAX.
NAVIX achieves over 200,000x speed improvements in batch mode, supporting up to 2048 agents in parallel on a single Nvidia A100 80 GB.
This reduces experiment times from one week to 15 minutes, promoting faster design and more scalable RL model development.
arXiv Detail & Related papers (2024-07-28T04:39:18Z)
- Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning [4.067733179628694]
Craftax is a ground-up rewrite of Crafter in JAX that runs up to 250x faster than the Python-native original.
A run of PPO using 1 billion environment interactions finishes in under an hour using only a single GPU.
We show that existing methods, including global and episodic exploration as well as unsupervised environment design, fail to make material progress on the benchmark.
arXiv Detail & Related papers (2024-02-26T18:19:07Z)
- JaxMARL: Multi-Agent RL Environments and Algorithms in JAX [105.343918678781]
We present JaxMARL, the first open-source, Python-based library that combines GPU-enabled efficiency with support for a large number of commonly used MARL environments.
Our experiments show that, in terms of wall clock time, our JAX-based training pipeline is around 14 times faster than existing approaches.
We also introduce and benchmark SMAX, a JAX-based approximate reimplementation of the popular StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2023-11-16T18:58:43Z)
- Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning [0.6670498055582528]
Pgx is a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators.
Pgx can simulate RL environments 10-100x faster than existing implementations available in Python.
Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go.
arXiv Detail & Related papers (2023-03-29T02:41:23Z)
- Cramming: Training a Language Model on a Single GPU in One Day [64.18297923419627]
Recent trends in language modeling have focused on increasing performance through scaling.
We investigate the downstream performance achievable with a transformer-based language model trained completely from scratch with masked language modeling for a single day on a single consumer GPU.
We provide evidence that even in this constrained setting, performance closely follows scaling laws observed in large-compute settings.
arXiv Detail & Related papers (2022-12-28T18:59:28Z)
- EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine [69.47822647770542]
Parallel environment execution is often the slowest part of the whole system but receives little attention.
With a curated design for parallelizing RL environments, we have improved RL environment simulation speed across different hardware setups.
On a high-end machine, EnvPool achieves 1 million frames per second for the environment execution on Atari environments and 3 million frames per second on MuJoCo environments.
arXiv Detail & Related papers (2022-06-21T17:36:15Z)
- ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning [141.58588761593955]
We present ElegantRL-Podracer, a library for cloud-native deep reinforcement learning.
It efficiently supports millions of cores to carry out massively parallel training at multiple levels.
At the low level, each pod simulates agent-environment interactions in parallel, fully utilizing nearly 7,000 GPU cores on a single GPU.
arXiv Detail & Related papers (2021-12-11T06:31:21Z)
- The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research.
We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL.
We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.