Hieros: Hierarchical Imagination on Structured State Space Sequence
World Models
- URL: http://arxiv.org/abs/2310.05167v3
- Date: Sun, 18 Feb 2024 13:42:53 GMT
- Title: Hieros: Hierarchical Imagination on Structured State Space Sequence
World Models
- Authors: Paul Mattes, Rainer Schlosser, Ralf Herbrich
- Abstract summary: Hieros is a hierarchical policy that learns time-abstracted world representations and imagines trajectories at multiple time scales in latent space.
We show that our approach outperforms the state of the art in terms of mean and median normalized human score on the Atari 100k benchmark.
- Score: 4.922995343278039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the biggest challenges to modern deep reinforcement learning (DRL)
algorithms is sample efficiency. Many approaches learn a world model in order
to train an agent entirely in imagination, eliminating the need for direct
environment interaction during training. However, these methods often suffer
from either a lack of imagination accuracy, exploration capabilities, or
runtime efficiency. We propose Hieros, a hierarchical policy that learns
time-abstracted world representations and imagines trajectories at multiple time
scales in latent space. Hieros uses an S5 layer-based world model, which
predicts next world states in parallel during training and iteratively during
environment interaction. Due to the special properties of S5 layers, our method
can train in parallel and predict next world states iteratively during
imagination. This allows for more efficient training than RNN-based world
models and more efficient imagination than Transformer-based world models.
We show that our approach outperforms the state of the art in terms of mean
and median normalized human score on the Atari 100k benchmark, and that our
proposed world model is able to predict complex dynamics very accurately. We
also show that Hieros displays superior exploration capabilities compared to
existing approaches.
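The abstract's key efficiency claim rests on a property of linear state space layers such as S5: the recurrence x_k = A x_{k-1} + B u_k can be evaluated either step by step (as during imagination rollouts) or via an associative scan over the whole sequence at once (as during training). The following is a minimal illustrative sketch of this dual-mode equivalence, not the actual Hieros or S5 implementation; all dimensions, matrices, and function names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 8
A = np.diag(rng.uniform(0.1, 0.9, d))   # stable diagonal dynamics (S5 uses a diagonalized A)
B = rng.normal(size=(d, 1))
u = rng.normal(size=(T, 1))             # input sequence, e.g. encoded actions

def sequential_scan(A, B, u):
    """Iterative rollout: one step at a time, as during imagination."""
    x = np.zeros(A.shape[0])
    states = []
    for u_k in u:
        x = A @ x + B @ u_k
        states.append(x)
    return np.stack(states)

def associative_scan(A, B, u):
    """Scan over pairs (A_k, b_k) with the associative operator
    (A1, b1) * (A2, b2) = (A2 @ A1, A2 @ b1 + b2).
    Because the operator is associative, a tree reduction computes all
    prefix states in O(log T) depth on parallel hardware; here we emulate
    the combine serially to show it yields the same states."""
    acc = (A, B @ u[0])
    states = [acc[1]]
    for u_k in u[1:]:
        acc = (A @ acc[0], A @ acc[1] + B @ u_k)
        states.append(acc[1])
    return np.stack(states)

# Both evaluation modes produce identical latent state trajectories.
assert np.allclose(sequential_scan(A, B, u), associative_scan(A, B, u))
```

This is the structural reason an S5-based world model can train like a Transformer (whole sequence in parallel) yet imagine like an RNN (constant cost per step, no growing context).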
Related papers
- Open-World Reinforcement Learning over Long Short-Term Imagination [91.28998327423295]
We present LS-Imagine, which extends the imagination horizon within a limited number of state transition steps.
Our method demonstrates significant improvements over state-of-the-art techniques in MineDojo.
arXiv Detail & Related papers (2024-10-04T17:17:30Z)
- Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models [60.87795376541144]
A world model is a neural network capable of predicting an agent's next state given past states and actions.
During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations.
We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing.
arXiv Detail & Related papers (2024-09-25T06:48:25Z)
- One-shot World Models Using a Transformer Trained on a Synthetic Prior [37.027893127637036]
One-Shot World Model (OSWM) is a transformer world model that is learned in an in-context learning fashion from purely synthetic data.
OSWM is able to quickly adapt to the dynamics of a simple grid world, as well as the CartPole gym and a custom control environment.
arXiv Detail & Related papers (2024-09-21T09:39:32Z)
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- Self-supervised novel 2D view synthesis of large-scale scenes with efficient multi-scale voxel carving [77.07589573960436]
We introduce an efficient multi-scale voxel carving method to generate novel views of real scenes.
Our final high-resolution output is efficiently self-trained on data automatically generated by the voxel carving module.
We demonstrate the effectiveness of our method on highly complex and large-scale scenes in real environments.
arXiv Detail & Related papers (2023-06-26T13:57:05Z)
- Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations [10.174009792409928]
We propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios.
In experiments on chase-and-escape and football tasks with differing dynamics between the unknown source and target environments, we show that our approach achieved a balance between reproducibility and generalization ability compared with the baselines.
arXiv Detail & Related papers (2023-05-22T13:33:37Z)
- Predictive World Models from Real-World Partial Observations [66.80340484148931]
We present a framework for learning a probabilistic predictive world model for real-world road environments.
While prior methods require complete states as ground truth for learning, we present a novel sequential training method to allow HVAEs to learn to predict complete states from partially observed states only.
arXiv Detail & Related papers (2023-01-12T02:07:26Z)
- Transformers are Sample Efficient World Models [1.9444242128493845]
We introduce IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer.
With the equivalent of only two hours of gameplay in the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046, and outperforms humans on 10 out of 26 games.
arXiv Detail & Related papers (2022-09-01T17:03:07Z)
- Cycle-Consistent World Models for Domain Independent Latent Imagination [0.0]
High costs and risks make it hard to train autonomous cars in the real world.
We propose a novel model-based reinforcement learning approach called Cycle-consistent World Models.
arXiv Detail & Related papers (2021-10-02T13:55:50Z)
- Mastering Atari with Discrete World Models [61.7688353335468]
We introduce DreamerV2, a reinforcement learning agent that learns behaviors purely from predictions in the compact latent space of a powerful world model.
DreamerV2 constitutes the first agent that achieves human-level performance on the Atari benchmark of 55 tasks by learning behaviors inside a separately trained world model.
arXiv Detail & Related papers (2020-10-05T17:52:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.