Mastering Diverse Domains through World Models
- URL: http://arxiv.org/abs/2301.04104v2
- Date: Wed, 17 Apr 2024 17:41:20 GMT
- Title: Mastering Diverse Domains through World Models
- Authors: Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap,
- Abstract summary: We present DreamerV3, a general algorithm that outperforms specialized methods across over 150 diverse tasks, with a single configuration.
Dreamer is the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula.
- Score: 43.382115013586535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement learning algorithms can be readily applied to tasks similar to what they have been developed for, configuring them for new application domains requires significant human expertise and experimentation. We present DreamerV3, a general algorithm that outperforms specialized methods across over 150 diverse tasks, with a single configuration. Dreamer learns a model of the environment and improves its behavior by imagining future scenarios. Robustness techniques based on normalization, balancing, and transformations enable stable learning across domains. Applied out of the box, Dreamer is the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula. This achievement has been posed as a significant challenge in artificial intelligence that requires exploring farsighted strategies from pixels and sparse rewards in an open world. Our work allows solving challenging control problems without extensive experimentation, making reinforcement learning broadly applicable.
Related papers
- Neural networks for abstraction and reasoning: Towards broad generalization in machines [3.165509887826658]
We look at novel approaches for solving the Abstraction & Reasoning Corpus (ARC).
We adapt the DreamCoder neurosymbolic reasoning solver to ARC.
We present the Perceptual Abstraction and Reasoning Language (PeARL), which allows DreamCoder to solve ARC tasks.
We publish the arckit Python library to make future research on ARC easier.
(2024-02-05T20:48:57Z)
- General Intelligence Requires Rethinking Exploration [24.980249597326985]
We argue that exploration is essential to all learning systems, including supervised learning.
Generalized exploration serves as a necessary objective for maintaining open-ended learning processes.
(2022-11-15T00:46:15Z)
- Deep Hierarchical Planning from Pixels [86.14687388689204]
Director is a method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model.
Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.
Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
(2022-06-08T18:20:15Z)
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
(2022-04-15T09:56:50Z)
- Maximum Entropy Model-based Reinforcement Learning [0.0]
This work connects exploration techniques and model-based reinforcement learning.
We have designed a novel exploration method that takes into account features of the model-based approach.
We also demonstrate through experiments that our method significantly improves the performance of the model-based algorithm Dreamer.
(2021-12-02T13:07:29Z)
- WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
(2021-10-27T12:25:21Z)
- Continual Learning of Control Primitives: Skill Discovery via Reset-Games [128.36174682118488]
We show how a single method can allow an agent to acquire skills with minimal supervision.
We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills".
(2020-11-10T18:07:44Z)
- The Ingredients of Real-World Robotic Reinforcement Learning [71.92831985295163]
We discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.
We propose a particular instantiation of such a system, using dexterous manipulation as our case study.
We demonstrate that our complete system can learn without any human intervention, acquiring a variety of vision-based skills with a real-world three-fingered hand.
(2020-04-27T03:36:10Z)
- Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents [1.0742675209112622]
We implement an architecture founded in principles of experimental neuroscience, by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing dependent plasticity, the transition between short and long term memory, and the role of various neurotransmitters in rewarding curiosity.
The Neurons-in-a-Box architecture can learn in a wholly generalizable manner, and demonstrates an efficient way to build and apply representations without explicitly optimizing over a set of criteria or actions.
(2020-04-20T04:06:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.