Human-Level Reinforcement Learning through Theory-Based Modeling,
Exploration, and Planning
- URL: http://arxiv.org/abs/2107.12544v1
- Date: Tue, 27 Jul 2021 01:38:13 GMT
- Title: Human-Level Reinforcement Learning through Theory-Based Modeling,
Exploration, and Planning
- Authors: Pedro A. Tsividis, Joao Loula, Jake Burga, Nathan Foss, Andres
Campero, Thomas Pouncy, Samuel J. Gershman, Joshua B. Tenenbaum
- Abstract summary: Theory-Based Reinforcement Learning uses human-like intuitive theories to explore and model an environment.
We instantiate the approach in a video game playing agent called EMPA.
EMPA matches human learning efficiency on a suite of 90 Atari-style video games.
- Score: 27.593497502386143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) studies how an agent comes to achieve reward in
an environment through interactions over time. Recent advances in machine RL
have surpassed human expertise at the world's oldest board games and many
classic video games, but they require vast quantities of experience to learn
successfully -- none of today's algorithms account for the human ability to
learn so many different tasks, so quickly. Here we propose a new approach to
this challenge based on a particularly strong form of model-based RL which we
call Theory-Based Reinforcement Learning, because it uses human-like intuitive
theories -- rich, abstract, causal models of physical objects, intentional
agents, and their interactions -- to explore and model an environment, and plan
effectively to achieve task goals. We instantiate the approach in a video game
playing agent called EMPA (the Exploring, Modeling, and Planning Agent), which
performs Bayesian inference to learn probabilistic generative models expressed
as programs for a game-engine simulator, and runs internal simulations over
these models to support efficient object-based, relational exploration and
heuristic planning. EMPA closely matches human learning efficiency on a suite
of 90 challenging Atari-style video games, learning new games in just minutes
of game play and generalizing robustly to new game situations and new levels.
The model also captures fine-grained structure in people's exploration
trajectories and learning dynamics. Its design and behavior suggest a way
forward for building more general human-like AI systems.
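To make the approach concrete, here is a minimal, hypothetical sketch of a theory-based agent loop: Bayesian scoring of candidate theories against observed transitions, then planning by internal simulation over the most probable theory. All names are illustrative; EMPA's actual theories are probabilistic programs over objects, agents, and physics, not the lookup tables used here.

```python
import math
import random

class Theory:
    """A candidate generative model of the game. EMPA's theories are
    probabilistic programs for a game-engine simulator; this stand-in
    uses a transition table keyed by (state, action)."""
    def __init__(self, transitions, prior):
        self.transitions = transitions          # {(s, a): {s_next: prob}}
        self.log_prior = math.log(prior)

    def log_likelihood(self, s, a, s_next):
        return math.log(self.transitions.get((s, a), {}).get(s_next, 1e-9))

    def simulate(self, s, a):
        dist = self.transitions.get((s, a), {s: 1.0})
        states, probs = zip(*dist.items())
        return random.choices(states, weights=probs)[0]

def log_posteriors(theories, observed):
    """Bayesian scoring: log prior plus log likelihood of observed play."""
    return [t.log_prior + sum(t.log_likelihood(s, a, s2) for s, a, s2 in observed)
            for t in theories]

def plan(theory, state, actions, goal, horizon=10, rollouts=200):
    """Pick the action whose internal simulations reach the goal most often."""
    def successes(a):
        hits = 0
        for _ in range(rollouts):
            s = theory.simulate(state, a)
            for _ in range(horizon):
                if s == goal:
                    hits += 1
                    break
                s = theory.simulate(s, random.choice(actions))
        return hits
    return max(actions, key=successes)

# Score two rival theories on one observed transition, then plan with
# the maximum a posteriori theory.
t1 = Theory({("start", "go"): {"goal": 0.9, "start": 0.1}}, prior=0.5)
t2 = Theory({("start", "go"): {"start": 1.0}}, prior=0.5)
scores = log_posteriors([t1, t2], [("start", "go", "goal")])
best = max(zip(scores, [t1, t2]), key=lambda pair: pair[0])[1]
print(plan(best, "start", ["go"], goal="goal"))      # -> 'go'
```

The design point the abstract emphasizes is that the hypothesis space is structured around objects, agents, and their interactions, which is what makes both the inference and the exploration sample-efficient.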
Related papers
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer it to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z)
- Co-Learning Empirical Games and World Models [23.800790782022222]
Empirical games drive world models toward a broader consideration of possible game dynamics.
World models guide empirical games to efficiently discover new strategies through planning.
A new algorithm, Dyna-PSRO, co-learns an empirical game and a world model.
arXiv Detail & Related papers (2023-05-23T16:37:21Z)
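A hedged toy of the co-learning loop sketched above, using rock-paper-scissors so it runs as-is: the "world model" is a learned payoff table, planning is a model-based best response, and the empirical game grows by one strategy per iteration. The names and structure are ours, not Dyna-PSRO's implementation.

```python
import random
import numpy as np

# Rock-paper-scissors as the 'real' environment (row player's payoffs).
TRUE_PAYOFF = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])

def play(a, b):
    """Real-environment query: noisy payoff for the row player."""
    return TRUE_PAYOFF[a, b] + random.gauss(0.0, 0.1)

class WorldModel:
    """Learned payoff estimates, standing in for a learned dynamics model."""
    def __init__(self):
        self.sums = np.zeros((3, 3))
        self.counts = np.ones((3, 3))            # add-one smoothing
    def fit(self, data):
        for a, b, r in data:
            self.sums[a, b] += r
            self.counts[a, b] += 1
    def payoff(self):
        return self.sums / self.counts

def best_response(model, opponent_mix):
    """'Planning': best action against the opponent mix under the model."""
    return int(np.argmax(model.payoff() @ opponent_mix))

model, strategies = WorldModel(), [0]            # empirical game starts at 'rock'
for _ in range(12):
    support = sorted(set(strategies))
    mix = np.zeros(3)
    mix[support] = 1.0 / len(support)            # uniform meta-strategy
    # Evaluate candidate actions against the current support in the real
    # game (this is where the empirical game broadens the model's data)...
    model.fit([(a, b, play(a, b)) for a in range(3) for b in support])
    # ...and let the model guide discovery of the next strategy by planning.
    strategies.append(best_response(model, mix))
print(model.payoff().round(1))                   # approaches TRUE_PAYOFF
```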
- Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations [10.174009792409928]
We propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios.
In the experiments, using chase-and-escape and football tasks whose dynamics differ between the unknown source and target environments, we show that our approach achieved a balance between reproducing the demonstrations and generalizing to the target environment, compared with the baselines.
arXiv Detail & Related papers (2023-05-22T13:33:37Z)
- Predictive Experience Replay for Continual Visual Control and
Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
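A toy illustration of the mixture idea above: one Gaussian dynamics component per task, so fitting a new task adds a component rather than overwriting old ones. The paper infers mixture responsibilities over latent dynamics priors; this sketch simplifies by indexing components with a known task id.

```python
import numpy as np

class MixtureWorldModel:
    """One linear-Gaussian dynamics component per task: learning a new
    task adds a component instead of overwriting earlier ones."""
    def __init__(self, state_dim):
        self.state_dim = state_dim
        self.components = {}                  # task_id -> (A, noise_cov)

    def fit_task(self, task_id, states, next_states):
        # Least-squares fit of s' = s @ A for this task only.
        A, *_ = np.linalg.lstsq(states, next_states, rcond=None)
        resid = next_states - states @ A
        cov = np.cov(resid.T) + 1e-6 * np.eye(self.state_dim)
        self.components[task_id] = (A, cov)

    def predict(self, task_id, state):
        A, cov = self.components[task_id]
        return np.random.multivariate_normal(state @ A, cov)

# Two toy tasks with different dynamics; fitting task_b leaves
# task_a's component (and hence its predictions) untouched.
rng = np.random.default_rng(0)
s = rng.normal(size=(200, 2))
model = MixtureWorldModel(state_dim=2)
model.fit_task("task_a", s, s @ np.array([[0.9, 0.1], [0.0, 0.9]]))
model.fit_task("task_b", s, s @ np.array([[0.0, -1.0], [1.0, 0.0]]))
print(model.predict("task_a", np.ones(2)))
```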
- Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation [19.840186443344]
We propose to use structured world models to incorporate inductive biases in the control loop to achieve sample-efficient exploration.
Our method generates free-play behavior that starts to interact with objects early on and develops more complex behavior over time.
arXiv Detail & Related papers (2022-06-22T22:08:50Z)
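An illustrative sketch (not the paper's released method) of how a structured, per-object dynamics ensemble can drive curious free play: ensemble disagreement about an object's predicted next state serves as the exploration bonus, and shrinks once an interaction has been modeled.

```python
import numpy as np

class EnsembleDynamics:
    """Per-object linear dynamics ensemble; member disagreement on a
    predicted next state is used as a curiosity bonus."""
    def __init__(self, n_models, obj_dim, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = [rng.normal(scale=0.1, size=(obj_dim, obj_dim))
                        for _ in range(n_models)]
        self.lr = lr

    def update(self, obj_state, next_obj_state):
        for W in self.weights:
            err = obj_state @ W - next_obj_state
            W -= self.lr * np.outer(obj_state, err)   # squared-error gradient step

    def curiosity(self, obj_state):
        preds = np.stack([obj_state @ W for W in self.weights])
        return float(preds.std(axis=0).mean())        # disagreement = bonus

model = EnsembleDynamics(n_models=5, obj_dim=3)
s = np.array([1.0, 0.0, -1.0])
print("bonus before training:", model.curiosity(s))   # high: members disagree
for _ in range(200):
    model.update(s, 0.5 * s)                          # observe one interaction
print("bonus after training: ", model.curiosity(s))   # low: interaction modeled
```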
- Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z)
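The episodic-vs-continual gap is easy to see in code. A minimal sketch using the Gymnasium API: a wrapper that splices resets out of the experience stream, so algorithms that rely on episode boundaries must cope with a single lifelong stream. The wrapper name is ours, not the paper's benchmark.

```python
import gymnasium as gym

class NonEpisodicWrapper(gym.Wrapper):
    """Expose an episodic environment as one continual, reset-free stream."""
    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # The real world has no reset button; splice the reset in
            # silently so the agent sees a single unbroken lifetime.
            obs, info = self.env.reset()
        return obs, reward, False, False, info

env = NonEpisodicWrapper(gym.make("CartPole-v1"))
obs, _ = env.reset()
for _ in range(1000):                 # far longer than any CartPole episode
    obs, reward, terminated, truncated, _ = env.step(env.action_space.sample())
assert not terminated and not truncated
```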
- Architecting and Visualizing Deep Reinforcement Learning Models [77.34726150561087]
Deep Reinforcement Learning (DRL) combines deep neural networks with reinforcement learning to train agents to act through trial and error.
In this paper, we present a new Atari Pong game environment, a policy gradient based DRL model, a real-time network visualization, and an interactive display to help build intuition and awareness of the mechanics of DRL inference.
arXiv Detail & Related papers (2021-12-02T17:48:26Z)
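For intuition about the policy-gradient model being visualized, here is a compact REINFORCE agent in the same spirit; CartPole stands in for the paper's Pong environment, and the linear softmax policy is our simplification.

```python
import gymnasium as gym
import numpy as np

env = gym.make("CartPole-v1")
rng = np.random.default_rng(0)
W = np.zeros((4, 2))                        # linear policy: observation -> logits
alpha, gamma = 0.01, 0.99

def policy(obs):
    logits = obs @ W
    p = np.exp(logits - logits.max())       # numerically stable softmax
    return p / p.sum()

for episode in range(200):
    obs, _ = env.reset(seed=episode)
    trajectory, done = [], False
    while not done:
        a = int(rng.choice(2, p=policy(obs)))
        next_obs, r, terminated, truncated, _ = env.step(a)
        trajectory.append((obs, a, r))
        obs, done = next_obs, terminated or truncated
    G = 0.0
    for obs_t, a_t, r_t in reversed(trajectory):
        G = r_t + gamma * G                 # discounted return from step t
        grad_logits = -policy(obs_t)
        grad_logits[a_t] += 1.0             # d log pi(a_t|s_t) / d logits
        W += alpha * G * np.outer(obs_t, grad_logits)
```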
- Independent Learning in Stochastic Games [16.505046191280634]
We present the model of stochastic games for multi-agent learning in dynamic environments.
We focus on the development of simple and independent learning dynamics for games.
We present our recently proposed simple and independent learning dynamics that guarantee convergence in zero-sum games.
arXiv Detail & Related papers (2021-11-23T09:27:20Z)
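A runnable toy of the independent-learning setting in a zero-sum game (matching pennies): each player updates action values from its own payoffs only and never observes the opponent's strategy. The paper's proposed dynamics carry convergence guarantees; this smoothed-best-response toy merely illustrates the setting, with empirical play drifting toward the uniform equilibrium.

```python
import numpy as np

A = np.array([[1., -1.], [-1., 1.]])        # matching pennies: column gets -A
rng = np.random.default_rng(0)
q_row, q_col = np.zeros(2), np.zeros(2)     # each player's private estimates
counts = np.zeros(2)

def softmax(q, temp=0.1):
    z = np.exp((q - q.max()) / temp)
    return z / z.sum()

for t in range(1, 20001):
    a = rng.choice(2, p=softmax(q_row))
    b = rng.choice(2, p=softmax(q_col))
    lr = t ** -0.6                          # decaying step size
    q_row[a] += lr * (A[a, b] - q_row[a])   # learns from own payoff only
    q_col[b] += lr * (-A[a, b] - q_col[b])  # never sees the opponent's strategy
    counts[a] += 1

print("row player's empirical mix:", (counts / counts.sum()).round(2))  # ~[0.5 0.5]
```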
- Mastering Atari with Discrete World Models [61.7688353335468]
We introduce DreamerV2, a reinforcement learning agent that learns behaviors purely from predictions in the compact latent space of a powerful world model.
DreamerV2 constitutes the first agent that achieves human-level performance on the Atari benchmark of 55 tasks by learning behaviors inside a separately trained world model.
arXiv Detail & Related papers (2020-10-05T17:52:14Z)
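A hedged sketch of "learning behaviors inside the world model": the policy below is improved only on imagined rollouts from a toy latent dynamics model, never on real environment steps. DreamerV2 itself uses discrete latent states and an actor-critic; this toy keeps the structure, not the architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def world_model(z, a):
    """Toy learned latent dynamics and reward: a point on a line that the
    agent nudges left or right, with reward for staying near the origin."""
    z_next = z + (0.1 if a == 1 else -0.1) + rng.normal(scale=0.01)
    return z_next, -abs(z_next)

def act(z, theta):
    return 1 if z < theta else 0            # threshold policy on the latent

def imagined_return(theta, horizon=50, rollouts=64):
    """Evaluate the policy purely on imagined rollouts, no real steps."""
    total = 0.0
    for _ in range(rollouts):
        z, ret = rng.normal(), 0.0
        for _ in range(horizon):
            z, r = world_model(z, act(z, theta))
            ret += r
        total += ret
    return total / rollouts

theta = 1.0                                   # deliberately bad initial policy
for _ in range(40):                           # hill-climb inside the model
    theta = max([theta - 0.05, theta, theta + 0.05], key=imagined_return)
print("learned threshold:", round(theta, 2))  # settles near the optimum, 0
```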
- Model-Based Reinforcement Learning for Atari [89.3039240303797]
We show how video prediction models can enable agents to solve Atari games with fewer interactions than model-free methods.
Our experiments evaluate SimPLe on a range of Atari games in the low-data regime of 100k interactions between the agent and the environment.
arXiv Detail & Related papers (2019-03-01T15:40:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.