Online Planning in POMDPs with Self-Improving Simulators
- URL: http://arxiv.org/abs/2201.11404v1
- Date: Thu, 27 Jan 2022 09:41:59 GMT
- Title: Online Planning in POMDPs with Self-Improving Simulators
- Authors: Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek
- Abstract summary: We learn online an approximate but much faster simulator that improves over time.
To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation.
Experimental results in two large domains show that, when integrated with POMCP, our approach allows planning with improving efficiency over time.
- Score: 17.722070992253638
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How can we plan efficiently in a large and complex environment when the time
budget is limited? Given the original simulator of the environment, which may
be computationally very demanding, we propose to learn online an approximate
but much faster simulator that improves over time. To plan reliably and
efficiently while the approximate simulator is learning, we develop a method
that adaptively decides which simulator to use for every simulation, based on a
statistic that measures the accuracy of the approximate simulator. This allows
us to use the approximate simulator to replace the original simulator for
faster simulations when it is accurate enough under the current context, thus
trading off simulation speed and accuracy. Experimental results in two large
domains show that, when integrated with POMCP, our approach allows planning with
improving efficiency over time.
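The adaptive choice between simulators described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class `SimulatorSelector`, the `threshold` and `bonus_scale` parameters, and the use of a running mean of absolute prediction error as the accuracy statistic. The paper's actual statistic and selection rule may differ.

```python
import math


class SimulatorSelector:
    """Hypothetical selector between a slow exact simulator and a fast learned one.

    Assumption: accuracy is tracked as a running mean of observed errors,
    padded with a confidence bonus that shrinks as more comparisons arrive.
    """

    def __init__(self, threshold=0.1, bonus_scale=1.0):
        self.threshold = threshold      # largest error bound we tolerate
        self.bonus_scale = bonus_scale  # size of the uncertainty bonus
        self.n_checks = 0               # number of comparisons made so far
        self.mean_error = 0.0           # running mean of observed errors

    def record_error(self, error):
        # Incremental update of the running mean.
        self.n_checks += 1
        self.mean_error += (error - self.mean_error) / self.n_checks

    def upper_bound(self):
        # With no evidence, treat the approximate simulator as untrusted.
        if self.n_checks == 0:
            return math.inf
        return self.mean_error + self.bonus_scale / math.sqrt(self.n_checks)

    def use_approximate(self):
        return self.upper_bound() <= self.threshold


def simulate(state, action, exact_sim, approx_sim, selector):
    """Run one simulation step with whichever simulator the selector trusts."""
    if selector.use_approximate():
        return approx_sim(state, action)
    # Fall back to the exact simulator and use its output to score the
    # approximate one (scalar states assumed for simplicity).
    exact_next = exact_sim(state, action)
    approx_next = approx_sim(state, action)
    selector.record_error(abs(exact_next - approx_next))
    return exact_next
```

In this simplified version the selector stops collecting comparisons once it trusts the approximate simulator; a fuller treatment would keep checking occasionally and would condition the statistic on the current planning context.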
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- Continual learning autoencoder training for a particle-in-cell simulation via streaming [52.77024349608834]
The upcoming exascale era will provide a new generation of physics simulations with high resolution.
This high resolution will affect the training of machine learning models, since storing such large amounts of simulation data on disk is nearly impossible.
This work presents an approach that trains a neural network concurrently with a running simulation, without storing data on disk.
arXiv Detail & Related papers (2022-11-09T09:55:14Z)
- DiSECt: A Differentiable Simulator for Parameter Inference and Control in Robotic Cutting [71.50844437057555]
We present DiSECt: the first differentiable simulator for cutting soft materials.
The simulator augments the finite element method with a continuous contact model based on signed distance fields.
We show that the simulator can be calibrated to match resultant forces and fields from a state-of-the-art commercial solver.
arXiv Detail & Related papers (2022-03-19T07:27:19Z)
- Robot Learning from Randomized Simulations: A Review [59.992761565399185]
Deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data.
State-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive.
We focus on a technique named 'domain randomization' which is a method for learning from randomized simulations.
arXiv Detail & Related papers (2021-11-01T13:55:41Z)
- SimNet: Computer Architecture Simulation using Machine Learning [3.7019798164954336]
This work describes a concerted effort, where machine learning (ML) is used to accelerate discrete-event simulation.
A GPU-accelerated parallel simulator is implemented based on the proposed instruction latency predictor.
Its simulation accuracy and throughput are validated and evaluated against a state-of-the-art simulator.
arXiv Detail & Related papers (2021-05-12T17:31:52Z)
- Simulation-efficient marginal posterior estimation with swyft: stop wasting your precious time [5.533353383316288]
We present algorithms for nested neural likelihood-to-evidence ratio estimation and simulation reuse.
Together, these algorithms enable automatic and extremely simulator-efficient estimation of marginal and joint posteriors.
arXiv Detail & Related papers (2020-11-27T19:00:07Z)
- Influence-Augmented Online Planning for Complex Environments [13.7920323975611]
We propose influence-augmented online planning, a principled method to transform a factored simulator of the entire environment into a local simulator.
Our main experimental results show that planning on this less accurate but much faster local simulator with POMCP leads to higher real-time planning performance.
arXiv Detail & Related papers (2020-10-21T14:39:26Z)
- AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
- Building high accuracy emulators for scientific simulations with deep neural architecture search [0.0]
A promising route to accelerate simulations by building fast emulators with machine learning requires large training datasets.
Here we present a method based on neural architecture search to build accurate emulators even with a limited number of training data.
The method successfully accelerates simulations by up to 2 billion times in 10 scientific cases including astrophysics, climate science, biogeochemistry, high energy density physics, fusion energy, and seismology.
arXiv Detail & Related papers (2020-01-17T22:14:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.