Hyperspace Neighbor Penetration Approach to Dynamic Programming for
Model-Based Reinforcement Learning Problems with Slowly Changing Variables in
A Continuous State Space
- URL: http://arxiv.org/abs/2106.05497v1
- Date: Thu, 10 Jun 2021 04:58:31 GMT
- Title: Hyperspace Neighbor Penetration Approach to Dynamic Programming for
Model-Based Reinforcement Learning Problems with Slowly Changing Variables in
A Continuous State Space
- Authors: Vincent Zha, Ivey Chiu, Alexandre Guilbault, and Jaime Tatis
- Abstract summary: We introduce a Hyperspace Neighbor Penetration (HNP) approach that solves the problem of handling slowly changing variables in reinforcement learning.
HNP captures in each transition step the state's partial "penetration" into its neighboring hyper-tiles in the gridded hyperspace.
In summary, HNP can be orders of magnitude more efficient than classical methods in handling slowly changing variables in reinforcement learning.
- Score: 58.720142291102135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Slowly changing variables in a continuous state space constitute an
important category of reinforcement learning problems and appear in many
domains, such as modeling a climate control system where temperature, humidity,
etc. change slowly over time. However, this subject is less addressed in recent
studies. Classical methods and their variants, such as Dynamic Programming with
Tile Coding, which discretizes the state space, fail to handle slowly changing
variables because they cannot capture the tiny change in each transition step:
it is computationally expensive, or outright impossible, to establish an
extremely granular grid system. In this paper, we introduce a Hyperspace
Neighbor Penetration (HNP) approach that solves the problem. HNP captures in
each transition step the state's partial "penetration" into its neighboring
hyper-tiles in the gridded hyperspace, and thus does not require a transition
to cross tile boundaries in order for the change to be captured. Therefore, HNP
allows for a very coarse grid system, which makes the computation feasible. HNP
assumes near linearity of the transition function in a local space, which is
commonly satisfied. In summary, HNP can be orders of magnitude more efficient
than classical methods in handling slowly changing variables in reinforcement
learning. We have deployed an industrial implementation of HNP with great
success.
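To make the mechanism concrete, here is a minimal one-dimensional sketch of the
HNP idea as the abstract describes it: instead of snapping the successor state
to a single grid point, as classical tile coding would, the value update
interpolates between the two neighboring grid points in proportion to how far
the state "penetrates" each tile. The grid, action set, reward, and dynamics
below are illustrative assumptions for a toy temperature-control problem, not
the authors' implementation.

```python
import numpy as np

# Minimal 1-D sketch of the HNP idea: distribute the successor state's value
# over the two neighboring grid points in proportion to how far it
# "penetrates" into each tile. Everything below is an illustrative assumption.

GAMMA = 0.95
grid = np.linspace(18.0, 26.0, 9)    # coarse temperature grid, 1 degree apart
actions = [-0.05, 0.0, +0.05]        # slow temperature changes per step

def reward(temp):
    return -abs(temp - 22.0)         # prefer staying near a 22-degree setpoint

def interp_value(V, s):
    """Value of continuous state s by linear interpolation ("penetration")."""
    s = np.clip(s, grid[0], grid[-1])
    i = min(np.searchsorted(grid, s, side="right") - 1, len(grid) - 2)
    w = (s - grid[i]) / (grid[i + 1] - grid[i])  # penetration into tile i+1
    return (1.0 - w) * V[i] + w * V[i + 1]

V = np.zeros(len(grid))
for _ in range(500):                 # value iteration on the coarse grid
    V_new = np.empty_like(V)
    for i, s in enumerate(grid):
        # Classical tile coding would round s + a back onto the 1-degree grid,
        # losing these 0.05-degree moves entirely; interpolation keeps them.
        V_new[i] = max(reward(s + a) + GAMMA * interp_value(V, s + a)
                       for a in actions)
    V = V_new
```

In d dimensions the same weights would presumably become a multilinear
interpolation over the 2^d corners of the enclosing hyper-tile, which is what
lets a coarse grid still register sub-tile motion through the value function.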
Related papers
- Active search for Bifurcations [0.0]
We propose an active learning framework, where Bayesian Optimization is leveraged to discover saddle-node or Hopf bifurcations.
It provides a framework for uncertainty quantification in systems with inherent stochasticity, as well as for resource-limited exploration of parameter space.
arXiv Detail & Related papers (2024-06-17T02:01:17Z)
- SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes [59.23385953161328]
Novel view synthesis for dynamic scenes is still a challenging problem in computer vision and graphics.
We propose a new representation that explicitly decomposes the motion and appearance of dynamic scenes into sparse control points and dense Gaussians.
Our method can enable user-controlled motion editing while retaining high-fidelity appearances.
arXiv Detail & Related papers (2023-12-04T11:57:14Z) - Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching.
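For readers unfamiliar with the variant mentioned here: SGLD adds Gaussian
noise to each stochastic gradient step, and the variation described draws its
minibatches by sweeping a shuffled dataset without replacement rather than
sampling them independently. A minimal sketch under toy assumptions (linear
regression loss, synthetic data; nothing below comes from the paper):

```python
import numpy as np

# Sketch of SGLD with WITHOUT-replacement minibatching: shuffle once per
# epoch, then sweep sequential minibatches. Toy data and loss, illustration
# only.

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=256)

theta = np.zeros(5)
eta, temperature, batch = 1e-3, 1e-3, 32

for epoch in range(200):
    order = rng.permutation(len(X))        # shuffle once per epoch ...
    for start in range(0, len(X), batch):  # ... then sweep without replacement
        idx = order[start:start + batch]
        grad = 2 * X[idx].T @ (X[idx] @ theta - y[idx]) / len(idx)
        noise = np.sqrt(2 * eta * temperature) * rng.normal(size=theta.shape)
        theta = theta - eta * grad + noise  # Langevin step: drift + diffusion
```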
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
- Numerical Methods for Convex Multistage Stochastic Optimization [86.45244607927732]
We focus on Stochastic Programming (SP), Stochastic Optimal Control (SOC) and Markov Decision Processes (MDP).
Recent progress in solving convex multistage stochastic problems is based on cutting-plane approximations of the cost-to-go functions of dynamic programming equations.
Cutting plane type methods can handle multistage problems with a large number of stages, but a relatively smaller number of state (decision) variables.
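The cutting-plane idea is simple to sketch: approximate a convex cost-to-go
function from below by the pointwise maximum of supporting hyperplanes ("cuts")
collected at trial points. A one-dimensional illustration with a made-up
cost-to-go function (purely hypothetical, not an example from the paper):

```python
import numpy as np

# Cutting-plane model of a convex function V(x): keep a bundle of affine
# minorants and take their pointwise maximum as a lower approximation.

def V(x):          # toy convex cost-to-go (illustrative stand-in)
    return (x - 1.0) ** 2 + abs(x)

def dV(x):         # a subgradient of V
    return 2 * (x - 1.0) + np.sign(x)

cuts = []                                   # list of (intercept, slope)

def V_lower(x):
    """Pointwise max of the cuts: a piecewise-linear lower bound on V."""
    return max((a + b * x for a, b in cuts), default=-np.inf)

for x in [-2.0, 0.5, 2.0, 1.0]:            # trial points (e.g., forward passes)
    g = dV(x)
    cuts.append((V(x) - g * x, g))          # supporting hyperplane at x

# The model is exact at each trial point and a lower bound everywhere else.
assert all(abs(V_lower(x) - V(x)) < 1e-9 for x in [-2.0, 0.5, 2.0, 1.0])
```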
arXiv Detail & Related papers (2023-03-28T01:30:40Z)
- Implicit Neural Spatial Representations for Time-dependent PDEs [29.404161110513616]
Implicit Neural Spatial Representation (INSR) has emerged as an effective representation of spatially-dependent vector fields.
This work explores solving time-dependent PDEs with INSR.
arXiv Detail & Related papers (2022-09-30T22:46:40Z)
- Avoiding barren plateaus via transferability of smooth solutions in Hamiltonian Variational Ansatz [0.0]
Variational Quantum Algorithms (VQAs) represent leading candidates to achieve computational speed-ups on current quantum devices.
Two major hurdles are the proliferation of low-quality variational local minima, and the exponential vanishing of gradients in the cost function landscape.
Here we show that by employing iterative search schemes one can effectively prepare the ground state of paradigmatic quantum many-body models.
arXiv Detail & Related papers (2022-06-04T12:52:29Z)
- Error-Correcting Neural Networks for Semi-Lagrangian Advection in the Level-Set Method [0.0]
We present a machine learning framework that blends image super-resolution technologies with scalar transport in the level-set method.
We investigate whether we can compute on-the-fly data-driven corrections to minimize numerical viscosity in the coarse-mesh evolution of an interface.
arXiv Detail & Related papers (2021-10-22T06:36:15Z)
- DySMHO: Data-Driven Discovery of Governing Equations for Dynamical Systems via Moving Horizon Optimization [77.34726150561087]
We introduce Discovery of Dynamical Systems via Moving Horizon Optimization (DySMHO), a scalable machine learning framework.
DySMHO sequentially learns the underlying governing equations from a large dictionary of basis functions.
Canonical nonlinear dynamical system examples are used to demonstrate that DySMHO can accurately recover the governing laws.
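As a rough illustration of the dictionary idea (greatly simplified, without
DySMHO's moving-horizon machinery, and on a hypothetical toy system): regress
observed derivatives onto a library of candidate basis functions and discard
terms with small coefficients.

```python
import numpy as np

# Drastically simplified sketch of dictionary-based equation discovery:
# least-squares fit of dx/dt onto candidate terms, then thresholding.
# The system and library below are toy assumptions, not from the paper.

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=400)
dxdt = 0.5 * x - 1.5 * x**3 + 0.01 * rng.normal(size=x.size)  # hidden truth

names = ["1", "x", "x^2", "x^3", "sin(x)"]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3, np.sin(x)])

coef, *_ = np.linalg.lstsq(Theta, dxdt, rcond=None)
coef[np.abs(coef) < 0.1] = 0.0           # sparsify: keep only strong terms

print("dx/dt ~ " + " + ".join(f"{c:.2f}*{n}" for c, n in zip(coef, names) if c))
```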
arXiv Detail & Related papers (2021-07-30T20:35:03Z)
- GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the norm of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
arXiv Detail & Related papers (2021-02-16T11:45:35Z)
- Exploring entanglement and optimization within the Hamiltonian Variational Ansatz [0.4881924950569191]
We study a family of quantum circuits called the Hamiltonian Variational Ansatz (HVA).
HVA exhibits favorable structural properties such as mild or entirely absent barren plateaus and a restricted state space.
HVA can find accurate approximations to the ground states of a modified Haldane-Shastry Hamiltonian on a ring.
arXiv Detail & Related papers (2020-08-07T01:28:26Z)