Learning to reflect: A unifying approach for data-driven stochastic
control strategies
- URL: http://arxiv.org/abs/2104.11496v1
- Date: Fri, 23 Apr 2021 09:33:15 GMT
- Title: Learning to reflect: A unifying approach for data-driven stochastic
control strategies
- Authors: Sören Christensen, Claudia Strauch and Lukas Trottner
- Abstract summary: We show that developing efficient strategies for related singular control problems can essentially be reduced to finding rate-optimal estimators.
We exploit the exponential $\beta$-mixing property as the common factor of both scenarios to drive the convergence analysis.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic optimal control problems have a long tradition in applied
probability, with the questions addressed being of high relevance in a
multitude of fields. Even though theoretical solutions are well understood in
many scenarios, their practicability suffers from the assumption of known
dynamics of the underlying stochastic process, raising the statistical
challenge of developing purely data-driven strategies. For the mathematically
separated classes of continuous diffusion processes and Lévy processes, we
show that developing efficient strategies for related singular stochastic
control problems can essentially be reduced to finding rate-optimal estimators
with respect to the sup-norm risk of objects associated with the invariant
distribution of ergodic processes which determine the theoretical solution of
the control problem. From a statistical perspective, we exploit the exponential
$\beta$-mixing property as the common factor of both scenarios to drive the
convergence analysis, indicating that relying on general stability properties
of Markov processes is a sufficiently powerful and flexible approach to treat
complex applications requiring statistical methods. We show moreover that in
the Lévy case (even though jump processes are per se more difficult to
handle both in statistics and control theory) a fully data-driven strategy
with regret of significantly better order than in the diffusion case can be
constructed.
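The statistical core of the abstract, estimating objects tied to the invariant distribution of an ergodic process at a sup-norm-optimal rate, can be illustrated with a toy sketch. This is not the authors' construction: it simply simulates an ergodic Ornstein-Uhlenbeck diffusion by Euler-Maruyama and fits a kernel estimator of its invariant density, then checks the sup-norm error against the known stationary law; the bandwidth choice here is an illustrative assumption, whereas the paper derives rate-optimal ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Euler-Maruyama for the ergodic OU diffusion dX_t = -X_t dt + dW_t,
# whose invariant law is N(0, 1/2).
dt, n = 0.01, 200_000
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = x[t - 1] - x[t - 1] * dt + np.sqrt(dt) * rng.standard_normal()

# Kernel estimator of the invariant density on a grid, using a
# subsampled (less correlated) trajectory.
grid = np.linspace(-3, 3, 301)
h = 0.1  # bandwidth: an illustrative choice, not the paper's rate-optimal one
samples = x[::10]
kernel = np.exp(-0.5 * ((grid[:, None] - samples[None, :]) / h) ** 2)
rho_hat = kernel.mean(axis=1) / (h * np.sqrt(2 * np.pi))

# Compare with the true N(0, 1/2) density in sup-norm.
true_rho = np.exp(-grid ** 2) / np.sqrt(np.pi)
sup_err = np.max(np.abs(rho_hat - true_rho))
print(f"sup-norm error of density estimate: {sup_err:.3f}")
```

The exponential β-mixing of the process is what makes the subsampled trajectory behave almost like i.i.d. draws for such risk bounds.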
Related papers
- Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [61.580419063416734]
A recent stream of structured learning approaches has improved the practical state of the art for a range of optimization problems.
The key idea is to exploit the statistical distribution over instances instead of dealing with instances separately.
In this article, we investigate methods that smooth the risk by perturbing the policy, which eases optimization and improves the generalization error.
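Smoothing the risk by perturbing the policy can be sketched with the generic perturb-and-average trick (a hedged stand-in, not this article's exact method): the hard argmax over discrete decisions is replaced by the average of argmax indicators under Gaussian perturbations of the score vector, which yields a smooth policy in expectation.

```python
import numpy as np

rng = np.random.default_rng(2)

def smoothed_argmax(scores, eps=0.5, n_samples=2000, rng=rng):
    """Average one-hot argmax decisions under Gaussian score perturbations.

    Returns a probability vector over decisions: a smoothed version of
    the hard argmax policy (generic perturb-and-average, illustrative only).
    """
    d = len(scores)
    z = rng.standard_normal((n_samples, d))
    idx = np.argmax(scores[None, :] + eps * z, axis=1)
    return np.bincount(idx, minlength=d) / n_samples

# Two near-tied decisions and one clearly worse one: the smoothed
# policy spreads mass over the near-tie and ignores the bad option.
p = smoothed_argmax(np.array([1.0, 0.9, -2.0]))
print(p)
```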
arXiv Detail & Related papers (2024-07-24T12:00:30Z) - Borrowing Strength in Distributionally Robust Optimization via Hierarchical Dirichlet Processes [35.53901341372684]
Our approach unifies regularized estimation, distributionally robust optimization, and hierarchical Bayesian modeling.
By employing a hierarchical Dirichlet process (HDP) prior, the method effectively handles multi-source data.
Numerical experiments validate the framework's efficacy in improving and stabilizing both prediction and parameter estimation accuracy.
arXiv Detail & Related papers (2024-05-21T19:03:09Z) - Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log(n))$.
The presented value-based RL methods include, among others, Stochastic Q-learning, StochDQN, and StochDDQN, all of which integrate this approach for both value-function updates and action selection.
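The key primitive can be sketched in a few lines (an illustrative reduction, not the authors' implementation): instead of an exact argmax over all $n$ actions, take the max over a uniformly sampled subset of roughly logarithmic size.

```python
import numpy as np

rng = np.random.default_rng(1)
n_actions = 1024

# A toy Q-table for a single state; in practice these would be
# learned Q-values.
q = rng.normal(size=n_actions)

def stoch_argmax(q_values, rng, k=None):
    """Approximate argmax over a random subset of ~2*log2(n) actions."""
    n = len(q_values)
    if k is None:
        k = max(1, int(2 * np.log2(n)))
    subset = rng.choice(n, size=k, replace=False)
    return subset[np.argmax(q_values[subset])]

a = stoch_argmax(q, rng)
print(a, q[a])
```

Each call costs $O(\log n)$ Q-value lookups instead of $O(n)$; the trade-off is that the returned action is only near-greedy with high probability.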
arXiv Detail & Related papers (2024-05-16T17:58:44Z) - Variational Annealing on Graphs for Combinatorial Optimization [7.378582040635655]
We show that an autoregressive approach which captures statistical dependencies among solution variables yields superior performance on many popular CO problems.
We introduce subgraph tokenization in which the configuration of a set of solution variables is represented by a single token.
arXiv Detail & Related papers (2023-11-23T18:56:51Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
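For the KL uncertainty set mentioned above, the worst-case expected value admits a convenient dual that needs only the nominal model. The sketch below (a generic distributionally robust evaluation step, not the authors' algorithm) computes $\inf_{\mathrm{KL}(P\|P_0)\le\delta} \mathbb{E}_P[V] = \sup_{\lambda>0} \{-\lambda \log \mathbb{E}_{P_0}[e^{-V/\lambda}] - \lambda\delta\}$ by a crude one-dimensional search over the dual variable.

```python
import numpy as np

def kl_robust_value(values, probs, delta, lams=None):
    """Worst-case mean of `values` over a KL ball of radius `delta`
    around the nominal distribution `probs`, via the dual formulation."""
    if lams is None:
        lams = np.logspace(-3, 3, 400)  # crude grid search over lambda
    v = np.asarray(values, float)
    vmin = v.min()  # shift for a numerically stable log-sum-exp
    objs = [
        -lam * np.log(np.sum(probs * np.exp(-(v - vmin) / lam))) + vmin - lam * delta
        for lam in lams
    ]
    return max(objs)

probs = np.array([0.5, 0.5])
values = np.array([0.0, 1.0])
print(kl_robust_value(values, probs, 0.0))  # ~nominal mean 0.5
print(kl_robust_value(values, probs, 0.1))  # strictly smaller robust value
```

At $\delta = 0$ the robust value recovers the nominal expectation; as $\delta$ grows it interpolates down toward the worst single outcome.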
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Statistical optimality and stability of tangent transform algorithms in
logit models [6.9827388859232045]
We provide conditions on the data generating process to derive non-asymptotic upper bounds on the risk incurred by the logistic optima.
In particular, we establish local variation of the algorithm without any assumptions on the data-generating process.
We explore a special case involving a semi-orthogonal design under which a global convergence is obtained.
arXiv Detail & Related papers (2020-10-25T05:15:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.