From Optimization to Control: Quasi Policy Iteration
- URL: http://arxiv.org/abs/2311.11166v1
- Date: Sat, 18 Nov 2023 21:00:14 GMT
- Title: From Optimization to Control: Quasi Policy Iteration
- Authors: Mohammad Amin Sharifi Kolarijani and Peyman Mohajerin Esfahani
- Abstract summary: Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms.
In this paper, we make this analogy explicit across four problem classes with a unified solution characterization.
We identify equivalent optimization and control algorithms that have already been pointed out in the existing literature, but mostly in a scattered way.
- Score: 4.061135251278187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent control algorithms for Markov decision processes (MDPs) have been
designed using an implicit analogy with well-established optimization
algorithms. In this paper, we make this analogy explicit across four problem
classes with a unified solution characterization. This novel framework, in
turn, allows for a systematic transformation of algorithms from one domain to
the other. In particular, we identify equivalent optimization and control
algorithms that have already been pointed out in the existing literature, but
mostly in a scattered way. With this unifying framework in mind, we then
exploit two linear structural constraints specific to MDPs for approximating
the Hessian in a second-order-type algorithm from optimization, namely,
Anderson mixing. This leads to a novel first-order control algorithm that
modifies the standard value iteration (VI) algorithm by incorporating two new
directions and adaptive step sizes. While the proposed algorithm, coined as
quasi-policy iteration, has the same computational complexity as VI, it
interestingly exhibits an empirical convergence behavior similar to policy
iteration with a very low sensitivity to the discount factor.
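To make the optimization-control analogy concrete, below is a minimal sketch of value iteration accelerated by generic Anderson mixing on a tabular MDP. This is not the paper's quasi-policy iteration update (which additionally exploits two linear structural constraints of MDPs to approximate the Hessian); the MDP construction, window size `m`, and tolerances are illustrative assumptions.

```python
import numpy as np

def bellman(V, P, R, gamma):
    # Bellman optimality operator on a tabular MDP:
    # (TV)(s) = max_a [ R(s, a) + gamma * sum_{s'} P(s, a, s') V(s') ]
    return (R + gamma * P @ V).max(axis=1)

def anderson_vi(P, R, gamma, m=5, iters=500, tol=1e-8):
    """Value iteration with Anderson mixing over a window of m residuals."""
    V = np.zeros(R.shape[0])
    G_hist, F_hist = [], []  # past operator outputs T(V) and residuals T(V) - V
    for k in range(iters):
        TV = bellman(V, P, R, gamma)
        f = TV - V
        if np.linalg.norm(f, np.inf) < tol:
            return V, k
        G_hist.append(TV); F_hist.append(f)
        G_hist, F_hist = G_hist[-(m + 1):], F_hist[-(m + 1):]
        if len(F_hist) > 1:
            dF = np.diff(np.stack(F_hist, axis=1), axis=1)  # residual differences
            dG = np.diff(np.stack(G_hist, axis=1), axis=1)
            coef, *_ = np.linalg.lstsq(dF, f, rcond=None)   # least-squares mixing weights
            V = TV - dG @ coef   # mixed update in place of plain V = TV
        else:
            V = TV               # first step: ordinary value iteration
    return V, iters

# Hypothetical usage on a random MDP (sizes are illustrative):
rng = np.random.default_rng(0)
S, A = 50, 4
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] is a distribution over next states
R = rng.standard_normal((S, A))
V, n_iters = anderson_vi(P, R, gamma=0.99)
```

Consistent with the paper's reading of Anderson mixing as a second-order-type scheme, the per-iteration cost stays close to plain VI: one application of the Bellman operator plus a small least-squares solve.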
Related papers
- Deterministic Trajectory Optimization through Probabilistic Optimal Control [3.2771631221674333]
We propose two new algorithms for discrete-time deterministic finite-horizon nonlinear optimal control problems.
Both algorithms are inspired by a novel theoretical paradigm known as probabilistic optimal control.
We show that applying this algorithm yields a fixed-point iteration over probabilistic policies that converges to the deterministic optimal policy.
arXiv Detail & Related papers (2024-07-18T09:17:47Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
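For orientation, a bare-bones Kelley cutting-plane loop is sketched below; the paper's contribution is to accelerate the cut-generation step with a reinforcement-learning surrogate, which is not modeled here. The objective, box domain, and tolerance are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def kelley(f, grad, lo, hi, iters=50, tol=1e-6):
    """Minimize a convex f over a box via Kelley's cutting-plane method."""
    n = len(lo)
    cuts = []                               # each cut: t >= g^T x + b
    x = (np.asarray(lo, float) + np.asarray(hi, float)) / 2
    for _ in range(iters):
        fx, g = f(x), grad(x)
        cuts.append((g, fx - g @ x))        # supporting hyperplane of f at x
        A_ub = np.array([np.append(gi, -1.0) for gi, _ in cuts])
        b_ub = np.array([-bi for _, bi in cuts])
        c = np.append(np.zeros(n), 1.0)     # minimize the epigraph variable t
        bounds = list(zip(lo, hi)) + [(None, None)]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        x, t = res.x[:n], res.x[-1]
        if fx - t < tol:                    # incumbent value close to the LP lower bound
            break
    return x

# e.g. kelley(lambda x: x @ x, lambda x: 2 * x, lo=[-1.0, -1.0], hi=[2.0, 2.0])
```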
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems [0.7614628596146599]
We propose a novel algorithm for adaptive step length selection in the classical SGD framework.
Under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements.
We show that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.
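The summary does not spell out the stochastic-ratios rule itself, so as a stand-in, here is a minimal SGD loop with the well-known AdaGrad-Norm adaptive step length; the hyperparameters are illustrative and this is not the paper's algorithm.

```python
import numpy as np

def sgd_adagrad_norm(grad_sample, w0, eta0=1.0, b0=1e-8, steps=1000):
    """SGD with step length eta0 / sqrt(b0^2 + sum_k ||g_k||^2) (AdaGrad-Norm)."""
    w = np.array(w0, dtype=float)
    acc = b0 ** 2                       # accumulated squared gradient norms
    for _ in range(steps):
        g = grad_sample(w)              # stochastic gradient at the current iterate
        acc += g @ g
        w -= (eta0 / np.sqrt(acc)) * g  # step length shrinks as gradient mass accumulates
    return w
```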
arXiv Detail & Related papers (2023-05-17T06:22:11Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit the connection between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions, which leads to sparse, local approximations of the feasible set.
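One common way to express constraints at the velocity level (our hedged reading of this idea; the paper's exact construction may differ) replaces feasibility of the next position with a half-space condition on the velocity $v$ at the current point $x$, for constraints $g_i(x) \le 0$ and a gain $\alpha > 0$:

```latex
\[
  V_\alpha(x) \;=\; \bigl\{\, v \in \mathbb{R}^n \;:\;
    \nabla g_i(x)^{\top} v \,\le\, -\alpha\, g_i(x)
    \quad \text{for all active indices } i \in I(x) \,\bigr\},
\]
```

so each iteration only projects onto a local, sparse polyhedral approximation of the feasible set rather than the set itself.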
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning [3.0204520109309843]
We consider first- and second-order techniques to address continuous optimization problems in machine learning.
In the first-order case, we propose a framework for transitioning from semi-deterministic to stochastic quadratic regularization methods.
In the second-order case, we propose a novel algorithm with adaptive sampling and adaptive step size.
arXiv Detail & Related papers (2021-11-29T18:10:00Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
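As a reminder of the bilevel structure these complexity results concern (a generic illustration, not the paper's algorithms), the outer gradient must be chained through the inner solution. On a quadratic inner problem this hypergradient is available in closed form; the matrices below are illustrative:

```python
import numpy as np

def hypergradient(x, A, B, c):
    """Outer gradient of f(x) = 0.5 * ||y*(x) - c||^2, where
    y*(x) = argmin_y 0.5 * y @ A @ y - (B @ x) @ y and A is symmetric PD."""
    y_star = np.linalg.solve(A, B @ x)        # inner solution y* = A^{-1} B x
    # chain rule: df/dx = (dy*/dx)^T (y* - c), with dy*/dx = A^{-1} B
    return B.T @ np.linalg.solve(A, y_star - c)
```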
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We derive matching lower bounds and design two optimal algorithms that attain them.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
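For orientation, the simplest decentralized baseline (decentralized gradient descent with a fixed gossip matrix) is sketched below; the paper's optimal algorithms are more sophisticated and handle time-varying networks, and the gossip matrix, step size, and iteration count here are illustrative.

```python
import numpy as np

def dgd(node_grads, W, x0, eta=0.05, iters=500):
    """Decentralized gradient descent: gossip averaging plus local gradient steps.
    node_grads: per-node gradient functions; W: doubly stochastic mixing matrix."""
    X = np.tile(np.asarray(x0, dtype=float), (len(node_grads), 1))  # row i = node i's iterate
    for _ in range(iters):
        X = W @ X                                                   # average with neighbours
        X -= eta * np.stack([g(x) for g, x in zip(node_grads, X)])  # local gradient step
    return X.mean(axis=0)                                           # consensus estimate
```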
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
- Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization [21.555331273873175]
We consider the task of decentralized minimization of the sum of smooth strongly convex functions stored across the nodes of a network.
We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees.
arXiv Detail & Related papers (2020-06-21T11:23:20Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
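For reference, the Moreau envelope used here as the stationarity measure is the standard one,

```latex
\[
  \varphi_\lambda(x) \;=\; \min_{y}\Bigl\{ \varphi(y) \,+\, \tfrac{1}{2\lambda}\,\lVert y - x\rVert^2 \Bigr\},
\]
```

and the stated result is that $\|\nabla \varphi_\lambda(x_t)\|$ decays at the rate $\tilde{\mathcal{O}}(t^{-1/4})$ along the iterates.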
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms.
We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
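A dyadic representation scores an (instance, algorithm) pair from the concatenation of the two feature vectors, so a single model can rank thousands of candidate algorithms. A minimal sketch, where the scoring model is a placeholder:

```python
import numpy as np

def dyadic_features(instance_feats, algo_feats):
    # joint representation of one (instance, algorithm) pair
    return np.concatenate([instance_feats, algo_feats])

def select_algorithm(instance_feats, algo_feature_table, score_model):
    """Pick the candidate algorithm with the highest predicted score."""
    scores = [score_model(dyadic_features(instance_feats, af))
              for af in algo_feature_table]
    return int(np.argmax(scores))
```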
arXiv Detail & Related papers (2020-01-29T09:40:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.