Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process
- URL: http://arxiv.org/abs/2110.10351v1
- Date: Wed, 20 Oct 2021 02:57:21 GMT
- Title: Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process
- Authors: Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang and
Guanghui Lan
- Abstract summary: The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new primal-dual approach is proposed with a novel integration of three ingredients: entropy regularized policy optimizer, dual variable regularizer, and Nesterov's accelerated gradient descent dual optimizer.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for convex optimization subject to convex constraints.
- Score: 56.55075925645864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of constrained Markov decision process (CMDP) is investigated,
where an agent aims to maximize the expected accumulated discounted reward
subject to multiple constraints on its utilities/costs. A new primal-dual
approach is proposed with a novel integration of three ingredients: entropy
regularized policy optimizer, dual variable regularizer, and Nesterov's
accelerated gradient descent dual optimizer, all of which are critical to
achieve a faster convergence. The finite-time error bound of the proposed
approach is characterized. Despite the challenge of the nonconcave objective
subject to nonconcave constraints, the proposed approach is shown to converge
to the global optimum with a complexity of $\tilde{\mathcal O}(1/\epsilon)$ in
terms of the optimality gap and the constraint violation, which improves the
complexity of the existing primal-dual approach by a factor of $\mathcal
O(1/\epsilon)$ \citep{ding2020natural,paternain2019constrained}. This is the
first demonstration that nonconcave CMDP problems can attain the complexity
lower bound of $\mathcal O(1/\epsilon)$ for convex optimization subject to
convex constraints. Our primal-dual approach and non-asymptotic analysis are
agnostic to the RL optimizer used, and thus are more flexible for practical
applications. More generally, our approach also serves as the first algorithm
that provably accelerates constrained nonconvex optimization with zero duality
gap by exploiting the geometries such as the gradient dominance condition, for
which the existing acceleration methods for constrained convex optimization are
not applicable.
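The interplay of the three ingredients can be sketched on a toy one-state CMDP (a constrained bandit), where the entropy-regularized primal step has a closed form. The instance, the step size, and the constant momentum factor below are illustrative assumptions, not the paper's actual algorithm or parameters:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Hypothetical toy instance: one-state CMDP (a constrained bandit).
# Maximize E_pi[r] subject to E_pi[g] >= b.
r = np.array([1.0, 0.2])   # reward of each action
g = np.array([0.0, 1.0])   # utility of each action
b = 0.5                    # constraint threshold

tau   = 0.05   # entropy regularization weight (primal regularizer)
delta = 1e-3   # dual variable regularizer weight
eta   = 0.1    # dual step size
beta  = 0.9    # constant momentum, standing in for Nesterov's schedule

lam = lam_prev = y = 0.0
for _ in range(2000):
    # Entropy-regularized primal update: closed form in this toy case.
    pi = softmax((r + y * g) / tau)
    # Gradient of the regularized dual at the extrapolated point y.
    grad = (g @ pi - b) + delta * y
    lam_prev, lam = lam, max(0.0, y - eta * grad)
    # Accelerated (momentum) extrapolation, projected to lambda >= 0.
    y = max(0.0, lam + beta * (lam - lam_prev))

pi = softmax((r + lam * g) / tau)
print(f"lambda={lam:.3f}  E[g]={g @ pi:.3f}  E[r]={r @ pi:.3f}")
```

At convergence the policy mixes the high-reward and high-utility actions so that the constraint is met almost exactly; the dual regularizer `delta` shifts the saturation point only negligibly.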
Related papers
- Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Low-Rank Extragradient Methods for Scalable Semidefinite Optimization [0.0]
We focus on high-dimensional settings in which the problem plausibly admits a low-rank solution.
We provide several theoretical results proving that, under these circumstances, the well-known Extragradient method converges to a solution of the constrained optimization problem.
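The extragradient method referenced here can be illustrated on the simplest saddle-point problem, min_x max_y xy, whose unique solution is (0, 0); plain gradient descent-ascent spirals outward on this problem, while the extragradient half-step corrects the rotation. This toy instance is my own illustration, not the low-rank semidefinite setting of the paper:

```python
# Extragradient on the bilinear saddle problem min_x max_y x*y.
# Plain gradient descent-ascent diverges here; extragradient evaluates
# the gradient at a look-ahead half-step, which makes it contract.
eta = 0.1
x, y = 1.0, 1.0
for _ in range(2000):
    # Half step: gradients at the current point.
    x_h, y_h = x - eta * y, y + eta * x
    # Full step: gradients evaluated at the half-step point.
    x, y = x - eta * y_h, y + eta * x_h

print(f"({x:.2e}, {y:.2e})")  # converges toward the saddle point (0, 0)
```

The per-iteration contraction factor for this bilinear case is sqrt(1 - eta^2 + eta^4), so any step size below 1 converges.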
arXiv Detail & Related papers (2024-02-14T10:48:00Z) - Online Non-convex Optimization with Long-term Non-convex Constraints [2.033434950296318]
A novel Follow-the-Perturbed-Leader type algorithm is proposed and analyzed for solving general long-term constrained optimization problems in an online manner.
The proposed algorithm is applied to tackle a long-term (extreme value) constrained river pollutant source identification problem.
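A minimal Follow-the-Perturbed-Leader sketch, on an unconstrained experts problem rather than the paper's constrained setting: each round, play the expert minimizing cumulative loss minus a fresh random perturbation. The instance and the sqrt(t) noise scaling are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 2000
# Hypothetical loss sequence: expert 0 is best in expectation
# (mean loss 0.2 versus 0.5 for the others).
losses = rng.uniform(0.0, 1.0, size=(T, n))
losses[:, 0] *= 0.4

cum = np.zeros(n)      # cumulative loss of each expert
alg_loss = 0.0
for t in range(1, T + 1):
    noise = rng.gumbel(size=n) * np.sqrt(t)   # fresh perturbation per round
    i = np.argmin(cum - noise)                # follow the perturbed leader
    alg_loss += losses[t - 1, i]
    cum += losses[t - 1]

regret = alg_loss - cum.min()
print(f"average regret = {regret / T:.3f}")
```

With Gumbel perturbations this selection rule is equivalent to exponential-weights sampling, and the average regret against the best fixed expert vanishes as T grows.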
arXiv Detail & Related papers (2023-11-04T15:08:36Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy optimizer and Vaidya's dual optimizer.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex problems in deep networks and other data-driven applications, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained
Optimization [25.73397307080647]
We present a new model covering general convex or nonconvex machine learning objectives under sparse constraints.
We propose a proximal point algorithm that solves a sequence of subproblems with gradually relaxed constraint levels.
We demonstrate the effectiveness of our approach on numerical problems at scale.
arXiv Detail & Related papers (2020-10-23T05:24:05Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
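The Moreau envelope appearing in that rate can be made concrete for f(x) = |x|, whose proximal map is soft-thresholding and whose envelope is the Huber function; the envelope's gradient is (x - prox(x))/lam. The parameter choice below is a hypothetical example, and a finite-difference check confirms the gradient formula:

```python
import numpy as np

# Moreau envelope of f(x) = |x| with parameter lam:
#   env(x) = min_u |u| + (u - x)^2 / (2*lam)    (the Huber function)
# Its gradient is (x - prox(x)) / lam, where prox is soft-thresholding.
lam = 0.5

def prox(x):        # prox_{lam*f} for f = |.|: soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def env(x):         # envelope value, evaluated at the prox minimizer
    u = prox(x)
    return np.abs(u) + (u - x) ** 2 / (2 * lam)

def env_grad(x):    # closed-form gradient of the Moreau envelope
    return (x - prox(x)) / lam

x, h = 2.0, 1e-6
fd = (env(x + h) - env(x - h)) / (2 * h)   # finite-difference check
print(env_grad(x), fd)
```

Stationarity measured by the norm of this gradient is meaningful even when f itself is nonsmooth, which is why weakly convex analyses state their rates in terms of it.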
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.