Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process
- URL: http://arxiv.org/abs/2110.10351v1
- Date: Wed, 20 Oct 2021 02:57:21 GMT
- Title: Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process
- Authors: Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang and
Guanghui Lan
- Abstract summary: The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new primal-dual approach is proposed with a novel integration of three ingredients: entropy regularized policy optimizer, dual variable regularizer, and Nesterov's accelerated gradient descent dual optimizer.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for convex optimization subject to convex constraints.
- Score: 56.55075925645864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of constrained Markov decision process (CMDP) is investigated,
where an agent aims to maximize the expected accumulated discounted reward
subject to multiple constraints on its utilities/costs. A new primal-dual
approach is proposed with a novel integration of three ingredients: entropy
regularized policy optimizer, dual variable regularizer, and Nesterov's
accelerated gradient descent dual optimizer, all of which are critical to
achieve a faster convergence. The finite-time error bound of the proposed
approach is characterized. Despite the challenge of the nonconcave objective
subject to nonconcave constraints, the proposed approach is shown to converge
to the global optimum with a complexity of $\tilde{\mathcal O}(1/\epsilon)$ in
terms of the optimality gap and the constraint violation, which improves the
complexity of the existing primal-dual approach by a factor of $\mathcal
O(1/\epsilon)$ \citep{ding2020natural,paternain2019constrained}. This is the
first demonstration that nonconcave CMDP problems can attain the complexity
lower bound of $\mathcal O(1/\epsilon)$ for convex optimization subject to
convex constraints. Our primal-dual approach and non-asymptotic analysis are
agnostic to the RL optimizer used, and thus are more flexible for practical
applications. More generally, our approach also serves as the first algorithm
that provably accelerates constrained nonconvex optimization with zero duality
gap by exploiting the geometries such as the gradient dominance condition, for
which the existing acceleration methods for constrained convex optimization are
not applicable.
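The interplay of the three ingredients can be sketched on a toy one-state CMDP (a constrained bandit), where the entropy-regularized primal step has a closed form. The instance, the step size, and the constant momentum factor below are illustrative assumptions, not the paper's actual algorithm or parameters:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Hypothetical toy instance: one-state CMDP (a constrained bandit).
# Maximize E_pi[r] subject to E_pi[g] >= b.
r = np.array([1.0, 0.2])   # reward of each action
g = np.array([0.0, 1.0])   # utility of each action
b = 0.5                    # constraint threshold

tau   = 0.05   # entropy regularization weight (primal regularizer)
delta = 1e-3   # dual variable regularizer weight
eta   = 0.1    # dual step size
beta  = 0.9    # constant momentum, standing in for Nesterov's schedule

lam = lam_prev = y = 0.0
for _ in range(2000):
    # Entropy-regularized primal update: closed form in this toy case.
    pi = softmax((r + y * g) / tau)
    # Gradient of the regularized dual at the extrapolated point y.
    grad = (g @ pi - b) + delta * y
    lam_prev, lam = lam, max(0.0, y - eta * grad)
    # Accelerated (momentum) extrapolation, projected to lambda >= 0.
    y = max(0.0, lam + beta * (lam - lam_prev))

pi = softmax((r + lam * g) / tau)
print(f"lambda={lam:.3f}  E[g]={g @ pi:.3f}  E[r]={r @ pi:.3f}")
```

At convergence the policy mixes the high-reward and high-utility actions so that the constraint is met almost exactly; the dual regularizer `delta` shifts the saturation point only negligibly.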
Related papers
- Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Low-Rank Extragradient Methods for Scalable Semidefinite Optimization [0.0]
We focus on high-dimensional settings in which the problem plausibly admits a low-rank solution.
We provide several theoretical results proving that, under these circumstances, the well-known Extragradient method converges to a solution of the constrained optimization problem.
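The extragradient method referenced here can be illustrated on the simplest saddle-point problem, min_x max_y xy, whose unique solution is (0, 0); plain gradient descent-ascent spirals outward on this problem, while the extragradient half-step corrects the rotation. This toy instance is my own illustration, not the low-rank semidefinite setting of the paper:

```python
# Extragradient on the bilinear saddle problem min_x max_y x*y.
# Plain gradient descent-ascent diverges here; extragradient evaluates
# the gradient at a look-ahead half-step, which makes it contract.
eta = 0.1
x, y = 1.0, 1.0
for _ in range(2000):
    # Half step: gradients at the current point.
    x_h, y_h = x - eta * y, y + eta * x
    # Full step: gradients evaluated at the half-step point.
    x, y = x - eta * y_h, y + eta * x_h

print(f"({x:.2e}, {y:.2e})")  # converges toward the saddle point (0, 0)
```

The per-iteration contraction factor for this bilinear case is sqrt(1 - eta^2 + eta^4), so any step size below 1 converges.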
arXiv Detail & Related papers (2024-02-14T10:48:00Z) - Online Non-convex Optimization with Long-term Non-convex Constraints [2.033434950296318]
A novel Follow-the-Perturbed-Leader type algorithm is proposed and analyzed for solving general long-term constrained optimization problems in an online manner.
The proposed algorithm is applied to tackle a long-term (extreme value) constrained river pollutant source identification problem.
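A minimal Follow-the-Perturbed-Leader sketch, on an unconstrained experts problem rather than the paper's constrained setting: each round, play the expert minimizing cumulative loss minus a fresh random perturbation. The instance and the sqrt(t) noise scaling are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 2000
# Hypothetical loss sequence: expert 0 is best in expectation
# (mean loss 0.2 versus 0.5 for the others).
losses = rng.uniform(0.0, 1.0, size=(T, n))
losses[:, 0] *= 0.4

cum = np.zeros(n)      # cumulative loss of each expert
alg_loss = 0.0
for t in range(1, T + 1):
    noise = rng.gumbel(size=n) * np.sqrt(t)   # fresh perturbation per round
    i = np.argmin(cum - noise)                # follow the perturbed leader
    alg_loss += losses[t - 1, i]
    cum += losses[t - 1]

regret = alg_loss - cum.min()
print(f"average regret = {regret / T:.3f}")
```

With Gumbel perturbations this selection rule is equivalent to exponential-weights sampling, and the average regret against the best fixed expert vanishes as T grows.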
arXiv Detail & Related papers (2023-11-04T15:08:36Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy optimizer and Vaidya's dual optimizer.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex problems in deep networks and other data-driven applications, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained
Optimization [25.73397307080647]
We present a new model covering general convex or nonconvex machine learning objectives under sparse constraints.
We propose a proximal point algorithm that solves a sequence of subproblems with gradually relaxed constraint levels.
We demonstrate the effectiveness of our approach on numerical problems at scale.
arXiv Detail & Related papers (2020-10-23T05:24:05Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
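The Moreau envelope appearing in that rate can be made concrete for f(x) = |x|, whose proximal map is soft-thresholding and whose envelope is the Huber function; the envelope's gradient is (x - prox(x))/lam. The parameter choice below is a hypothetical example, and a finite-difference check confirms the gradient formula:

```python
import numpy as np

# Moreau envelope of f(x) = |x| with parameter lam:
#   env(x) = min_u |u| + (u - x)^2 / (2*lam)    (the Huber function)
# Its gradient is (x - prox(x)) / lam, where prox is soft-thresholding.
lam = 0.5

def prox(x):        # prox_{lam*f} for f = |.|: soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def env(x):         # envelope value, evaluated at the prox minimizer
    u = prox(x)
    return np.abs(u) + (u - x) ** 2 / (2 * lam)

def env_grad(x):    # closed-form gradient of the Moreau envelope
    return (x - prox(x)) / lam

x, h = 2.0, 1e-6
fd = (env(x + h) - env(x - h)) / (2 * h)   # finite-difference check
print(env_grad(x), fd)
```

Stationarity measured by the norm of this gradient is meaningful even when f itself is nonsmooth, which is why weakly convex analyses state their rates in terms of it.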
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.