Algorithm for Constrained Markov Decision Process with Linear
Convergence
- URL: http://arxiv.org/abs/2206.01666v1
- Date: Fri, 3 Jun 2022 16:26:38 GMT
- Title: Algorithm for Constrained Markov Decision Process with Linear
Convergence
- Authors: Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara
Rudenko, Alexander Gasnikov, Martin Takáč
- Abstract summary: An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy optimizer and Vaidya's dual optimizer.
The proposed approach is shown to converge (with linear rate) to the global optimum.
- Score: 55.41644538483948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of constrained Markov decision process is considered. An agent
aims to maximize the expected accumulated discounted reward subject to multiple
constraints on its costs (the number of constraints is relatively small). A new
dual approach is proposed with the integration of two ingredients: entropy
regularized policy optimizer and Vaidya's dual optimizer, both of which are
critical to achieve faster convergence. The finite-time error bound of the
proposed approach is provided. Despite the challenge of the nonconcave
objective subject to nonconcave constraints, the proposed approach is shown to
converge (with linear rate) to the global optimum. The complexity expressed in
terms of the optimality gap and the constraint violation significantly improves
upon the existing primal-dual approaches.
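To make the structure concrete, here is a minimal sketch of the dual oracle such an approach builds on: the inner policy optimization is entropy-regularized (which makes the dual smooth), and the dual variable lives in $\mathbb{R}^m$ with $m$ the small number of cost constraints, exactly the regime where a cutting-plane method such as Vaidya's is efficient. The helper `solve_soft_mdp` and the interface below are our illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def dual_oracle(lam, solve_soft_mdp, b, tau):
    """First-order oracle for the Lagrangian dual of a CMDP,
        d(lam) = max_pi [ V_r(pi) + tau * H(pi) + lam^T (b - V_c(pi)) ],
    where H is the policy entropy that makes d smooth in lam.
    `solve_soft_mdp(lam)` is a hypothetical inner solver (e.g., soft value
    iteration on the combined reward r - lam^T c) returning the maximizing
    policy's reward value v_r, its cost values v_c, and its entropy h."""
    v_r, v_c, h = solve_soft_mdp(lam)
    slack = b - v_c                  # by Danskin's theorem, a gradient of d
    return v_r + tau * h + lam @ slack, slack

# d is convex and smooth over lam >= 0 in only m dimensions (m = number of
# constraints), so a cutting-plane scheme such as Vaidya's method needs a
# number of oracle calls growing only logarithmically in 1/eps -- the
# source of the linear convergence rate claimed above.
```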
Related papers
- The inexact power augmented Lagrangian method for constrained nonconvex optimization [44.516958213972885]
This work introduces an unconventional augmented Lagrangian method, where the augmenting term is a Euclidean norm raised to a power.
We show that using lower powers for the augmenting term leads to a faster rate, albeit with a slower decrease in the residual; a sketch of the resulting update follows this entry.
arXiv Detail & Related papers (2024-10-26T11:31:56Z)
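A minimal sketch of one outer iteration of such a power augmented Lagrangian method for equality constraints c(x) = 0: the penalty is (rho/nu) * ||c(x)||^nu with nu in (1, 2], the primal subproblem is solved inexactly by gradient descent, and the multiplier step follows the penalty gradient. All names and update details are our illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def power_alm_step(f_grad, c, c_jac, x, y, rho=1.0, nu=1.5,
                   inner_steps=100, lr=1e-2):
    """One outer iteration with augmented Lagrangian
       L(x, y) = f(x) + <y, c(x)> + (rho / nu) * ||c(x)||^nu;
    nu = 2 recovers the classical quadratic augmenting term."""
    for _ in range(inner_steps):           # inexact primal minimization
        cx = c(x)
        norm = np.linalg.norm(cx)
        # gradient of the power penalty: rho * ||c||^(nu-2) * J_c(x)^T c(x)
        pen = rho * norm ** (nu - 2) * c_jac(x).T @ cx if norm > 0 else 0.0
        x = x - lr * (f_grad(x) + c_jac(x).T @ y + pen)
    cx = c(x)
    norm = np.linalg.norm(cx)
    if norm > 0:                            # multiplier (dual) update
        y = y + rho * norm ** (nu - 2) * cx
    return x, y
```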
- A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints [4.6796315389639815]
Decentralized optimization problems cannot be solved by existing algorithms when the constraints themselves are distributed.
We introduce a novel algorithm that tracks the gradient of the objective function and the Jacobian of the constraint mapping simultaneously; a schematic tracking round is sketched below.
We present numerical results on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-09-08T06:57:35Z)
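A schematic of one such round in the spirit of standard gradient tracking, applied twice: once to the objective gradient and once to the constraint Jacobian. The mixing matrix W, the per-agent oracles, and the plain consensus update are our illustrative assumptions:

```python
import numpy as np

def double_tracking_round(W, xs, gs, Js, grad_f, jac_c, step=1e-2):
    """One consensus round over n agents. gs[i] tracks the network-average
    objective gradient and Js[i] the network-average constraint Jacobian
    (the 'double' tracking); both use the usual mix-then-correct pattern."""
    n = len(xs)
    new_xs = [sum(W[i][j] * xs[j] for j in range(n)) - step * gs[i]
              for i in range(n)]
    new_gs = [sum(W[i][j] * gs[j] for j in range(n))
              + grad_f(i, new_xs[i]) - grad_f(i, xs[i]) for i in range(n)]
    new_Js = [sum(W[i][j] * Js[j] for j in range(n))
              + jac_c(i, new_xs[i]) - jac_c(i, xs[i]) for i in range(n)]
    return new_xs, new_gs, new_Js
```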
- Two-Step QAOA: Enhancing Quantum Optimization by Decomposing One-Hot Constraints in QUBO Formulations [0.0]
We propose a simple approach, the Two-Step QAOA, which aims to improve the effectiveness of QAOA.
By identifying and separating the problem into two stages, we transform soft constraints into hard constraints; the underlying one-hot penalty encoding is sketched below.
arXiv Detail & Related papers (2024-08-09T23:38:28Z)
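For context, the standard quadratic penalty that encodes a one-hot ("exactly one of n") constraint in a QUBO is the kind of soft constraint the two-step decomposition starts from. The penalty weight A and helper names are ours; this is the textbook encoding, not the paper's method.

```python
import itertools

def one_hot_penalty_qubo(n, A=10.0):
    """QUBO coefficients for the soft one-hot penalty A * (sum_i x_i - 1)^2
    over binary x_1..x_n. Expanding (with x_i^2 = x_i) gives -A on the
    diagonal and +2A on off-diagonal pairs, plus a constant offset +A."""
    Q = {}
    for i in range(n):
        Q[(i, i)] = -A          # from A * (x_i - 2*x_i) terms
    for i, j in itertools.combinations(range(n), 2):
        Q[(i, j)] = 2 * A       # cross terms 2*A*x_i*x_j
    return Q                    # constant offset +A omitted

# brute-force check: the minima are exactly the one-hot bitstrings
n = 3
Q = one_hot_penalty_qubo(n)
def energy(bits):
    return sum(Q.get((i, j), 0) * bits[i] * bits[j]
               for i in range(n) for j in range(n))
assert min(itertools.product([0, 1], repeat=n), key=energy) in {
    (1, 0, 0), (0, 1, 0), (0, 0, 1)}
```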
- Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP poses several challenges, including handling the large state space; the visitation-measure formulation is written out below.
arXiv Detail & Related papers (2024-02-16T16:35:18Z)
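In visitation-measure form the problem reads (standard formulation; the notation is ours):

```latex
\min_{\pi} \; f\big(\mu^{\pi}\big)
\quad \text{s.t.} \quad
g_i\big(\mu^{\pi}\big) \le 0, \qquad i = 1, \dots, m,
```

where $\mu^{\pi}$ is the discounted state-action visitation measure of policy $\pi$ and $f, g_1, \dots, g_m$ are convex; ordinary reward maximization is the linear special case $f(\mu) = -\langle \mu, r \rangle$.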
- Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast the inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses the apparent intractability; the dominance constraint involved is written out below.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
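For reference, the second-order stochastic dominance constraint such problems impose on a decision-dependent quantity $X$ relative to a benchmark $Y$ (standard definition, our notation):

```latex
X \succeq_{(2)} Y
\;\Longleftrightarrow\;
\mathbb{E}\big[(\eta - X)_{+}\big] \le \mathbb{E}\big[(\eta - Y)_{+}\big]
\quad \text{for all } \eta \in \mathbb{R},
```

where $(u)_{+} = \max(u, 0)$; the continuum of thresholds $\eta$ is what makes the inner optimization in the Lagrangian appear intractable.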
- Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new primal-dual approach is proposed with novel integration of three ingredients: an entropy regularized policy optimizer, a dual regularizer, and Nesterov's accelerated gradient descent on the dual (an accelerated dual-ascent step is sketched below).
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for convex optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z)
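A minimal sketch of Nesterov-accelerated projected gradient ascent on a smooth concave dual, the kind of dual step the summary refers to. The momentum schedule, step-size, and oracle interface are illustrative choices, not the paper's exact algorithm:

```python
import numpy as np

def accelerated_dual_ascent(dual_grad, lam0, steps=100, lr=0.1):
    """Maximize a smooth concave dual d(lam) over lam >= 0.
    `dual_grad(lam)` returns the gradient of d (in the CMDP setting, the
    constraint residuals of the entropy-regularized inner problem)."""
    lam, lam_prev = lam0.copy(), lam0.copy()
    for k in range(1, steps + 1):
        mom = (k - 1) / (k + 2)                       # standard momentum weight
        z = lam + mom * (lam - lam_prev)              # extrapolation point
        lam_prev = lam
        lam = np.maximum(z + lr * dual_grad(z), 0.0)  # ascent + projection
    return lam
```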
- A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization [7.483040617090451]
We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization.
Our theoretical analysis shows that its Lagrangian dual function is smooth and that the Lagrangian duality gap can be decomposed into the primal optimality gap and the constraint violation; one such decomposition identity is written out below.
arXiv Detail & Related papers (2021-10-17T21:26:40Z)
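With cost constraints $V_c(\pi) \le b$, Lagrangian $L(\pi, \lambda) = V_r(\pi) + \lambda^{\top}(b - V_c(\pi))$, dual $d(\lambda) = \max_{\pi} L(\pi, \lambda)$ attained at $\pi_{\lambda}$, and an optimal feasible policy $\pi^{\ast}$, one exact identity behind such a decomposition is (entropy regularization omitted and sign conventions ours; the paper's precise statement differs in details):

```latex
\underbrace{d(\lambda) - V_r(\pi^{\ast})}_{\text{duality gap}}
\;+\;
\underbrace{V_r(\pi^{\ast}) - V_r(\pi_{\lambda})}_{\text{primal optimality gap}}
\;=\;
\lambda^{\top} \underbrace{\big(b - V_c(\pi_{\lambda})\big)}_{\text{constraint slack / violation}}.
```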
- Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use a constant step-size; a plain SGDA loop is sketched below.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
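A plain same-iterate SGDA loop of the kind analyzed, for a smooth game $\min_x \max_y f(x, y)$; the stochastic gradient oracles and the constant step-size are the only ingredients (names are illustrative):

```python
def sgda(grad_x, grad_y, x, y, steps=1000, lr=1e-2):
    """Stochastic gradient descent-ascent with constant step-size.
    `grad_x`/`grad_y` return stochastic gradients of f at (x, y); with a
    constant step-size the iterates reach a neighborhood of the solution."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)  # sample both at the same iterate
        x, y = x - lr * gx, y + lr * gy      # descent for x, ascent for y
    return x, y
```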
- Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization of the extragradient method which provably converges to a stationary point.
The algorithm applies not only to Euclidean spaces, but also to general $\ell_p$-normed finite-dimensional real vector spaces (the classical Euclidean step it generalizes is sketched below).
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
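For orientation, the classical Euclidean extragradient step that this method generalizes to $\ell_p$-normed spaces; $F$ is the gradient operator of the game, e.g. $F(z) = (\nabla_x f, -\nabla_y f)$ (a minimal sketch, not the paper's new algorithm):

```python
def extragradient(F, z, steps=1000, lr=1e-2):
    """Classical extragradient: probe with a half-step, then update using
    the operator evaluated at the probe point."""
    for _ in range(steps):
        z_half = z - lr * F(z)     # exploratory half-step
        z = z - lr * F(z_half)     # corrected update from the midpoint
    return z
```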