A Stochastic Composite Augmented Lagrangian Method For Reinforcement
Learning
- URL: http://arxiv.org/abs/2105.09716v1
- Date: Thu, 20 May 2021 13:08:06 GMT
- Title: A Stochastic Composite Augmented Lagrangian Method For Reinforcement
Learning
- Authors: Yongfeng Li, Mingming Zhao, Weijie Chen, and Zaiwen Wen
- Abstract summary: We consider the linear programming (LP) formulation for deep reinforcement learning.
The augmented Lagrangian method suffers from the double-sampling obstacle when solving the LP.
A deep parameterized augmented Lagrangian method is proposed.
- Score: 9.204659134755795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the linear programming (LP) formulation for deep
reinforcement learning. The number of constraints depends on the size of the
state and action spaces, which makes the problem intractable in large or
continuous environments. The general augmented Lagrangian method suffers from
the double-sampling obstacle in solving the LP. Namely, the conditional
expectations originating from the constraint functions and the quadratic
penalties in the augmented Lagrangian function make sampling and evaluation
difficult. Motivated by the multiplier updates, we overcome these obstacles in
minimizing the augmented Lagrangian function by replacing the intractable
conditional expectations with the multipliers. Therefore, a deep parameterized
augmented Lagrangian method is proposed. Furthermore, the replacement offers a
promising way to integrate the two steps of the augmented Lagrangian method
into a single constrained problem. A general theoretical analysis shows that
the solutions generated by a sequence of constrained optimization problems
converge to the optimal solution of the LP if the error is controlled properly.
A theoretical analysis of the quadratic penalty algorithm under the neural
tangent kernel setting shows that the residual can be made arbitrarily small if
the parameters of the network and the optimization algorithm are chosen
suitably. Preliminary experiments illustrate that our method is competitive
with other state-of-the-art algorithms.
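For orientation, here is a minimal sketch of the value-function LP and its augmented Lagrangian in standard (assumed) notation, not necessarily the authors' exact formulation. The LP is $\min_V (1-\gamma)\langle \mu_0, V\rangle$ subject to $g(s,a;V) := r(s,a) + \gamma\,\mathbb{E}_{s'\sim P(\cdot|s,a)}[V(s')] - V(s) \le 0$ for every state-action pair, and a standard augmented Lagrangian for these inequality constraints is
$$L_\rho(V,\lambda) = (1-\gamma)\langle \mu_0, V\rangle + \frac{1}{2\rho}\sum_{s,a}\Big(\big[\lambda(s,a) + \rho\, g(s,a;V)\big]_+^2 - \lambda(s,a)^2\Big),$$
with multiplier update $\lambda(s,a) \leftarrow [\lambda(s,a) + \rho\, g(s,a;V)]_+$. Because the penalty squares a term containing the conditional expectation $\mathbb{E}_{s'}[V(s')]$, an unbiased stochastic gradient would need two independent next-state samples per $(s,a)$; substituting the current multiplier for that conditional-expectation term, as the multiplier update suggests, is what removes the double-sampling requirement.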
Related papers
- Learning Valid Dual Bounds in Constraint Programming: Boosted Lagrangian Decomposition with Self-Supervised Learning [2.8425837800129696]
Lagrangian decomposition (LD) is a relaxation method that provides a dual bound for constrained optimization problems.
This work presents the first generic method for learning valid dual bounds in constraint programming.
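For background, a generic Lagrangian decomposition (standard textbook form, not specific to this paper) duplicates variables and dualizes the coupling: rewriting $\min_x f(x) + g(x)$ as $\min_{x,y}\{f(x) + g(y) : x = y\}$ gives, for any multiplier $\lambda$,
$$\mathrm{LD}(\lambda) = \min_x \big(f(x) + \lambda^{\top} x\big) + \min_y \big(g(y) - \lambda^{\top} y\big) \;\le\; \min_x f(x) + g(x),$$
so every choice of $\lambda$ yields a valid dual bound, and the learning task in the cited work amounts to choosing $\lambda$ so that the bound is as tight as possible.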
arXiv Detail & Related papers (2024-08-22T19:26:29Z) - Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
It enables an end-to-end training scheme in which the dual objective serves as the loss function and solution estimates are driven toward primal feasibility, emulating a Dual Ascent method.
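As a rough illustration of the "dual objective as loss" idea, here is a toy sketch on an equality-constrained QP using plain Lagrangian duality rather than the paper's augmented Lagrangian; the network, feature construction, and dimensions are hypothetical.

import torch

torch.manual_seed(0)
n, m = 6, 3                                    # variables, equality constraints

def sample_qp():
    # Random strictly convex QP:  min_x 0.5*x'Qx + c'x  s.t.  Ax = b
    M = torch.randn(n, n)
    Q = M @ M.T + n * torch.eye(n)             # symmetric positive definite
    return Q, torch.randn(n), torch.randn(m, n), torch.randn(m)

# Hypothetical predictor: flattened instance data -> dual estimate lambda in R^m.
net = torch.nn.Sequential(torch.nn.Linear(n + m * n + m, 64),
                          torch.nn.ReLU(),
                          torch.nn.Linear(64, m))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    Q, c, A, b = sample_qp()
    lam = net(torch.cat([c, A.reshape(-1), b]))           # predicted multipliers
    # Inner Lagrangian minimization is closed-form for a QP: x*(lam) = -Q^{-1}(c + A'lam)
    x = torch.linalg.solve(Q, -(c + A.T @ lam))
    dual = 0.5 * x @ Q @ x + c @ x + lam @ (A @ x - b)    # dual function g(lam)
    loss = -dual                                          # maximize the dual objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        print(f"step {step:4d}  dual {dual.item():8.3f}  ||Ax-b|| {(A @ x - b).norm().item():.3f}")

As the predicted duals approach the per-instance dual optimum, the associated primal estimate $x^*(\lambda)$ approaches feasibility, which is the Dual Ascent behaviour the cited paper emulates end to end.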
arXiv Detail & Related papers (2024-03-06T04:43:22Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Learning Lagrangian Multipliers for the Travelling Salesman Problem [12.968608204035611]
We propose an innovative unsupervised learning approach that harnesses the capabilities of graph neural networks to exploit the problem structure.
We apply this technique to the well-known Held-Karp Lagrangian relaxation for the travelling salesman problem.
In contrast to much of the existing literature, which primarily focuses on finding feasible solutions, our approach operates on the dual side, demonstrating that learning can also accelerate the proof of optimality.
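For background, the Held-Karp relaxation referenced above (standard form, independent of the learning method) penalizes the degree-2 requirements of a tour over minimum-weight 1-trees: for edge costs $c_e$ and node multipliers $\lambda_i$,
$$w(\lambda) = \min_{T \in \text{1-trees}} \Big(\sum_{e \in T} c_e + \sum_i \lambda_i\,(d_T(i) - 2)\Big) \;\le\; \mathrm{OPT}_{\mathrm{TSP}},$$
since every tour is a 1-tree with all degrees equal to 2. The unsupervised graph-neural-network approach above learns to predict multipliers $\lambda$ for which the lower bound $w(\lambda)$ is close to its maximum, in place of iterative subgradient ascent on $\lambda$.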
arXiv Detail & Related papers (2023-12-22T17:09:34Z) - Predicting Accurate Lagrangian Multipliers for Mixed Integer Linear Programs [2.6899003881035406]
We introduce a deep learning approach that bypasses the descent, effectively amortizing the local, per instance, optimization.
We show that our approach closes up to 85% of the gap between the continuous relaxation and the best Lagrangian bound, and provides a high quality warm-start for descent based Lagrangian methods.
arXiv Detail & Related papers (2023-10-23T07:53:47Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonconvex optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new primal-dual approach is proposed with a novel integration of three ingredients: a regularized policy optimizer, a dual regularizer, and Nesterov's accelerated gradient descent on the dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for convex optimization subject to convex constraints.
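For context, the standard (assumed) CMDP and its Lagrangian take the form
$$\max_\pi V_r(\pi)\ \ \text{s.t.}\ \ V_{g_i}(\pi) \ge b_i,\ i=1,\dots,m, \qquad L(\pi,\lambda) = V_r(\pi) + \sum_i \lambda_i\big(V_{g_i}(\pi) - b_i\big),\quad \lambda \ge 0,$$
where $V_r$ and $V_{g_i}$ denote the expected accumulated discounted reward and utilities; the three ingredients above regularize the policy and the dual variable $\lambda$ and apply accelerated gradient updates on the dual side.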
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
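For reference, a generic conditional gradient (Frank-Wolfe) step for $\min_{x \in \mathcal{X}} f(x)$ over a compact convex set $\mathcal{X}$ is
$$s_k \in \arg\min_{s \in \mathcal{X}} \langle \nabla f(x_k), s \rangle, \qquad x_{k+1} = x_k + \gamma_k (s_k - x_k),\ \ \gamma_k \in (0,1],$$
which requires only a linear minimization oracle over $\mathcal{X}$; in the framework above, the gradient is replaced by a smoothed, variance-reduced estimate built from the sampled subset of constraints.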
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - Inexact and Stochastic Generalized Conditional Gradient with Augmented
Lagrangian and Proximal Step [2.0196229393131726]
We analyze inexact and stochastic versions of the CGALP algorithm developed in the authors' previous paper.
This allows one to compute some gradients, proximal terms, and/or linear minimization oracles in an inexact fashion.
We show convergence of the Lagrangian to an optimum and feasibility of the affine constraint.
arXiv Detail & Related papers (2020-05-11T14:52:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.