Augmented Lagrangian Methods for Time-varying Constrained Online Convex Optimization
- URL: http://arxiv.org/abs/2205.09571v1
- Date: Thu, 19 May 2022 14:03:25 GMT
- Title: Augmented Lagrangian Methods for Time-varying Constrained Online Convex Optimization
- Authors: Haoyang Liu and Xiantao Xiao and Liwei Zhang
- Abstract summary: We consider online convex optimization (OCO) with time-varying loss and constraint functions.
We first develop a class of model-based augmented Lagrangian methods (MALM) for time-varying functional constrained OCO.
Numerical results for several examples of constrained OCO are presented to demonstrate the efficiency of the proposed algorithms.
- Score: 1.662966122370634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we consider online convex optimization (OCO) with time-varying
loss and constraint functions. Specifically, the decision maker chooses
sequential decisions based only on past information, while the loss and
constraint functions are revealed over time. We first develop a class of
model-based augmented Lagrangian methods (MALM) for time-varying functional
constrained OCO (without feedback delay). Under standard assumptions, we
establish sublinear regret and sublinear constraint violation of MALM.
Furthermore, we extend MALM to deal with time-varying functional constrained
OCO with delayed feedback, in which the feedback information of loss and
constraint functions is revealed to the decision maker with delays. Without
additional assumptions, we also establish sublinear regret and sublinear
constraint violation for the delayed version of MALM. Finally, numerical
results for several examples of constrained OCO including online network
resource allocation, online logistic regression, and online quadratically
constrained quadratic programming are presented to demonstrate the efficiency of
the proposed algorithms.
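As a rough illustration of the algorithmic template described in the abstract, the following Python sketch runs a generic online augmented-Lagrangian-style primal-dual update on a toy time-varying problem. It is not the exact MALM update from the paper: the step size eta, the penalty parameter rho, the box feasible set, and the synthetic quadratic losses with a single linear constraint are all illustrative assumptions.

# Hedged sketch: a generic online augmented-Lagrangian-style primal-dual update
# for time-varying constrained OCO. This is NOT the paper's MALM algorithm;
# the problem data, step size, penalty parameter, and feasible set are assumed
# here purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 200                      # dimension and horizon
eta, rho = 0.1, 1.0                # primal step size and penalty parameter (assumed constants)
x = np.zeros(d)                    # current decision
lam = 0.0                          # multiplier for the single inequality constraint
losses, violations = [], []

for t in range(1, T + 1):
    # Time-varying data revealed only after x_t is chosen (full-information feedback).
    a_t = rng.normal(size=d)                   # loss:        f_t(x) = 0.5 * ||x - a_t||^2
    b_t = rng.normal(size=d); c_t = 1.0        # constraint:  g_t(x) = b_t @ x - c_t <= 0

    # Record the loss and constraint violation incurred at the current decision.
    losses.append(0.5 * np.sum((x - a_t) ** 2))
    violations.append(max(0.0, b_t @ x - c_t))

    # Primal step: gradient of the augmented Lagrangian
    #   f_t(x) + (1 / (2*rho)) * ( max(0, lam + rho*g_t(x))^2 - lam^2 )
    mult = max(0.0, lam + rho * (b_t @ x - c_t))
    grad = (x - a_t) + mult * b_t
    x = np.clip(x - eta * grad, -2.0, 2.0)     # projection onto a box (assumed feasible set)

    # Dual step: multiplier update at the new decision.
    lam = max(0.0, lam + rho * (b_t @ x - c_t))

print("cumulative loss:", sum(losses))
print("cumulative constraint violation:", sum(violations))

Under delayed feedback, one natural adaptation of such a loop is to buffer decisions and apply the same primal-dual update only once the pair (f_t, g_t) is eventually revealed; the paper establishes sublinear regret and sublinear constraint violation for its delayed version of MALM without additional assumptions.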
Related papers
- Length-Controlled Margin-Based Preference Optimization without Reference Model [11.878496378814045]
We propose Length-Controlled Margin-Based Preference Optimization (LMPO) for preference-based reinforcement learning.
A key innovation of LMPO lies in its Length-Controlled Margin-Based loss function, integrated within the Bradley-Terry framework.
Our experimental results demonstrate that LMPO effectively controls response length, reduces probability degradation, and outperforms existing approaches.
arXiv Detail & Related papers (2025-02-20T15:30:27Z) - Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z) - Regret Analysis of Policy Optimization over Submanifolds for Linearly
Constrained Online LQG [12.201535821920624]
We study online linear quadratic Gaussian problems with a given linear constraint imposed on the controller.
We propose online optimistic Newton on manifold (OONM) which provides an online controller based on the prediction on the first and second order information of the function sequence.
arXiv Detail & Related papers (2024-03-13T14:06:18Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Primal-Dual Contextual Bayesian Optimization for Control System Online
Optimization with Time-Average Constraints [21.38692458445459]
This paper studies the problem of online performance optimization of constrained closed-loop control systems.
A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal solution.
arXiv Detail & Related papers (2023-04-12T18:37:52Z) - Learning-Assisted Algorithm Unrolling for Online Optimization with
Budget Constraints [27.84415856657607]
We propose a new machine learning (ML) assisted unrolling approach, called LAAU (Learning-Assisted Algorithm Unrolling).
For efficient training via backpropagation, we derive gradients of the decision pipeline over time.
We also provide the average cost bounds for two cases when training data is available offline and collected online, respectively.
arXiv Detail & Related papers (2022-12-03T20:56:29Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Online Nonsubmodular Minimization with Delayed Costs: From Full
Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z) - Delay-Tolerant Constrained OCO with Application to Network Resource
Allocation [44.67787270821051]
We consider online convex optimization (OCO) with multi-slot feedback delay.
An agent makes a sequence of online decisions to minimize the accumulation of time-varying convex loss functions.
We propose Delay-Tolerant Constrained-OCO, which uses a novel constraint penalty with double regularization to tackle the asynchrony between information feedback and decision updates.
arXiv Detail & Related papers (2021-05-09T19:32:33Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)