Robust Learning for Smoothed Online Convex Optimization with Feedback
Delay
- URL: http://arxiv.org/abs/2310.20098v1
- Date: Tue, 31 Oct 2023 00:22:55 GMT
- Title: Robust Learning for Smoothed Online Convex Optimization with Feedback
Delay
- Authors: Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren
- Abstract summary: We propose a novel machine learning (ML) augmented online algorithm, Robustness-Constrained Learning (RCL)
RCL combines untrusted ML predictions with a trusted expert online algorithm via constrained projection to robustify the ML prediction.
RCL is the first ML-augmented algorithm with a provable robustness guarantee in the case of multi-step switching cost and feedback delay.
- Score: 43.85262428603507
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study a challenging form of Smoothed Online Convex Optimization, a.k.a.
SOCO, including multi-step nonlinear switching costs and feedback delay. We
propose a novel machine learning (ML) augmented online algorithm,
Robustness-Constrained Learning (RCL), which combines untrusted ML predictions
with a trusted expert online algorithm via constrained projection to robustify
the ML prediction. Specifically, we prove that RCL is able to guarantee
$(1+\lambda)$-competitiveness against any given expert for any $\lambda>0$,
while also explicitly training the ML model in a robustification-aware manner
to improve the average-case performance. Importantly, RCL is the first
ML-augmented algorithm with a provable robustness guarantee in the case of
multi-step switching cost and feedback delay. We demonstrate the improvement of
RCL in both robustness and average performance using battery management for
electrifying transportation as a case study.
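The abstract gives no pseudocode, so the following is only a minimal sketch of the constrained-projection idea it describes, under simplifying assumptions: quadratic hitting and single-step switching costs, and a bisection over a line segment in place of the paper's actual projection and its handling of multi-step switching costs and feedback delay. The names hit_cost, switch_cost, and robustified_action are illustrative, not from the paper. At each round, the untrusted ML action is pulled toward the trusted expert's action only as far as needed to keep the algorithm's cumulative cost within $(1+\lambda)$ times the expert's.

```python
import numpy as np


def hit_cost(x, v):
    """Assumed quadratic hitting cost f_t(x) = ||x - v_t||^2 (illustrative only)."""
    return float(np.sum((x - v) ** 2))


def switch_cost(x, x_prev):
    """Assumed single-step switching cost ||x_t - x_{t-1}||^2
    (the paper handles multi-step, nonlinear switching costs)."""
    return float(np.sum((x - x_prev) ** 2))


def robustified_action(x_ml, x_exp, x_prev, v, cum_alg, cum_exp, lam):
    """Pull the untrusted ML action x_ml toward the trusted expert action x_exp
    just far enough that the algorithm's cumulative cost stays within
    (1 + lam) times the expert's cumulative cost through this round.

    cum_alg: algorithm's cost accumulated before this round.
    cum_exp: expert's cost accumulated through this round (including round t).
    A bisection over the interpolation weight stands in for RCL's constrained
    projection; any budget reserved for future delayed/multi-step costs is omitted.
    """
    budget = (1.0 + lam) * cum_exp

    def total(theta):
        x = theta * x_ml + (1.0 - theta) * x_exp
        return cum_alg + hit_cost(x, v) + switch_cost(x, x_prev), x

    cost, x = total(1.0)          # keep the raw ML action if it already fits
    if cost <= budget:
        return x
    lo, hi = 0.0, 1.0             # otherwise find the largest feasible weight;
    for _ in range(50):           # theta = 0 falls back to the expert's action
        mid = 0.5 * (lo + hi)
        cost, x = total(mid)
        if cost <= budget:
            lo = mid
        else:
            hi = mid
    return total(lo)[1]


# Toy usage with made-up 2-D actions and costs.
x_t = robustified_action(
    x_ml=np.array([1.0, 0.0]), x_exp=np.array([0.2, 0.1]),
    x_prev=np.array([0.0, 0.0]), v=np.array([0.3, 0.2]),
    cum_alg=0.0, cum_exp=0.5, lam=0.1,
)
```

In the paper, the projection is onto a constraint set tailored to the multi-step nonlinear switching costs and delayed feedback, and the ML model is then trained with this robustification step in the loop (the robustification-aware training mentioned above); the sketch only conveys the interpolate-until-feasible intuition.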
Related papers
- Learning-augmented Online Minimization of Age of Information and
Transmission Costs [24.873041306990288]
We develop an online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee.
While online algorithms are robust, they are usually overly conservative and may have a poor average performance in typical scenarios.
We show that our learning-augmented algorithm achieves both consistency and robustness.
arXiv Detail & Related papers (2024-03-05T01:06:25Z) - Optimal Linear Signal: An Unsupervised Machine Learning Framework to
Optimize PnL with Linear Signals [0.0]
This study presents an unsupervised machine learning approach for optimizing Profit and Loss (PnL) in quantitative finance.
Our algorithm, akin to an unsupervised variant of linear regression, maximizes the Sharpe Ratio of PnL generated from signals constructed linearly from external variables.
arXiv Detail & Related papers (2023-11-22T21:10:59Z) - Regularization and Variance-Weighted Regression Achieves Minimax
Optimality in Linear MDPs: Theory and Practice [79.48432795639403]
Mirror descent value iteration (MDVI) is an abstraction of Kullback-Leibler (KL) divergence and entropy-regularized reinforcement learning (RL).
We study MDVI with linear function approximation through its sample complexity required to identify an $\varepsilon$-optimal policy.
We present Variance-Weighted Least-Squares MDVI, the first theoretical algorithm that achieves nearly minimax optimal sample complexity for infinite-horizon linear MDPs.
arXiv Detail & Related papers (2023-05-22T16:13:05Z) - Robustified Learning for Online Optimization with Memory Costs [28.737193318136725]
We propose a novel expert-robustified learning (ERL) approach, achieving both good average performance and robustness.
For any $\lambda\geq 1$, ERL can achieve $\lambda$-competitiveness against the expert algorithm and $\lambda\cdot C$-competitiveness against the optimal offline algorithm.
arXiv Detail & Related papers (2023-05-01T06:12:01Z) - Expert-Calibrated Learning for Online Optimization with Switching Costs [28.737193318136725]
We study online convex optimization with switching costs.
By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms have been emerging as the state of the art.
We propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based algorithm by explicitly taking into account the expert calibrator.
arXiv Detail & Related papers (2022-04-18T21:54:33Z) - LyaNet: A Lyapunov Framework for Training Neural ODEs [59.73633363494646]
We propose a method for training ordinary differential equations by using a control-theoretic Lyapunov condition for stability.
Our approach, called LyaNet, is based on a novel Lyapunov loss formulation that encourages the inference dynamics to converge quickly to the correct prediction.
arXiv Detail & Related papers (2022-02-05T10:13:14Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for improving performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Efficient Optimistic Exploration in Linear-Quadratic Regulators via
Lagrangian Relaxation [107.06364966905821]
We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting.
Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ.
We show that an $\epsilon$-optimistic controller can be computed efficiently by solving at most $O\big(\log(1/\epsilon)\big)$ Riccati equations.
arXiv Detail & Related papers (2020-07-13T16:30:47Z)