Expert-Calibrated Learning for Online Optimization with Switching Costs
- URL: http://arxiv.org/abs/2204.08572v1
- Date: Mon, 18 Apr 2022 21:54:33 GMT
- Title: Expert-Calibrated Learning for Online Optimization with Switching Costs
- Authors: Pengfei Li and Jianyi Yang and Shaolei Ren
- Abstract summary: We study online convex optimization with switching costs.
By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms have emerged as the state of the art.
We propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based optimizer by explicitly taking into account the downstream expert calibrator.
- Score: 28.737193318136725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study online convex optimization with switching costs, a practically
important but also extremely challenging problem due to the lack of complete
offline information. By tapping into the power of machine learning (ML) based
optimizers, ML-augmented online algorithms (also referred to as expert
calibration in this paper) have been emerging as state of the art, with
provable worst-case performance guarantees. Nonetheless, by using the standard
practice of training an ML model as a standalone optimizer and plugging it into
an ML-augmented algorithm, the average cost performance can be even worse than
purely using ML predictions. In order to address the "how to learn" challenge,
we propose EC-L2O (expert-calibrated learning to optimize), which trains an
ML-based optimizer by explicitly taking into account the downstream expert
calibrator. To accomplish this, we propose a new differentiable expert
calibrator that generalizes regularized online balanced descent and offers a
provably better competitive ratio than pure ML predictions when the prediction
error is large. For training, our loss function is a weighted sum of two
different losses -- one minimizing the average ML prediction error for better
robustness, and the other one minimizing the post-calibration average cost. We
also provide theoretical analysis for EC-L2O, highlighting that expert
calibration can even be beneficial for the average cost performance and that
the high-percentile tail ratio of the cost achieved by EC-L2O to that of the
offline optimal oracle (i.e., tail cost ratio) can be bounded. Finally, we test
EC-L2O by running simulations for sustainable datacenter demand response. Our
results demonstrate that EC-L2O can empirically achieve a lower average cost as
well as a lower competitive ratio than the existing baseline algorithms.
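To make the training recipe concrete, here is a minimal, hypothetical PyTorch-style sketch of one EC-L2O training step; `ml_model`, `calibrator`, `cost_fn`, and the weight `alpha` are illustrative placeholders, and the paper's differentiable expert calibrator itself is not reproduced here.

```python
# Hypothetical sketch of one EC-L2O training step (all names illustrative).
# The ML optimizer is trained end-to-end THROUGH the differentiable expert
# calibrator, so gradients reflect the post-calibration cost.
import torch

def train_step(ml_model, calibrator, cost_fn, contexts, targets, optimizer, alpha=0.5):
    preds = ml_model(contexts)                       # raw ML predictions
    actions = calibrator(preds)                      # differentiable expert calibration
    pred_loss = torch.mean((preds - targets) ** 2)   # average prediction error (robustness term)
    cal_cost = torch.mean(cost_fn(actions))          # post-calibration average cost
    loss = alpha * pred_loss + (1.0 - alpha) * cal_cost  # weighted sum of the two losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The point of the sketch is the gradient path: because the calibrator is differentiable, the ML model learns predictions that perform well after calibration rather than in isolation.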
Related papers
- Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more efficient metric for performance estimation.
We extend the power-law analytical function to predict domain-specific pre-training loss based on FLOPs across data sources.
We employ a two-layer neural network to model the non-linear relationship between multiple domain-specific losses and downstream performance.
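Schematically, and only as an assumption about the shapes involved, the two-stage mapping might look like the following (function names and forms are illustrative, not the paper's):

```python
# Illustrative two-stage pipeline: FLOPs -> per-domain loss -> downstream performance.
import numpy as np

def domain_loss(flops, a, b, c):
    # Power-law form L(C) = a * C^(-b) + c, fitted per data source
    return a * np.power(flops, -b) + c

def downstream_performance(domain_losses, W1, b1, W2, b2):
    # Two-layer network modeling the non-linear map from the vector of
    # domain-specific losses to a downstream performance estimate
    hidden = np.tanh(domain_losses @ W1 + b1)
    return hidden @ W2 + b2
```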
arXiv Detail & Related papers (2024-10-11T04:57:48Z) - Pricing American Options using Machine Learning Algorithms [0.0]
This study investigates the application of machine learning algorithms to pricing American options using Monte Carlo simulations.
Traditional models, such as the Black-Scholes-Merton framework, often fail to adequately address the complexities of American options.
Machine learning is applied by leveraging Monte Carlo methods in conjunction with the Least Squares Method.
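For reference, a compact Least-Squares Monte Carlo (Longstaff-Schwartz) pricer for an American put, the classical baseline this line of work augments with ML regressors; all parameters are illustrative:

```python
# Least-Squares Monte Carlo (Longstaff-Schwartz) for an American put.
import numpy as np

def lsm_american_put(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                     steps=50, paths=100_000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / steps
    # Simulate geometric Brownian motion paths (times dt, 2*dt, ..., T)
    z = rng.standard_normal((paths, steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))
    value = np.maximum(K - S[:, -1], 0.0)      # exercise value at maturity
    for t in range(steps - 2, -1, -1):
        value *= np.exp(-r * dt)               # discount one step back to time t
        itm = K - S[:, t] > 0                  # regress only on in-the-money paths
        if itm.any():
            coeffs = np.polyfit(S[itm, t], value[itm], 2)  # basis {1, S, S^2}
            continuation = np.polyval(coeffs, S[itm, t])
            exercise = K - S[itm, t]
            value[itm] = np.where(exercise > continuation, exercise, value[itm])
    return np.exp(-r * dt) * value.mean()      # discount from t=dt back to 0
```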
arXiv Detail & Related papers (2024-09-05T02:52:11Z) - Decision Focused Causal Learning for Direct Counterfactual Marketing Optimization [21.304040539486184]
Decision Focused Learning (DFL) integrates machine learning (ML) and optimization into an end-to-end framework.
However, deploying DFL in marketing is non-trivial due to multiple technological challenges.
We propose a decision focused causal learning framework (DFCL) for direct counterfactual marketing.
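A toy contrast between the usual two-stage pipeline and a decision-focused loss; the marketing setup (uplift scores, a budgeted soft top-k selection) is an illustrative assumption, not DFCL itself:

```python
# Two-stage learning vs. a decision-focused surrogate loss (toy, illustrative).
import torch

def two_stage_loss(pred_uplift, true_uplift):
    # Stage one fits predictions; the downstream optimizer never informs training
    return torch.mean((pred_uplift - true_uplift) ** 2)

def decision_focused_loss(pred_uplift, true_uplift, budget_frac=0.3, temp=0.1):
    # Score the decisions the predictions induce: a soft top-k (sigmoid around
    # a quantile threshold) keeps the budgeted selection differentiable
    threshold = torch.quantile(pred_uplift, 1.0 - budget_frac)
    selection = torch.sigmoid((pred_uplift - threshold) / temp)
    return -torch.mean(selection * true_uplift)  # negative realized uplift
```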
arXiv Detail & Related papers (2024-07-18T16:39:44Z) - Learning-augmented Online Minimization of Age of Information and
Transmission Costs [24.873041306990288]
We develop an online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee.
While online algorithms are robust, they are usually overly conservative and may have poor average performance in typical scenarios.
We show that our learning-augmented algorithm achieves both consistency and robustness.
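As a toy illustration of the tradeoff (all values illustrative): transmitting pays a fixed cost but resets the age of information, while waiting lets the staleness cost grow.

```python
# Toy transmission-vs-staleness cost model (all values illustrative).
def total_cost(schedule, tx_cost=5.0):
    """schedule: iterable of booleans; True means transmit in that slot."""
    age, cost = 0, 0.0
    for transmit in schedule:
        age = 0 if transmit else age + 1   # transmitting resets the age of information
        cost += (tx_cost if transmit else 0.0) + age
    return cost
```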
arXiv Detail & Related papers (2024-03-05T01:06:25Z) - Robust Learning for Smoothed Online Convex Optimization with Feedback
Delay [43.85262428603507]
We propose a novel machine learning (ML) augmented online algorithm, Robustness-Constrained Learning (RCL).
RCL combines untrusted ML predictions with a trusted expert online algorithm via constrained projection to robustify the ML prediction.
RCL is the first ML-augmented algorithm with a provable robustness guarantee in the case of multi-step switching cost and feedback delay.
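A minimal sketch of the projection idea, assuming a Euclidean trust region around the expert's action; RCL's actual constraint set is derived from its robustness analysis and is not reproduced here:

```python
# Project an untrusted ML action into a ball around the trusted expert action.
import numpy as np

def robustify(ml_action, expert_action, radius):
    diff = ml_action - expert_action
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return ml_action                    # ML action already within the trust region
    return expert_action + diff * (radius / norm)  # Euclidean projection onto the ball
```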
arXiv Detail & Related papers (2023-10-31T00:22:55Z) - Robustified Learning for Online Optimization with Memory Costs [28.737193318136725]
We propose a novel expert-robustified learning (ERL) approach, achieving both good average performance and robustness.
For any $\lambda \geq 1$, ERL is $\lambda$-competitive against the expert algorithm and $\lambda \cdot C$-competitive against the optimal offline algorithm.
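Assuming, as is standard in this line of work, that the expert algorithm is itself $C$-competitive against the offline optimum, the two guarantees chain as:

```latex
\mathrm{cost}(\mathrm{ERL}) \le \lambda \cdot \mathrm{cost}(\mathrm{Expert})
\le \lambda \cdot C \cdot \mathrm{cost}(\mathrm{OPT}).
```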
arXiv Detail & Related papers (2023-05-01T06:12:01Z) - M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast
Self-Adaptation [145.7321032755538]
Learning to Optimize (L2O) has drawn increasing attention as it often remarkably accelerates the optimization procedure of complex tasks.
This paper investigates a potential solution to this open challenge by meta-training an L2O that can perform fast test-time self-adaptation to an out-of-distribution task.
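A hypothetical sketch of the test-time self-adaptation step, with illustrative names; the meta-training procedure itself is the paper's contribution and is not shown:

```python
# Fast test-time self-adaptation of a meta-trained L2O model (names illustrative).
import copy
import torch

def self_adapt(meta_l2o, task_loss_fn, task_batch, steps=5, lr=1e-3):
    adapted = copy.deepcopy(meta_l2o)      # keep the meta-trained weights intact
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):                 # a few quick steps on the new task
        loss = task_loss_fn(adapted, task_batch)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapted
```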
arXiv Detail & Related papers (2023-02-28T19:23:20Z) - Learning to Generalize Provably in Learning to Optimize [185.71326306329678]
Learning to optimize (L2O) has gained increasing popularity; it automates the design of optimizers through data-driven approaches.
Current L2O methods often suffer from poor generalization performance in at least two respects.
We propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework.
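As a rough illustration of a flatness probe (an assumption about the flavor of such a regularizer, not the paper's exact construction): perturb the weights along the gradient direction and measure how much the loss rises; sharper minima incur a larger penalty.

```python
# Sharpness probe: loss increase under a small gradient-aligned weight perturbation.
import torch

def sharpness(model, loss_fn, batch, eps=1e-2):
    base = loss_fn(model, batch)
    grads = torch.autograd.grad(base, list(model.parameters()))
    norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(eps * g / norm)          # step toward the steepest ascent direction
        perturbed = loss_fn(model, batch)
        for p, g in zip(model.parameters(), grads):
            p.sub_(eps * g / norm)          # restore the original weights
    return (perturbed - base).item()        # larger value = sharper minimum
```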
arXiv Detail & Related papers (2023-02-22T01:17:31Z) - Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surfaces (IRSs) have been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance optimization in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)