Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud
Computing through Reinforcement Learning
- URL: http://arxiv.org/abs/2212.01348v1
- Date: Fri, 2 Dec 2022 18:07:56 GMT
- Title: Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud
Computing through Reinforcement Learning
- Authors: Kaustubh Sridhar, Vikramank Singh, Balakrishnan Narayanaswamy, Abishek
Sankararaman
- Abstract summary: We introduce an approximate formulation of an industrial VM packing problem as an MILP with soft-constraints parameterized by predictions.
We propose the Predict-and-Critic (PnC) framework that outperforms predict-and-optimize (PnO) with just a two-step horizon.
We find that PnC significantly improves decision quality over PnO, even when the optimization problem is not a perfect representation of reality.
- Score: 8.573878018370547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud computing holds the promise of reduced costs through economies of
scale. To realize this promise, cloud computing vendors typically solve
sequential resource allocation problems, where customer workloads are packed on
shared hardware. Virtual machines (VM) form the foundation of modern cloud
computing as they help logically abstract user compute from shared physical
infrastructure. Traditionally, VM packing problems are solved by predicting
demand, followed by a Model Predictive Control (MPC) optimization over a future
horizon. We introduce an approximate formulation of an industrial VM packing
problem as an MILP with soft-constraints parameterized by the predictions.
Recently, predict-and-optimize (PnO) was proposed for end-to-end training of
prediction models by back-propagating the cost of decisions through the
optimization problem. But, PnO is unable to scale to the large prediction
horizons prevalent in cloud computing. To tackle this issue, we propose the
Predict-and-Critic (PnC) framework that outperforms PnO with just a two-step
horizon by leveraging reinforcement learning. PnC jointly trains a prediction
model and a terminal Q function that approximates cost-to-go over a long
horizon, by back-propagating the cost of decisions through the optimization
problem \emph{and from the future}. The terminal Q function allows us to solve
a much smaller two-step horizon optimization problem than the multi-step
horizon necessary in PnO. We evaluate PnO and the PnC framework on two
datasets, three workloads, and with disturbances not modeled in the
optimization problem. We find that PnC significantly improves decision quality
over PnO, even when the optimization problem is not a perfect representation of
reality. We also find that hardening the soft constraints of the MILP and
back-propagating through the constraints improves decision quality for both PnO
and PnC.
Related papers
- End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime [59.27851754647913]
Predictive optimization is the precise modeling of many real-world applications, including energy cost-aware scheduling and budget allocation on advertising.
We develop a modular framework to benchmark 11 existing PtO/PnO methods on 8 problems, including a new industrial dataset for advertising.
Our study shows that PnO approaches are better than PtO on 7 out of 8 benchmarks, but there is no silver bullet found for the specific design choices of PnO.
arXiv Detail & Related papers (2023-11-13T13:19:34Z) - Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW)
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z) - CILP: Co-simulation based Imitation Learner for Dynamic Resource
Provisioning in Cloud Computing Environments [13.864161788250856]
Key challenge for latency-critical tasks is to predict future workload demands to provision proactively.
Existing AI-based solutions tend to not holistically consider all crucial aspects such as provision overheads, heterogeneous VM costs and Quality of Service (QoS) of the cloud system.
We propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization.
arXiv Detail & Related papers (2023-02-11T09:15:34Z) - Differentially Private Deep Q-Learning for Pattern Privacy Preservation
in MEC Offloading [76.0572817182483]
attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns.
We propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving pattern privacy (PP)
We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
arXiv Detail & Related papers (2023-02-09T12:50:18Z) - PECCO: A Profit and Cost-oriented Computation Offloading Scheme in
Edge-Cloud Environment with Improved Moth-flame Optimisation [22.673319784715172]
Edge-cloud computation offloading is a promising solution to relieve the burden on cloud centres.
We propose an improved Moth-flame optimiser PECCO-MFI which addresses some deficiencies of the original Moth-flame Optimiser.
arXiv Detail & Related papers (2022-08-09T23:26:42Z) - Lazy Lagrangians with Predictions for Online Learning [24.18464455081512]
We consider the general problem of online convex optimization with time-varying additive constraints.
A novel primal-dual algorithm is designed by combining a Follow-The-Regularized-Leader iteration with prediction-adaptive dynamic steps.
Our work extends the FTRL framework for this constrained OCO setting and outperforms the respective state-of-the-art greedy-based solutions.
arXiv Detail & Related papers (2022-01-08T21:49:10Z) - Predict and Optimize: Through the Lens of Learning to Rank [9.434400627011108]
We show the noise contrastive estimation can be considered a case of learning to rank the solution cache.
We also develop pairwise and listwise ranking loss functions, which can be differentiated in closed form without the need of solving the optimization problem.
arXiv Detail & Related papers (2021-12-07T10:11:44Z) - Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle
Relocation and Pricing Decisions [15.80796896560034]
Large-scale ride-hailing systems often combine real-time routing at the individual request level with a macroscopic Model Predictive Control (MPC) optimization for dynamic pricing and vehicle relocation.
This paper addresses these computational challenges by learning the MPC optimization.
The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions.
arXiv Detail & Related papers (2021-11-05T00:52:15Z) - Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge
Caching [91.50631418179331]
A privacy-preserving distributed deep policy gradient (P2D3PG) is proposed to maximize the cache hit rates of devices in the MEC networks.
We convert the distributed optimizations into model-free Markov decision process problems and then introduce a privacy-preserving federated learning method for popularity prediction.
arXiv Detail & Related papers (2021-10-20T02:48:27Z) - A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration
Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.