Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud
Computing through Reinforcement Learning
- URL: http://arxiv.org/abs/2212.01348v1
- Date: Fri, 2 Dec 2022 18:07:56 GMT
- Authors: Kaustubh Sridhar, Vikramank Singh, Balakrishnan Narayanaswamy, Abishek
Sankararaman
- Abstract summary: We introduce an approximate formulation of an industrial VM packing problem as an MILP with soft-constraints parameterized by predictions.
We propose the Predict-and-Critic (PnC) framework that outperforms predict-and-optimize (PnO) with just a two-step horizon.
We find that PnC significantly improves decision quality over PnO, even when the optimization problem is not a perfect representation of reality.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud computing holds the promise of reduced costs through economies of
scale. To realize this promise, cloud computing vendors typically solve
sequential resource allocation problems, where customer workloads are packed on
shared hardware. Virtual machines (VM) form the foundation of modern cloud
computing as they help logically abstract user compute from shared physical
infrastructure. Traditionally, VM packing problems are solved by predicting
demand, followed by a Model Predictive Control (MPC) optimization over a future
horizon. We introduce an approximate formulation of an industrial VM packing
problem as an MILP with soft-constraints parameterized by the predictions.
Recently, predict-and-optimize (PnO) was proposed for end-to-end training of
prediction models by back-propagating the cost of decisions through the
optimization problem. However, PnO is unable to scale to the large prediction
horizons prevalent in cloud computing. To tackle this issue, we propose the
Predict-and-Critic (PnC) framework that outperforms PnO with just a two-step
horizon by leveraging reinforcement learning. PnC jointly trains a prediction
model and a terminal Q function that approximates cost-to-go over a long
horizon, by back-propagating the cost of decisions through the optimization
problem \emph{and from the future}. The terminal Q function allows us to solve
a much smaller two-step horizon optimization problem than the multi-step
horizon necessary in PnO. We evaluate PnO and the PnC framework on two
datasets, three workloads, and with disturbances not modeled in the
optimization problem. We find that PnC significantly improves decision quality
over PnO, even when the optimization problem is not a perfect representation of
reality. We also find that hardening the soft constraints of the MILP and
back-propagating through the constraints improves decision quality for both PnO
and PnC.
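As a concrete illustration of the two-step structure, the sketch below brute-forces a two-step placement lookahead on a toy packing instance; the capacities, costs, and the `terminal_q` heuristic are invented stand-ins for the paper's MILP and its learned terminal Q function, not the paper's actual formulation.

```python
from itertools import product

CAPACITY = 10           # per-host capacity (illustrative units)
ACTIVE_COST = 1.0       # cost of each powered-on host
OVERFLOW_PENALTY = 5.0  # soft-constraint penalty per unit over capacity

def step_cost(loads):
    """Per-step cost: active hosts plus softened capacity violations."""
    active = sum(1 for l in loads if l > 0)
    overflow = sum(max(0, l - CAPACITY) for l in loads)
    return ACTIVE_COST * active + OVERFLOW_PENALTY * overflow

def place(loads, host, demand):
    """Return the host loads after packing one demand onto a host."""
    new = list(loads)
    new[host] += demand
    return tuple(new)

def terminal_q(loads):
    """Hand-written stand-in for the learned terminal Q function:
    charge a small cost for each partially filled (fragmented) host."""
    return sum(0.5 for l in loads if 0 < l < CAPACITY)

def two_step_decision(loads, predicted_demands):
    """Enumerate all placements over a two-step horizon and return the
    first action minimizing step costs plus the truncated cost-to-go."""
    d0, d1 = predicted_demands
    hosts = range(len(loads))
    best_cost, best_action = None, None
    for a0, a1 in product(hosts, hosts):
        s1 = place(loads, a0, d0)
        s2 = place(s1, a1, d1)
        total = step_cost(s1) + step_cost(s2) + terminal_q(s2)
        if best_cost is None or total < best_cost:
            best_cost, best_action = total, a0
    return best_action
```

With loads `(6, 0, 0)` and predicted demands `(4, 3)`, the lookahead packs the first VM onto the partially filled host, filling it exactly; the terminal term is what lets this two-step search mimic a longer-horizon plan.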
Related papers
- End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
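For reference, an OWA objective is just a weighted sum over sorted outcomes; the sorting step is what makes it nondifferentiable. The values and weights below are illustrative only.

```python
def owa(values, weights):
    """Ordered Weighted Averaging: sort outcomes ascending and take a
    weighted sum; with decreasing weights, the worst-off outcomes
    dominate, which is what makes OWA a fairness-oriented objective."""
    assert len(values) == len(weights)
    return sum(w * v for w, v in zip(weights, sorted(values)))

# Decreasing weights put the most mass on the worst (smallest) outcome.
fair_score = owa([3.0, 9.0, 6.0], [0.5, 0.3, 0.2])  # 0.5*3 + 0.3*6 + 0.2*9
```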
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- There is No Silver Bullet: Benchmarking Methods in Predictive Combinatorial Optimization [59.27851754647913]
Predictive combinatorial optimization precisely models many real-world applications, including energy cost-aware scheduling and budget allocation in advertising.
However, there has been no systematic benchmark of both approaches (PtO and PnO), including their specific design choices at the module level.
Our study shows that PnO approaches are better than PtO on 7 out of 8 benchmarks, but no silver bullet is found among the specific design choices of PnO.
arXiv Detail & Related papers (2023-11-13T13:19:34Z)
- Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW).
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z)
- CILP: Co-simulation based Imitation Learner for Dynamic Resource Provisioning in Cloud Computing Environments [13.864161788250856]
A key challenge for latency-critical tasks is to predict future workload demands in order to provision resources proactively.
Existing AI-based solutions tend to not holistically consider all crucial aspects such as provision overheads, heterogeneous VM costs and Quality of Service (QoS) of the cloud system.
We propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization.
arXiv Detail & Related papers (2023-02-11T09:15:34Z)
- Differentially Private Deep Q-Learning for Pattern Privacy Preservation in MEC Offloading [76.0572817182483]
Attackers may eavesdrop on offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns.
We propose an offloading strategy that jointly minimizes the latency, the ES's energy consumption, and the task dropping rate, while preserving pattern privacy (PP).
We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
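As one concrete way to inject noise into a binary offloading decision (an illustrative mechanism, not necessarily the paper's exact scheme), randomized response reports the true bit with probability e^ε/(1+e^ε) and otherwise flips it, which satisfies ε-differential privacy for a single decision:

```python
import math
import random

def randomized_response(decision: int, epsilon: float) -> int:
    """Report the true binary offloading decision with probability
    e^eps / (1 + e^eps); otherwise flip it (eps-DP for one bit)."""
    p_true = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return decision if random.random() < p_true else 1 - decision

noisy = randomized_response(1, epsilon=1.0)  # reports 1 with probability ~0.73
```

Smaller ε means more flipping, hiding the usage pattern at the cost of noisier decisions.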
arXiv Detail & Related papers (2023-02-09T12:50:18Z)
- PECCO: A Profit and Cost-oriented Computation Offloading Scheme in Edge-Cloud Environment with Improved Moth-flame Optimisation [22.673319784715172]
Edge-cloud computation offloading is a promising solution to relieve the burden on cloud centres.
We propose an improved Moth-flame optimiser, PECCO-MFI, which addresses several deficiencies of the original Moth-flame Optimiser.
arXiv Detail & Related papers (2022-08-09T23:26:42Z)
- Lazy Lagrangians with Predictions for Online Learning [24.18464455081512]
We consider the general problem of online convex optimization with time-varying additive constraints.
A novel primal-dual algorithm is designed by combining a Follow-The-Regularized-Leader iteration with prediction-adaptive dynamic steps.
Our work extends the FTRL framework for this constrained OCO setting and outperforms the respective state-of-the-art greedy-based solutions.
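The FTRL core of such an algorithm is compact; below is a sketch of one FTRL iteration with a quadratic regularizer and an l2-ball feasible set (the paper's prediction-adaptive step sizes and dual updates are omitted, so this is a simplified illustration):

```python
def ftrl_step(grad_sum, eta, radius=1.0):
    """Follow-The-Regularized-Leader with R(x) = |x|^2 / (2*eta):
    the unconstrained minimizer of <G, x> + R(x) is x = -eta * G,
    then project onto an l2 ball of the given radius."""
    x = [-eta * g for g in grad_sum]
    norm = sum(v * v for v in x) ** 0.5
    if norm > radius:
        x = [v * radius / norm for v in x]
    return x

# Accumulated gradients G = [2.0] with eta = 0.1 give the iterate [-0.2].
iterate = ftrl_step([2.0], eta=0.1)
```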
arXiv Detail & Related papers (2022-01-08T21:49:10Z)
- Predict and Optimize: Through the Lens of Learning to Rank [9.434400627011108]
We show that noise contrastive estimation can be viewed as a special case of learning to rank over a solution cache.
We also develop pairwise and listwise ranking loss functions, which can be differentiated in closed form without the need of solving the optimization problem.
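A pairwise ranking loss over a cache of candidate solutions can indeed be evaluated in closed form; the hinge variant below is an illustrative sketch (the paper's exact losses may differ):

```python
def pairwise_hinge_loss(scores, better, worse, margin=1.0):
    """Penalize the model when a known-better cached solution is not
    scored above a known-worse one by at least the margin."""
    return max(0.0, margin - (scores[better] - scores[worse]))

# Correctly ordered pair -> zero loss; inverted pair -> positive loss.
ok = pairwise_hinge_loss([3.0, 1.0], better=0, worse=1)   # 0.0
bad = pairwise_hinge_loss([1.0, 3.0], better=0, worse=1)  # 3.0
```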
arXiv Detail & Related papers (2021-12-07T10:11:44Z)
- Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle Relocation and Pricing Decisions [15.80796896560034]
Large-scale ride-hailing systems often combine real-time routing at the individual request level with a macroscopic Model Predictive Control (MPC) optimization for dynamic pricing and vehicle relocation.
This paper addresses these computational challenges by learning the MPC optimization.
The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions.
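The paper trains a machine-learning model for this role; as a minimal hypothetical stand-in, a nearest-neighbor lookup over previously solved MPC instances conveys the idea of replacing the online solver with a prediction:

```python
def proxy_solution(cache, state):
    """Toy optimization proxy: find the cached MPC instance whose state
    is closest (squared l2) to the current one and reuse its solution,
    instead of re-solving the optimization online."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, solution = min(cache, key=lambda entry: sq_dist(entry[0], state))
    return solution

# Hypothetical cache of previously seen states and their precomputed solutions.
cache = [((0.0, 0.0), "relocate_none"), ((5.0, 5.0), "relocate_north")]
decision = proxy_solution(cache, state=(4.0, 5.0))
```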
arXiv Detail & Related papers (2021-11-05T00:52:15Z)
- Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching [91.50631418179331]
A privacy-preserving distributed deep policy gradient (P2D3PG) method is proposed to maximize the cache hit rates of devices in MEC networks.
We convert the distributed optimizations into model-free Markov decision process problems and then introduce a privacy-preserving federated learning method for popularity prediction.
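The federated step can be sketched as plain FedAvg over client popularity-model parameters (a generic illustration, not the paper's full P2D3PG algorithm):

```python
def fedavg(client_params, client_sizes):
    """Federated averaging: combine per-client model parameters weighted
    by local dataset size, so raw request data never leaves the device."""
    total = float(sum(client_sizes))
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with unequal data sizes: the larger client dominates.
global_params = fedavg([[0.0, 2.0], [4.0, 2.0]], client_sizes=[3, 1])
```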
arXiv Detail & Related papers (2021-10-20T02:48:27Z)
- A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.