Optimal Resource Allocation for Serverless Queries
- URL: http://arxiv.org/abs/2107.08594v1
- Date: Mon, 19 Jul 2021 02:55:48 GMT
- Title: Optimal Resource Allocation for Serverless Queries
- Authors: Anish Pimpley, Shuo Li, Anubha Srivastava, Vishal Rohra, Yi Zhu,
Soundararajan Srinivasan, Alekh Jindal, Hiren Patel, Shi Qiao, Rathijit Sen
- Abstract summary: Prior work focused on predicting peak allocation while ignoring aggressive trade-offs between resource allocation and run-time.
We introduce a system for optimal resource allocation that can predict performance with aggressive trade-offs, for both new and past observed queries.
- Score: 8.59568779761598
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Optimizing resource allocation for analytical workloads is vital for reducing
costs of cloud-data services. At the same time, it is incredibly hard for users
to allocate resources per query in serverless processing systems, and they
frequently misallocate by orders of magnitude. Unfortunately, prior work
focused on predicting peak allocation while ignoring aggressive trade-offs
between resource allocation and run-time. Additionally, these methods fail to
predict allocation for queries that have not been observed in the past. In this
paper, we tackle both these problems. We introduce a system for optimal
resource allocation that can predict performance with aggressive trade-offs,
for both new and past observed queries. We introduce the notion of a
performance characteristic curve (PCC) as a parameterized representation that
can compactly capture the relationship between resources and performance. To
tackle training data sparsity, we introduce a novel data augmentation technique
to efficiently synthesize the entire PCC using a single run of the query.
Lastly, we demonstrate the advantages of a constrained loss function coupled
with GNNs, over traditional ML methods, for capturing the domain specific
behavior through an extensive experimental evaluation over SCOPE big data
workloads at Microsoft.
Related papers
- Topology-aware Preemptive Scheduling for Co-located LLM Workloads [7.240168647854797]
We develop a fine-grained topology-aware method for scheduling of hybrid workloads.
This method significantly increases the efficiency of preemption and improves overall scheduled performance for LLM workloads by 55%.
arXiv Detail & Related papers (2024-11-18T13:26:09Z)
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems [67.52782366565658]
State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs, a wide diversity is seen in evaluation protocols.
This study investigates various LERS' performance, efficiency, and cross-task transferability via a thorough benchmarking process.
arXiv Detail & Related papers (2024-06-25T07:45:00Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation.
We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE).
Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z)
- Outage Performance and Novel Loss Function for an ML-Assisted Resource Allocation: An Exact Analytical Framework [2.1397655110395752]
We introduce a novel loss function to minimize the outage probability of an ML-based resource allocation system.
An ML binary classification predictor assists in selecting a resource satisfying the established outage criterion.
arXiv Detail & Related papers (2023-05-16T18:23:52Z)
- Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem.
We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
arXiv Detail & Related papers (2023-04-07T13:41:08Z)
- Perona: Robust Infrastructure Fingerprinting for Resource-Efficient Big Data Analytics [0.06524460254566904]
We present Perona, a novel approach to robust infrastructure fingerprinting for usage in big data analytics.
Perona employs common sets and configurations of benchmarking tools for target resources, so that resulting benchmark metrics are directly comparable and ranking is enabled.
We evaluate our approach both on data gathered from our own experiments as well as within related works for resource configuration optimization.
arXiv Detail & Related papers (2022-11-15T15:48:09Z)
- On the Potential of Execution Traces for Batch Processing Workload Optimization in Public Clouds [0.0]
We propose a collaborative approach for sharing anonymized workload execution traces among users, mining them for general patterns, and exploiting clusters of historical workloads for future optimizations.
arXiv Detail & Related papers (2021-11-16T20:11:36Z)
- A Predictive Autoscaler for Elastic Batch Jobs [8.354712625979776]
Large batch jobs such as Deep Learning, HPC and Spark require far more computational resources and higher cost than conventional online service.
We propose a predictive autoscaler to provide an elastic interface for the customers and overprovision instances.
arXiv Detail & Related papers (2020-10-10T17:35:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.