Related papers: Fast-Fourier-Forecasting Resource Utilisation in Distributed Systems

URL: http://arxiv.org/abs/2001.04281v3
Date: Fri, 7 Aug 2020 14:51:19 GMT
Title: Fast-Fourier-Forecasting Resource Utilisation in Distributed Systems
Authors: Paul J. Pritz and Daniel Perez and Kin K. Leung
Abstract summary: We present a communication-efficient data collection mechanism for distributed computing systems. We also propose a deep learning architecture using complex Gated Recurrent Units to forecast resource utilisation. Our approach resolves challenges encountered in resource provisioning frameworks and can be applied to other forecasting problems.
Score: 10.219353459640137
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Distributed computing systems often consist of hundreds of nodes, executing tasks with different resource requirements. Efficient resource provisioning and task scheduling in such systems are non-trivial and require close monitoring and accurate forecasting of the state of the system, specifically resource utilisation at its constituent machines. Two challenges present themselves towards these objectives. First, collecting monitoring data entails substantial communication overhead. This overhead can be prohibitively high, especially in networks where bandwidth is limited. Second, forecasting models to predict resource utilisation should be accurate and need to exhibit high inference speed. Mission critical scheduling and resource allocation algorithms use these predictions and rely on their immediate availability. To address the first challenge, we present a communication-efficient data collection mechanism. Resource utilisation data is collected at the individual machines in the system and transmitted to a central controller in batches. Each batch is processed by an adaptive data-reduction algorithm based on Fourier transforms and truncation in the frequency domain. We show that the proposed mechanism leads to a significant reduction in communication overhead while incurring only minimal error and adhering to accuracy guarantees. To address the second challenge, we propose a deep learning architecture using complex Gated Recurrent Units to forecast resource utilisation. This architecture is directly integrated with the above data collection mechanism to improve inference speed of our forecasting model. Using two real-world datasets, we demonstrate the effectiveness of our approach, both in terms of forecasting accuracy and inference speed. Our approach resolves challenges encountered in resource provisioning frameworks and can be applied to other forecasting problems.
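As a rough illustration of the batch-wise data reduction the abstract describes (a Fourier transform followed by truncation in the frequency domain), the following Python sketch compresses one batch of utilisation samples. It is not the authors' algorithm: the function names `compress_batch` and `decompress_batch`, the parameter `max_rel_error`, and the rule of keeping the smallest number of lowest-frequency coefficients that meets a relative error bound are all assumptions made for illustration.

```python
import numpy as np


def compress_batch(batch, max_rel_error=0.05):
    """Return the fewest leading rFFT coefficients that meet the error bound.

    Hypothetical helper: the truncation rule (lowest frequencies first,
    relative L2 error) is an assumption, not taken from the paper.
    """
    batch = np.asarray(batch, dtype=float)
    n = len(batch)
    coeffs = np.fft.rfft(batch)
    norm = np.linalg.norm(batch)
    for k in range(1, len(coeffs) + 1):
        # Zero out everything above the k lowest frequencies.
        truncated = np.zeros_like(coeffs)
        truncated[:k] = coeffs[:k]
        approx = np.fft.irfft(truncated, n=n)
        if np.linalg.norm(batch - approx) <= max_rel_error * norm:
            return coeffs[:k], n
    # Bound not met by any truncation: send all coefficients.
    return coeffs, n


def decompress_batch(kept, n):
    """Rebuild a length-n batch from the kept low-frequency coefficients."""
    full = np.zeros(n // 2 + 1, dtype=complex)
    full[:len(kept)] = kept
    return np.fft.irfft(full, n=n)


if __name__ == "__main__":
    # Synthetic utilisation trace: a constant load plus one periodic component.
    t = np.arange(64)
    batch = 0.5 + 0.2 * np.sin(2 * np.pi * 3 * t / 64)
    kept, n = compress_batch(batch, max_rel_error=0.05)
    restored = decompress_batch(kept, n)
    print(len(kept), "coefficients sent instead of", n, "samples")
    print("max abs error:", np.max(np.abs(restored - batch)))
```

In this sketch a machine would transmit only `kept` and `n` to the controller, which reconstructs the batch with `decompress_batch`. A tighter `max_rel_error` trades communication savings for accuracy.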

Related papers

Interference-Aware Edge Runtime Prediction with Conformal Matrix Completion [10.776912158818437]
Accurately estimating workload runtime is a longstanding goal in computer systems. We develop a matrix factorization-inspired method that generates accurate interference-aware predictions with tight provably-guaranteed uncertainty bounds. We validate our method on a novel WebAssembly runtime dataset collected from 24 unique devices, achieving a prediction error of 5.2% -- 2x better than a naive application of existing methods.
arXiv Detail & Related papers (2025-03-09T03:41:32Z)
Stream-Based Monitoring of Algorithmic Fairness [4.811789437743092]
Stream-based monitoring is proposed as a solution for verifying the algorithmic fairness of decision and prediction systems at runtime. We present a principled way to formalize algorithmic fairness over temporal data streams in the specification language RTLola.
arXiv Detail & Related papers (2025-01-30T13:18:59Z)
Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting. Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server. We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a federated anomaly detection framework named PeFAD in response to increasing privacy concerns. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm [87.47506806135746]
In some applications, edge learning is shifting its focus from conventional learning from scratch to a new two-stage learning paradigm. This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system. It is shown that the proposed joint resource management over the pre-training and fine-tuning stages balances the system performance trade-off well.
arXiv Detail & Related papers (2024-04-01T00:21:11Z)
Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z)
Temporal Patience: Efficient Adaptive Deep Learning for Embedded Radar Data Processing [4.359030177348051]
This paper presents novel techniques that leverage the temporal correlation present in streaming radar data to enhance the efficiency of Early Exit Neural Networks for Deep Learning inference on embedded devices. Our results demonstrate that our techniques save up to 26% of operations per inference over a Single Exit Network and 12% over a confidence-based Early Exit version. Such efficiency gains enable real-time radar data processing on resource-constrained platforms, allowing for new applications in the context of smart homes, Internet-of-Things, and human-computer interaction.
arXiv Detail & Related papers (2023-09-11T12:38:01Z)
Probabilistic Time Series Forecasting for Adaptive Monitoring in Edge Computing Environments [0.06999740786886537]
In this paper, we propose a sampling-based and cloud-located approach for monitoring critical infrastructures. We evaluate our prototype implementation for the monitoring pipeline on a publicly available streaming dataset.
arXiv Detail & Related papers (2022-11-24T17:35:14Z)
Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach [79.59560136273917]
Limited communication resources (bandwidth and energy) and data heterogeneity across devices are the main bottlenecks for federated learning (FL). We first devise a novel FL framework with partial model aggregation (PMA). The proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets.
arXiv Detail & Related papers (2022-04-20T19:09:52Z)
Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices. We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time. Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z)
LoRD-Net: Unfolded Deep Detection Network with Low-Resolution Receivers [104.01415343139901]
We propose a deep detector entitled LoRD-Net for recovering information symbols from one-bit measurements. LoRD-Net has a task-based architecture dedicated to recovering the underlying signal of interest. We evaluate the proposed receiver architecture for one-bit signal recovery in wireless communications.
arXiv Detail & Related papers (2021-02-05T04:26:05Z)
Scaling-up Distributed Processing of Data Streams for Machine Learning [10.581140430698103]
This paper reviews recently developed methods that focus on large-scale distributed optimization in the compute- and bandwidth-limited regime. It focuses on methods that solve: (i) distributed convex problems, and (ii) distributed principal component analysis, which is a nonconvex problem with geometric structure that permits global convergence.
arXiv Detail & Related papers (2020-05-18T16:28:54Z)
Coordinates-based Resource Allocation Through Supervised Machine Learning [14.014514995022182]
We propose a coordinates-based resource allocation scheme using supervised machine learning techniques. The proposed scheme performs consistently well with realistic-system simulation, requiring only 4 s of training time.
arXiv Detail & Related papers (2020-05-13T18:33:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.