Fast-Fourier-Forecasting Resource Utilisation in Distributed Systems
- URL: http://arxiv.org/abs/2001.04281v3
- Date: Fri, 7 Aug 2020 14:51:19 GMT
- Title: Fast-Fourier-Forecasting Resource Utilisation in Distributed Systems
- Authors: Paul J. Pritz and Daniel Perez and Kin K. Leung
- Abstract summary: We present a communication-efficient data collection mechanism for distributed computing systems.
We also propose a deep learning architecture using complex Gated Recurrent Units to forecast resource utilisation.
Our approach resolves challenges encountered in resource provisioning frameworks and can be applied to other forecasting problems.
- Score: 10.219353459640137
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Distributed computing systems often consist of hundreds of nodes, executing
tasks with different resource requirements. Efficient resource provisioning and
task scheduling in such systems are non-trivial and require close monitoring
and accurate forecasting of the state of the system, specifically resource
utilisation at its constituent machines. Two challenges present themselves
towards these objectives. First, collecting monitoring data entails substantial
communication overhead. This overhead can be prohibitively high, especially in
networks where bandwidth is limited. Second, forecasting models to predict
resource utilisation should be accurate and need to exhibit high inference
speed. Mission critical scheduling and resource allocation algorithms use these
predictions and rely on their immediate availability. To address the first
challenge, we present a communication-efficient data collection mechanism.
Resource utilisation data is collected at the individual machines in the system
and transmitted to a central controller in batches. Each batch is processed by
an adaptive data-reduction algorithm based on Fourier transforms and truncation
in the frequency domain. We show that the proposed mechanism leads to a
significant reduction in communication overhead while incurring only minimal
error and adhering to accuracy guarantees. To address the second challenge, we
propose a deep learning architecture using complex Gated Recurrent Units to
forecast resource utilisation. This architecture is directly integrated with
the above data collection mechanism to improve inference speed of our
forecasting model. Using two real-world datasets, we demonstrate the
effectiveness of our approach, both in terms of forecasting accuracy and
inference speed. Our approach resolves challenges encountered in resource
provisioning frameworks and can be applied to other forecasting problems.
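The batch-wise reduction described in the abstract (Fourier transform, then truncation in the frequency domain subject to an accuracy bound) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `compress_batch` and `reconstruct_batch`, the RMSE-based tolerance `max_rmse`, and the choice to keep only the lowest frequencies are all assumptions made for illustration. A naive DFT from the standard library stands in for an FFT routine.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def compress_batch(batch, max_rmse):
    """Return the fewest low-frequency coefficients whose reconstruction
    RMSE stays within max_rmse (hypothetical accuracy criterion).
    Assumes a real-valued batch, so mirrored coefficients need not be sent."""
    n = len(batch)
    spectrum = dft(batch)
    for cutoff in range(n // 2 + 2):
        # Parseval's theorem: the mean squared error equals the energy of
        # the dropped coefficients divided by n^2.
        dropped = sum(
            abs(spectrum[k]) ** 2 for k in range(n) if min(k, n - k) >= cutoff
        )
        if math.sqrt(dropped) / n <= max_rmse:
            return spectrum[:cutoff]
    return spectrum[: n // 2 + 1]  # only reached for a negative tolerance


def reconstruct_batch(kept, n):
    """Rebuild n real samples from the kept low-frequency coefficients."""
    full = [0j] * n
    for k, c in enumerate(kept):
        full[(n - k) % n] = c.conjugate()  # conjugate symmetry of real signals
        full[k] = c
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(full)) / n).real
        for t in range(n)
    ]


# Synthetic utilisation trace: a constant load, one slow oscillation,
# and a small high-frequency ripple.
batch = [
    0.5
    + 0.3 * math.sin(2 * math.pi * t / 32)
    + 0.01 * math.cos(2 * math.pi * 10 * t / 32)
    for t in range(32)
]
kept = compress_batch(batch, max_rmse=0.02)
restored = reconstruct_batch(kept, len(batch))
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(batch, restored)) / len(batch))
print(len(kept), "complex coefficients sent for", len(batch), "samples; RMSE:", rmse)
```

In this sketch the cutoff adapts per batch: a smooth trace needs only a few coefficients, while a noisy one forces more to be kept before the tolerance is met. Because the error is computed from the dropped coefficients alone, the bound can be checked without performing an inverse transform at the sender.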
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z) - PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
With increasing privacy concerns, we propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm [87.47506806135746]
In some applications, edge learning is shifting its focus from conventional learning from scratch to a new two-stage learning approach.
This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system.
It is shown that the proposed joint resource management over the pre-training and fine-tuning stages balances the system performance trade-off well.
arXiv Detail & Related papers (2024-04-01T00:21:11Z) - Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z) - Temporal Patience: Efficient Adaptive Deep Learning for Embedded Radar
Data Processing [4.359030177348051]
This paper presents novel techniques that leverage the temporal correlation present in streaming radar data to enhance the efficiency of Early Exit Neural Networks for Deep Learning inference on embedded devices.
Our results demonstrate that our techniques save up to 26% of operations per inference over a Single Exit Network and 12% over a confidence-based Early Exit version.
Such efficiency gains enable real-time radar data processing on resource-constrained platforms, allowing for new applications in the context of smart homes, Internet-of-Things, and human-computer interaction.
arXiv Detail & Related papers (2023-09-11T12:38:01Z) - Probabilistic Time Series Forecasting for Adaptive Monitoring in Edge Computing Environments [0.06999740786886537]
In this paper, we propose a sampling-based and cloud-located approach for monitoring critical infrastructures.
We evaluate our prototype implementation for the monitoring pipeline on a publicly available streaming dataset.
arXiv Detail & Related papers (2022-11-24T17:35:14Z) - Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach [79.59560136273917]
Limited communication resources (bandwidth and energy) and data heterogeneity across devices are the main bottlenecks for federated learning (FL).
We first devise a novel FL framework with partial model aggregation (PMA).
The proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets.
arXiv Detail & Related papers (2022-04-20T19:09:52Z) - Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z) - LoRD-Net: Unfolded Deep Detection Network with Low-Resolution Receivers [104.01415343139901]
We propose a deep detector entitled LoRD-Net for recovering information symbols from one-bit measurements.
LoRD-Net has a task-based architecture dedicated to recovering the underlying signal of interest.
We evaluate the proposed receiver architecture for one-bit signal recovery in wireless communications.
arXiv Detail & Related papers (2021-02-05T04:26:05Z) - Scaling-up Distributed Processing of Data Streams for Machine Learning [10.581140430698103]
This paper reviews recently developed methods that focus on large-scale distributed optimization in the compute- and bandwidth-limited regime.
It focuses on methods that solve: (i) distributed convex problems, and (ii) distributed principal component analysis, which is a nonconvex problem with geometric structure that permits global convergence.
arXiv Detail & Related papers (2020-05-18T16:28:54Z) - Coordinates-based Resource Allocation Through Supervised Machine Learning [14.014514995022182]
We propose a coordinates-based resource allocation scheme using supervised machine learning techniques.
The proposed scheme performs consistently well with realistic-system simulation, requiring only 4 s of training time.
arXiv Detail & Related papers (2020-05-13T18:33:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.