Data augmentation through multivariate scenario forecasting in Data
Centers using Generative Adversarial Networks
- URL: http://arxiv.org/abs/2201.06147v1
- Date: Wed, 12 Jan 2022 15:09:10 GMT
- Title: Data augmentation through multivariate scenario forecasting in Data
Centers using Generative Adversarial Networks
- Authors: Jaime Pérez, Patricia Arroba and José M. Moya
- Abstract summary: The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that we need massive amounts of data to feed the algorithms.
This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center.
Our research will help to optimize the energy consumed in Data Centers, although the proposed methodology can be employed in any similar time-series-like problem.
- Score: 0.18416014644193063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Cloud paradigm is at a critical point in which the existing
energy-efficiency techniques are reaching a plateau, while the computing
resources demand at Data Center facilities continues to increase exponentially.
The main challenge in achieving a global energy efficiency strategy based on
Artificial Intelligence is that we need massive amounts of data to feed the
algorithms. Nowadays, any optimization strategy must begin with data. However,
companies with access to these large amounts of data decide not to share them
because it could compromise their security. This paper proposes a time-series
data augmentation methodology based on synthetic scenario forecasting within
the Data Center. For this purpose, we will implement a powerful generative
algorithm: Generative Adversarial Networks (GANs). The use of GANs will allow
us to handle multivariate data and data from different natures (e.g.,
categorical). On the other hand, adapting Data Centers' operational management
to the occurrence of sporadic anomalies is complicated due to the reduced
frequency of failures in the system. Therefore, we also propose a methodology
to increase the generated data variability by introducing on-demand anomalies.
We validated our approach using real data collected from an operating Data
Center, successfully obtaining forecasts of random scenarios with several hours
of prediction. Our research will help to optimize the energy consumed in Data
Centers, although the proposed methodology can be employed in any similar
time-series-like problem.
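The data-side steps the abstract mentions (introducing on-demand anomalies and framing the series for multi-hour scenario forecasting) can be sketched as follows. This is a minimal illustration only, not the authors' method: it contains no GAN, and the helper names `inject_anomaly` and `make_windows`, the toy `cpu_load` signal, and all parameter values are hypothetical.

```python
import random


def inject_anomaly(series, start, length, magnitude):
    """Return a copy of `series` with `magnitude` added to `length` points from `start`."""
    if start < 0 or length < 0 or start + length > len(series):
        raise ValueError("anomaly window must lie inside the series")
    out = list(series)
    for i in range(start, start + length):
        out[i] += magnitude
    return out


def make_windows(series, history, horizon):
    """Split a series into (history, future) pairs for scenario forecasting."""
    pairs = []
    for t in range(history, len(series) - horizon + 1):
        pairs.append((series[t - history:t], series[t:t + horizon]))
    return pairs


# Hypothetical toy signal standing in for a single Data Center metric.
rng = random.Random(0)
cpu_load = [50.0 + rng.uniform(-1.0, 1.0) for _ in range(24)]

# Add a 3-step spike on demand, then build (history, future) training pairs.
augmented = inject_anomaly(cpu_load, start=10, length=3, magnitude=30.0)
windows = make_windows(augmented, history=6, horizon=4)
```

In a setup like the one the abstract describes, pairs of this kind would presumably condition a generative model on the history and let it sample the future segment; that part is left out here because the page gives no architectural details.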
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
- A Distribution-Aware Flow-Matching for Generating Unstructured Data for Few-Shot Reinforcement Learning [1.0709300917082865]
We introduce a distribution-aware flow matching, designed to generate synthetic unstructured data tailored for few-shot reinforcement learning (RL) on embedded processors.
We apply feature weighting through Random Forests to prioritize critical data aspects, thereby improving the precision of the generated synthetic data.
Our method provides stable convergence based on the max Q-value while improving the frame rate by 30% in the very first timestamps.
arXiv Detail & Related papers (2024-09-21T15:50:59Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD, motivated by increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- Multi-Source Conformal Inference Under Distribution Shift [41.701790856201036]
We consider the problem of obtaining distribution-free prediction intervals for a target population, leveraging multiple potentially biased data sources.
We derive the efficient influence functions for the quantiles of unobserved outcomes in the target and source populations.
We propose a data-adaptive strategy to upweight informative data sources for efficiency gain and downweight non-informative data sources for bias reduction.
arXiv Detail & Related papers (2024-05-15T13:33:09Z)
- Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation.
We formulate the loss function minimization problem under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE).
Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z)
- A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets [0.0]
"Dataset Fusion" is an algorithm for fusing periodic signals from multiple homogeneous datasets into a single dataset.
The proposed approach significantly outperforms conventional training approaches with an Average F1 score of 0.879.
Results show that using only 6.25% of the training data, translating to a 93.7% reduction in computational power, results in a mere 4.04% decrease in performance.
arXiv Detail & Related papers (2023-05-14T16:24:09Z)
- Balancing Performance and Energy Consumption of Bagging Ensembles for the Classification of Data Streams in Edge Computing [9.801387036837871]
Edge Computing (EC) has emerged as an enabling factor for developing technologies like the Internet of Things (IoT) and 5G networks.
This work investigates strategies for optimizing the performance and energy consumption of bagging ensembles to classify data streams.
arXiv Detail & Related papers (2022-01-17T04:12:18Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
Until today, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- A Federated Data-Driven Evolutionary Algorithm [10.609815608017065]
Existing data-driven evolutionary optimization algorithms require that all data are centrally stored.
This paper proposes a federated data-driven evolutionary optimization framework that can perform data-driven optimization when the data is distributed across multiple devices.
arXiv Detail & Related papers (2021-02-16T17:18:54Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)