Diffusion-based Time Series Data Imputation for Microsoft 365
- URL: http://arxiv.org/abs/2309.02564v1
- Date: Thu, 3 Aug 2023 10:25:17 GMT
- Title: Diffusion-based Time Series Data Imputation for Microsoft 365
- Authors: Fangkai Yang, Wenjie Yin, Lu Wang, Tianci Li, Pu Zhao, Bo Liu, Paul
Wang, Bo Qiao, Yudong Liu, Mårten Björkman, Saravan Rajmohan, Qingwei
Lin, Dongmei Zhang
- Abstract summary: We focus on enhancing data quality through data imputation with the proposed Diffusion+, a sample-efficient diffusion model.
Our experiments and application practice show that our model contributes to improving the performance of the downstream failure prediction task.
- Score: 35.16965409097466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliability is extremely important for large-scale cloud systems like
Microsoft 365. Cloud failures, such as disk and node failures, threaten
service reliability, resulting in online service interruptions and economic
loss. Existing works focus on predicting cloud failures and proactively taking
action before failures happen. However, they suffer from poor data quality, such as
missing data in model training and prediction, which limits their performance. In
this paper, we focus on enhancing data quality through data imputation with the
proposed Diffusion+, a sample-efficient diffusion model that imputes missing
data efficiently based on the observed data. Our experiments and application
practice show that our model contributes to improving the performance of the
downstream failure prediction task.
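The abstract does not spell out Diffusion+'s internals, but the core idea of diffusion-based imputation can be sketched: start the missing entries from noise, run reverse-diffusion steps, and clamp the observed entries after every step so they condition the rest. A minimal numpy sketch under that assumption; `toy_denoiser` stands in for a trained model, and all names are illustrative, not the paper's:

```python
import numpy as np

def impute_with_diffusion(x_obs, mask, denoise_fn, n_steps=50, rng=None):
    """Fill missing entries (mask == 0) of a series by reverse diffusion.

    Observed entries (mask == 1) are re-clamped to their true values after
    every denoising step, so they condition the generation of the rest.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.standard_normal(x_obs.shape)       # start missing part from noise
    x = mask * x_obs + (1 - mask) * x          # keep observed values fixed
    for t in range(n_steps, 0, -1):
        noise_scale = t / n_steps
        x = denoise_fn(x, noise_scale)         # one reverse-diffusion step
        x = mask * x_obs + (1 - mask) * x      # re-impose observations
    return x

def toy_denoiser(x, noise_scale):
    """Stand-in for a trained denoiser: shrink toward a local moving average."""
    smooth = np.convolve(x, np.ones(3) / 3, mode="same")
    return (1 - noise_scale) * x + noise_scale * smooth

series = np.array([1.0, 1.1, np.nan, 1.3, np.nan, 1.5])
mask = (~np.isnan(series)).astype(float)
filled = impute_with_diffusion(np.nan_to_num(series), mask, toy_denoiser)
```

In a real system the denoiser would be a trained network and the noise schedule would follow the model's diffusion process; the clamping of observed entries is the part that makes the generation an imputation.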
Related papers
- Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios [23.160007389272575]
Missing values in time series data can harm machine learning performance and introduce bias.
Previous work has tackled imputation of missing data under random missingness, complete blackout, and forecasting scenarios.
We introduce a two-stage imputation process using self-attention and diffusion processes to model feature and temporal correlations.
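The "feature and temporal correlations" step can be illustrated with a single self-attention pass in which missing time steps are hidden from the keys, so every step is reconstructed as a mixture of observed steps only. A hedged numpy sketch with identity projections; this illustrates masked attention, not the paper's two-stage architecture:

```python
import numpy as np

def masked_self_attention(x, obs_mask):
    """Single-head self-attention over time, attending only to observed steps.

    x:        (T, d) series with zeros at missing steps
    obs_mask: (T,) 1 where the step is observed, 0 where it is missing
    """
    d = x.shape[1]
    Wq = Wk = Wv = np.eye(d)                  # identity projections for the sketch
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)             # (T, T) attention logits
    scores = np.where(obs_mask[None, :] == 1, scores, -1e9)  # hide missing keys
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v                        # each step mixes observed steps only

T, d = 6, 2
rng = np.random.default_rng(0)
x = rng.standard_normal((T, d))
obs = np.array([1, 1, 0, 1, 0, 1], dtype=float)
x = x * obs[:, None]                          # zero out missing steps
out = masked_self_attention(x, obs)           # reconstruction proposal per step
```

Because the missing keys are masked out, each output row is a convex combination of observed steps, which is what makes the pass usable as an imputation building block.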
arXiv Detail & Related papers (2025-03-03T16:58:15Z) - Multivariate Data Augmentation for Predictive Maintenance using Diffusion [35.286105732902065]
Predictive maintenance has been used to optimize system repairs in the industrial, medical, and financial domains.
There is a lack of fault data to train these models, because organizations work to keep fault occurrences and downtime to a minimum.
For newly installed systems, no fault data exists since they have yet to fail.
arXiv Detail & Related papers (2024-11-06T16:57:09Z) - Beyond Full Poisoning: Effective Availability Attacks with Partial Perturbation [8.225819874406238]
We propose a novel availability attack approach termed the Matching Attack (PMA).
PMA is the first availability attack capable of causing more than a 30% performance drop when only a portion of data can be perturbed.
Experimental results across four datasets demonstrate that PMA outperforms existing methods.
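The partial-perturbation setting can be sketched generically: choose a fraction of the training samples and add a fixed class-wise pattern to only those samples, so a model trained on the poisoned set learns the spurious shortcut and degrades on clean data. This is a generic availability-attack illustration, not PMA itself; the pattern and `eps` are arbitrary:

```python
import numpy as np

def partially_perturb(X, y, fraction=0.3, eps=0.5, rng=None):
    """Perturb only a fraction of training samples with a class-wise pattern.

    Availability attacks of this kind plant a 'shortcut' feature correlated
    with the label; only the selected subset is modified.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = X.shape
    idx = rng.choice(n, size=int(fraction * n), replace=False)
    patterns = rng.standard_normal((y.max() + 1, d))  # one fixed pattern per class
    X_poison = X.copy()
    X_poison[idx] += eps * patterns[y[idx]]           # perturb selected subset only
    return X_poison, idx

X = np.random.default_rng(1).standard_normal((100, 5))
y = np.random.default_rng(2).integers(0, 2, size=100)
Xp, poisoned_idx = partially_perturb(X, y, fraction=0.3)
```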
arXiv Detail & Related papers (2024-07-02T17:15:12Z) - Why does Prediction Accuracy Decrease over Time? Uncertain Positive
Learning for Cloud Failure Prediction [35.058991707881646]
We find that the prediction accuracy may decrease by about 9% after retraining the models.
Mitigation actions may produce uncertain positive instances, since they cannot be verified after mitigation; this can introduce more noise when updating the prediction model.
To tackle this problem, we design an Uncertain Positive Learning Risk Estimator (Uptake) approach.
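One simple way to act on this observation is to down-weight unverified "uncertain positive" labels in the training loss. The fixed weight below is a simplification for illustration; Uptake's actual risk estimator is more involved:

```python
import numpy as np

def weighted_log_loss(y_true, p_pred, uncertain_mask, uncertain_weight=0.3):
    """Logistic loss that down-weights unverified 'uncertain positive' labels.

    Instances flagged in uncertain_mask were mitigated before the failure
    could be confirmed; their positive labels are noisy, so they contribute
    less to the average loss.
    """
    eps = 1e-12
    per_sample = -(y_true * np.log(p_pred + eps)
                   + (1 - y_true) * np.log(1 - p_pred + eps))
    w = np.where(uncertain_mask == 1, uncertain_weight, 1.0)
    return float(np.mean(w * per_sample))

y = np.array([1, 1, 0, 0])
p = np.array([0.9, 0.6, 0.2, 0.1])
uncertain = np.array([0, 1, 0, 0])   # second positive was mitigated, never verified
loss = weighted_log_loss(y, p, uncertain)
```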
arXiv Detail & Related papers (2024-01-08T03:13:09Z) - GraphGuard: Detecting and Counteracting Training Data Misuse in Graph
Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z) - Towards Continually Learning Application Performance Models [1.2278517240988065]
Machine learning-based performance models are increasingly being used to inform critical job scheduling and application optimization decisions.
Traditionally, these models assume that data distribution does not change as more samples are collected over time.
We develop continually learning performance models that account for the distribution drift, alleviate catastrophic forgetting, and improve generalizability.
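"Alleviating catastrophic forgetting" is commonly approached with experience replay: keep a reservoir of past samples and mix them into each retraining batch. A sketch under that assumption (the paper's exact mechanism may differ):

```python
import numpy as np

class ReplayBuffer:
    """Reservoir-sampled buffer: mix old samples into each retraining batch.

    Replaying stored samples alongside new data is one standard way to retain
    performance on earlier distributions while adapting to drift.
    """
    def __init__(self, capacity, rng=None):
        self.capacity, self.data = capacity, []
        self.seen = 0
        self.rng = np.random.default_rng(0) if rng is None else rng

    def add(self, sample):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = self.rng.integers(0, self.seen)   # reservoir sampling keeps a
            if j < self.capacity:                 # uniform sample of the stream
                self.data[j] = sample

    def mixed_batch(self, new_batch, k):
        """New samples plus k replayed old samples for the next update."""
        k = min(k, len(self.data))
        picks = self.rng.choice(len(self.data), k, replace=False)
        return list(new_batch) + [self.data[i] for i in picks]

buf = ReplayBuffer(capacity=50)
for x in range(200):
    buf.add(x)
batch = buf.mixed_batch(new_batch=[200, 201], k=8)
```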
arXiv Detail & Related papers (2023-10-25T20:48:46Z) - A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic
Time-Series Data, Application in Combined Sewer Flow Prediction [3.3139597764446607]
In machine learning, generative models are a class of methods capable of learning data distribution to generate artificial data.
In this study, we developed a GAN model to generate synthetic time series to balance our limited recorded time series data.
The aim is to predict the flow using precipitation data and examine the impact of data augmentation using synthetic data in model performance.
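The balancing step can be sketched independently of the GAN itself: given any generator, synthesize enough minority-class series to match the majority count. `fake_gan` below is a stand-in for the trained generator:

```python
import numpy as np

def balance_with_synthetic(X, y, generate_fn, minority_label=1):
    """Top up the minority class with synthetic series until classes balance.

    generate_fn stands in for a trained (e.g. GAN) generator; any callable
    producing (n, T) arrays works for the sketch.
    """
    counts = np.bincount(y)
    deficit = counts.max() - counts[minority_label]
    if deficit == 0:
        return X, y
    X_syn = generate_fn(deficit)
    y_syn = np.full(deficit, minority_label)
    return np.vstack([X, X_syn]), np.concatenate([y, y_syn])

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 24))        # 10 series of length 24
y = np.array([0] * 8 + [1] * 2)          # heavily imbalanced classes
fake_gan = lambda n: rng.standard_normal((n, 24))
Xb, yb = balance_with_synthetic(X, y, fake_gan)
```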
arXiv Detail & Related papers (2023-01-31T16:12:26Z) - Striving for data-model efficiency: Identifying data externalities on
group performance [75.17591306911015]
Building trustworthy, effective, and responsible machine learning systems hinges on understanding how differences in training data and modeling decisions interact to impact predictive performance.
We focus on a particular type of data-model inefficiency, in which adding training data from some sources can actually lower performance evaluated on key sub-groups of the population.
Our results indicate that data-efficiency is a key component of both accurate and trustworthy machine learning.
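A data externality of this kind can be measured directly: train with and without the extra source and compare per-group accuracy; a negative delta for some group is the effect the summary describes. A toy sketch with a nearest-centroid model (all names illustrative):

```python
import numpy as np

def source_externality(model_fn, base_data, extra_source, groups_eval):
    """Per-group accuracy delta from adding an extra training source.

    model_fn(X, y) must return a predictor Xq -> labels.
    """
    def group_acc(predict):
        return {g: float(np.mean(predict(Xg) == yg))
                for g, (Xg, yg) in groups_eval.items()}
    X0, y0 = base_data
    X1, y1 = extra_source
    acc_base = group_acc(model_fn(X0, y0))
    acc_aug = group_acc(model_fn(np.vstack([X0, X1]),
                                 np.concatenate([y0, y1])))
    return {g: acc_aug[g] - acc_base[g] for g in acc_base}

def centroid_model(X, y):
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    return lambda Xq: (np.linalg.norm(Xq - c1, axis=1)
                       < np.linalg.norm(Xq - c0, axis=1)).astype(int)

rng = np.random.default_rng(0)
X0 = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(4, 0.5, (20, 2))])
y0 = np.array([0] * 20 + [1] * 20)
X1, y1 = rng.normal(2, 0.5, (20, 2)), np.ones(20, dtype=int)  # noisy extra positives
groups = {"A": (np.vstack([rng.normal(0, 0.5, (10, 2)),
                           rng.normal(4, 0.5, (10, 2))]),
                np.array([0] * 10 + [1] * 10))}
deltas = source_externality(centroid_model, (X0, y0), (X1, y1), groups)
```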
arXiv Detail & Related papers (2022-11-11T16:48:27Z) - Robust Trajectory Prediction against Adversarial Attacks [84.10405251683713]
Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving systems.
These methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions.
In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks.
arXiv Detail & Related papers (2022-07-29T22:35:05Z) - DualCF: Efficient Model Extraction Attack from Counterfactual
Explanations [57.46134660974256]
Cloud service providers have launched Machine-Learning-as-a-Service platforms to allow users to access large-scale cloud-based models via APIs.
The extra information exposed by counterfactual explanations inevitably makes these cloud models more vulnerable to extraction attacks.
We propose a novel simple yet efficient querying strategy to greatly enhance the querying efficiency to steal a classification model.
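A minimal extraction loop queries the cloud model and fits a surrogate to the returned labels; DualCF additionally exploits the counterfactual returned with each query, which this label-only sketch omits. The "cloud model" and surrogate here are toy stand-ins:

```python
import numpy as np

def extract_surrogate(query_api, n_queries=200, dim=2, rng=None):
    """Fit a nearest-centroid surrogate purely from query/label pairs.

    query_api stands in for a cloud MLaaS endpoint returning a class label.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    Xq = rng.uniform(-3, 3, size=(n_queries, dim))
    yq = np.array([query_api(x) for x in Xq])
    c0 = Xq[yq == 0].mean(axis=0)
    c1 = Xq[yq == 1].mean(axis=0)
    return lambda x: int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

# Toy "cloud model": sign of the first feature.
cloud = lambda x: int(x[0] > 0)
surrogate = extract_surrogate(cloud)
agreement = np.mean([surrogate(x) == cloud(x)
                     for x in np.random.default_rng(1).uniform(-3, 3, (500, 2))])
```

Even this label-only surrogate closely matches the toy target; counterfactual pairs straddle the decision boundary, which is why returning them makes extraction markedly cheaper.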
arXiv Detail & Related papers (2022-05-13T08:24:43Z) - Large-scale memory failure prediction using mcelog-based Data Mining and
Machine Learning [0.0]
In the data center, unexpected downtime caused by memory failures can lead to a decline in the stability of the server.
This paper compares and summarizes some commonly used skills and the improvement they can bring.
The single model we proposed won the top 15th in the 2nd Alibaba Cloud AIOps Competition.
arXiv Detail & Related papers (2021-04-24T11:38:05Z) - Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
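The entropy-raising step can be sketched as mixing overconfident predictions with the label prior, which increases their entropy while leaving moderate predictions untouched. Plain interpolation here is a stand-in for the paper's conditional procedure, and the threshold is illustrative:

```python
import numpy as np

def entropy(p, axis=-1):
    return -np.sum(p * np.log(p + 1e-12), axis=axis)

def calibrate_toward_prior(probs, prior, conf_threshold=0.95, alpha=0.5):
    """Raise the entropy of overconfident predictions toward the label prior.

    Rows whose max probability exceeds conf_threshold are interpolated with
    the prior; since the prior sums to 1, rows stay normalized.
    """
    probs = np.asarray(probs, dtype=float)
    overconfident = probs.max(axis=1) > conf_threshold
    mixed = (1 - alpha) * probs + alpha * prior
    return np.where(overconfident[:, None], mixed, probs)

prior = np.array([0.5, 0.5])
probs = np.array([[0.99, 0.01],    # overconfident: will be softened
                  [0.70, 0.30]])   # moderate: left unchanged
out = calibrate_toward_prior(probs, prior)
```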
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.