Exploring Error Bits for Memory Failure Prediction: An In-Depth
Correlative Study
- URL: http://arxiv.org/abs/2312.02855v2
- Date: Mon, 18 Dec 2023 15:30:26 GMT
- Title: Exploring Error Bits for Memory Failure Prediction: An In-Depth
Correlative Study
- Authors: Qiao Yu, Wengui Zhang, Jorge Cardoso and Odej Kao
- Abstract summary: We present a comprehensive study on the correlation between CEs and UEs.
Our analysis reveals a strong correlation between spatio-temporal error bits and UE occurrence.
Our approach effectively reduces the number of virtual machine interruptions caused by UEs by approximately 59%.
- Score: 5.292618442300404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In large-scale datacenters, memory failure is a common cause of server
crashes, with Uncorrectable Errors (UEs) being a major indicator of Dual Inline
Memory Module (DIMM) defects. Existing approaches primarily focus on predicting
UEs using Correctable Errors (CEs), without fully considering the information
provided by error bits. However, error bit patterns have a strong correlation
with the occurrence of UEs. In this paper, we present a comprehensive study on
the correlation between CEs and UEs, specifically emphasizing the importance of
spatio-temporal error bit information. Our analysis reveals a strong
correlation between spatio-temporal error bits and UE occurrence. Through
evaluations using real-world datasets, we demonstrate that our approach
significantly improves prediction performance by 15% in F1-score compared to
the state-of-the-art algorithms. Overall, our approach effectively reduces the
number of virtual machine interruptions caused by UEs by approximately 59%.
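The abstract's core idea is to predict UEs from spatio-temporal error-bit information in the CE history. As a minimal sketch of what such features might look like, the toy function below counts the distinct error bits and affected rows seen within a trailing time window; the `CE` record layout, field names, and thresholds are all assumptions for illustration, not the paper's actual feature set.

```python
from collections import namedtuple

# Hypothetical CE record: timestamp (seconds), DRAM row, and the set of
# corrupted bit positions reported for that correctable error.
CE = namedtuple("CE", ["ts", "row", "bits"])

def spatio_temporal_features(ces, window_s=3600):
    """Count distinct error bits and affected rows within the trailing
    time window -- a toy proxy for spatio-temporal error-bit indicators."""
    if not ces:
        return {"bit_count": 0, "row_count": 0}
    latest = max(ce.ts for ce in ces)
    recent = [ce for ce in ces if latest - ce.ts <= window_s]
    bits, rows = set(), set()
    for ce in recent:
        bits.update(ce.bits)
        rows.add(ce.row)
    return {"bit_count": len(bits), "row_count": len(rows)}

# Three CEs within one hour touching several bits and rows: a wider
# spread would feed a predictor as a signal of elevated UE risk.
events = [CE(0, 12, {3}), CE(1800, 12, {3, 5}), CE(3000, 40, {7})]
print(spatio_temporal_features(events))
```

A real predictor would combine many such window statistics (per bank, per column, per DQ pin) as model inputs.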
Related papers
- Investigating Memory Failure Prediction Across CPU Architectures [8.477622236186695]
We investigate the correlation between Correctable Errors (CEs) and Uncorrectable Errors (UEs) across different CPU architectures.
Our analysis identifies unique patterns of memory failure associated with each processor platform.
We conduct memory failure prediction on different processor platforms, achieving up to 15% improvement in F1-score compared to existing algorithms.
arXiv Detail & Related papers (2024-06-08T05:10:23Z)
- Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation [17.263160921956445]
We propose to use marginal L1 average calibration error (mL1-ACE) as a novel auxiliary loss function to improve pixel-wise calibration without compromising segmentation quality.
We show that this loss, despite using hard binning, is directly differentiable, bypassing the need for approximate but differentiable surrogate or soft binning approaches.
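To make the calibration-error idea concrete, here is a plain (non-differentiable) sketch of hard-binned L1 average calibration error: average, over non-empty bins, of |mean confidence − accuracy|. The function name and binning details are illustrative assumptions; the paper's mL1-ACE is a differentiable marginal variant used as an auxiliary training loss.

```python
def l1_ace(confidences, correct, n_bins=10):
    """Average over non-empty confidence bins of |mean confidence - accuracy|."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # hard binning
        bins[idx].append((c, ok))
    errs = []
    for b in bins:
        if b:
            conf = sum(c for c, _ in b) / len(b)   # mean confidence in bin
            acc = sum(ok for _, ok in b) / len(b)  # empirical accuracy in bin
            errs.append(abs(conf - acc))
    return sum(errs) / len(errs)

# Overconfident high bin and near-calibrated mid bin both contribute.
print(l1_ace([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0]))
```

The paper's contribution is showing that the hard-binned version can still be differentiated directly, avoiding soft-binning surrogates.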
arXiv Detail & Related papers (2024-03-11T14:31:03Z)
- BEC: Bit-Level Static Analysis for Reliability against Soft Errors [0.26107298043931204]
We propose a bit-level error coalescing (BEC) static program analysis to understand and improve program reliability against soft errors.
BEC analysis tracks each bit corruption in the register file and classifies the effect of the corruption by its semantics at compile time.
The proposed method is generic and not limited to a specific computer architecture.
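One intuition behind bit-level classification is that some bit corruptions are masked by later operations and can never reach the program output. The sketch below shows this for a single masking instruction (`and rd, rs, imm`): bits of the source register cleared by the immediate are dead, so a soft-error flip there is benign. This is a toy single-instruction illustration, not the BEC analysis itself, and the function names are assumptions.

```python
def masked_bits(mask, width=32):
    """Bit positions of the source register that an AND with `mask`
    clears -- a flip in these bits cannot affect the result."""
    return {b for b in range(width) if not (mask >> b) & 1}

def flip_is_benign(bit, mask):
    """Classify a single-bit flip in the source of 'and rd, rs, mask'."""
    return bit in masked_bits(mask)

# and r1, r0, 0xFF keeps only the low byte of r0:
print(flip_is_benign(12, 0xFF))  # bit 12 is masked out -> benign
print(flip_is_benign(3, 0xFF))   # bit 3 reaches the result -> not benign
```

A full analysis like BEC would propagate such per-bit liveness through the whole program at compile time.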
arXiv Detail & Related papers (2024-01-11T09:03:47Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy in reproducing fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which only implements a limited set of error models.
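The core idea of error-model-driven simulation is to corrupt an operator's output according to a probabilistic error model instead of injecting faults into the hardware. Below is a minimal sketch under assumed names: `inject` applies a caller-supplied corruption pattern to one element of an operator's output with a given probability. This illustrates the mechanism only, not the paper's engine or its fault models.

```python
import random

def inject(output, corrupt, rate, rng):
    """With probability `rate`, apply the corruption pattern `corrupt`
    to one randomly chosen element of a CNN operator's output."""
    out = list(output)
    if rng.random() < rate:
        idx = rng.randrange(len(out))
        out[idx] = corrupt(out[idx])
    return out

# Zero out one activation with certainty (rate=1.0), reproducibly seeded.
faulty = inject([1.0, 2.0, 3.0, 4.0], lambda v: 0.0, 1.0, random.Random(0))
print(faulty)
```

Real error models would encode the corruption patterns observed for each operator type under actual fault injection, which is what makes simulation both fast and faithful.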
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - Tightening the Approximation Error of Adversarial Risk with Auto Loss
Function Search [12.263913626161155]
A common type of evaluation is to approximate the adversarial risk of a model as a robustness indicator.
We propose AutoLoss-AR, the first method for searching for loss functions that tighten this approximation error.
The results demonstrate the effectiveness of the proposed methods.
arXiv Detail & Related papers (2021-11-09T11:47:43Z)
- Discriminative-Generative Dual Memory Video Anomaly Detection [81.09977516403411]
Recently, researchers have tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during the training process.
We propose a DiscRiminative-gEnerative duAl Memory (DREAM) anomaly detection model to take advantage of a few anomalies and solve data imbalance.
arXiv Detail & Related papers (2021-04-29T15:49:01Z)
- Collaborative Boundary-aware Context Encoding Networks for Error Map Prediction [65.44752447868626]
We propose collaborative boundary-aware context encoding networks, called AEP-Net, for the error map prediction task.
Specifically, we propose a collaborative feature transformation branch for better feature fusion between images and masks, and precise localization of error regions.
The AEP-Net achieves average DSCs of 0.8358 and 0.8164 on the error prediction task, and shows a high Pearson correlation coefficient of 0.9873.
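The DSC reported above is the Dice similarity coefficient, a standard overlap metric for segmentation masks: 2*|intersection| / (|pred| + |target|). A minimal sketch over flat binary masks:

```python
def dice(pred, target):
    """Dice similarity coefficient over binary masks:
    2 * |pred AND target| / (|pred| + |target|)."""
    inter = sum(p and t for p, t in zip(pred, target))
    return 2 * inter / (sum(pred) + sum(target))

# Half of each mask overlaps the other -> DSC = 0.5.
print(dice([1, 1, 0, 0], [1, 0, 1, 0]))
```

DSC ranges from 0 (no overlap) to 1 (identical masks), so values above 0.8, as reported for AEP-Net, indicate substantial agreement with the ground-truth error maps.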
arXiv Detail & Related papers (2020-06-25T12:42:01Z)
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.