Machine Learning for Consistency Violation Faults Analysis
- URL: http://arxiv.org/abs/2506.02002v1
- Date: Tue, 20 May 2025 22:11:43 GMT
- Title: Machine Learning for Consistency Violation Faults Analysis
- Authors: Kamal Giri, Amit Garu
- Abstract summary: This study presents a machine learning-based approach for analyzing the impact of consistency violation faults (CVFs) on distributed systems. By computing program transition ranks and their corresponding effects, the proposed method quantifies the influence of CVFs on system behavior. Experimental results demonstrate promising performance, with a test loss of 4.39 and a mean absolute error of 1.5.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed systems frequently encounter consistency violation faults (CVFs), where nodes operate on outdated or inaccurate data, adversely affecting convergence and overall system performance. This study presents a machine learning-based approach for analyzing the impact of CVFs, using Dijkstra's Token Ring problem as a case study. By computing program transition ranks and their corresponding effects, the proposed method quantifies the influence of CVFs on system behavior. To address the state space explosion encountered in larger graphs, two models are implemented: a Feedforward Neural Network (FNN) and a distributed neural network leveraging TensorFlow's tf.distribute API. These models are trained on datasets generated from smaller graphs (3 to 10 nodes) to predict parameters essential for determining rank effects. Experimental results demonstrate promising performance, with a test loss of 4.39 and a mean absolute error of 1.5. Although distributed training on a CPU did not yield significant speed improvements over a single-device setup, the findings suggest that scalability could be enhanced through the use of advanced hardware accelerators such as GPUs or TPUs.
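The abstract names two concrete pieces: a plain FNN regressor and a tf.distribute-based variant of the same model. A minimal sketch of that setup follows; the layer sizes, input features, synthetic data, and the choice of MirroredStrategy are illustrative assumptions, since the abstract does not specify them.

```python
# Minimal sketch of the paper's two-model setup; layer sizes, features,
# synthetic data, and MirroredStrategy are illustrative assumptions.
import numpy as np
import tensorflow as tf

def build_fnn(input_dim: int) -> tf.keras.Model:
    # Plain feedforward regressor predicting a rank-effect parameter.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),  # predicted rank-effect value
    ])

# Hypothetical training data derived from small graphs (3-10 nodes):
# each row encodes a program transition, the target is its rank effect.
X = np.random.rand(1024, 8).astype("float32")
y = np.random.rand(1024, 1).astype("float32")

# Single-device baseline (the FNN).
fnn = build_fnn(X.shape[1])
fnn.compile(optimizer="adam", loss="mse", metrics=["mae"])
fnn.fit(X, y, epochs=3, batch_size=32, verbose=0)

# Distributed variant: the same model built under a tf.distribute strategy.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    dist_model = build_fnn(X.shape[1])
    dist_model.compile(optimizer="adam", loss="mse", metrics=["mae"])
dist_model.fit(X, y, epochs=3, batch_size=32, verbose=0)

print(fnn.evaluate(X, y, verbose=0))  # [loss, mae]
```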
Related papers
- Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures [5.572536027964037]
This paper addresses the challenges of fault prediction and delayed response in distributed systems. We use a Gated Recurrent Unit (GRU) to model the evolution of system states over time. An attention mechanism is then applied to enhance key temporal segments, improving the model's ability to identify potential faults.
arXiv Detail & Related papers (2025-05-27T04:31:12Z)
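A minimal Keras sketch of the described architecture, a GRU over system-state sequences followed by attention over time steps; the sequence length, feature count, additive-attention form, and binary fault label are assumptions not stated in the summary.

```python
# Hedged sketch: GRU over system-state sequences plus a simple attention
# over time steps; dimensions and the binary fault label are assumed.
import tensorflow as tf

T, F = 50, 16  # time steps, features per step (illustrative)
inputs = tf.keras.layers.Input(shape=(T, F))
# GRU returns the full hidden-state sequence so attention can weight it.
h = tf.keras.layers.GRU(32, return_sequences=True)(inputs)
# Score each time step, softmax into attention weights, pool the states.
scores = tf.keras.layers.Dense(1)(h)               # (batch, T, 1)
weights = tf.keras.layers.Softmax(axis=1)(scores)  # over the time axis
context = tf.keras.layers.Lambda(
    lambda hw: tf.reduce_sum(hw[0] * hw[1], axis=1))([h, weights])
output = tf.keras.layers.Dense(1, activation="sigmoid")(context)  # fault prob.

model = tf.keras.Model(inputs, output)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```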
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
Backpropagation (BP) has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
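For intuition, here is a small NumPy sketch of plain direct feedback alignment (DFA) on an MLP: the output error reaches the hidden layer through a fixed random feedback matrix rather than the transposed forward weights that backpropagation would use. All sizes, data, and the squared-error loss are illustrative, and the sketch does not reproduce the GNN-specific parts of DFA-GNN.

```python
# Hedged NumPy sketch of direct feedback alignment (DFA) on a tiny MLP.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 8, 32, 1, 0.01
W1 = rng.normal(0, 0.1, (n_in, n_hid))
W2 = rng.normal(0, 0.1, (n_hid, n_out))
B1 = rng.normal(0, 0.1, (n_out, n_hid))  # fixed random feedback matrix

X = rng.random((256, n_in))
y = rng.random((256, n_out))

for _ in range(100):
    h = np.tanh(X @ W1)          # hidden activations
    y_hat = h @ W2               # linear output
    e = y_hat - y                # output error
    # DFA: project the output error straight to the hidden layer via B1,
    # instead of backpropagating through W2.T.
    dh = (e @ B1) * (1 - h**2)   # tanh derivative
    W2 -= lr * h.T @ e / len(X)
    W1 -= lr * X.T @ dh / len(X)

print(float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2)))
```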
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates edge devices can transmit (i.e., model, gradient, or model difference), we reveal that transmitting local updates in AirFedAvg may introduce an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
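A toy NumPy sketch of the AirComp idea the summary refers to: simultaneously transmitted local updates are superposed by the channel, so the server receives only a noisy sum, and rescaling it yields an inexact average. Device count, noise level, and the absence of fading are simplifying assumptions.

```python
# Hedged sketch of over-the-air aggregation: the channel sums the analog
# signals, so the server never sees individual updates, only a noisy sum.
import numpy as np

rng = np.random.default_rng(0)
num_devices, dim, noise_std = 10, 100, 0.05

local_updates = [rng.normal(size=dim) for _ in range(num_devices)]

# Ideal FedAvg aggregation: an exact average of local updates.
exact_avg = np.mean(local_updates, axis=0)

# AirComp: the superposed signal arrives with additive receiver noise;
# the server rescales by the device count to estimate the average.
received = np.sum(local_updates, axis=0) + rng.normal(0, noise_std, dim)
air_avg = received / num_devices

# Aggregation error introduced by the analog superposition.
print(np.linalg.norm(air_avg - exact_avg))
```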
- Deep Graph Stream SVDD: Anomaly Detection in Cyber-Physical Systems [17.373668215331737]
We propose a new approach, deep graph stream support vector data description (SVDD), for anomaly detection.
We first use a transformer to preserve both short- and long-term temporal patterns in the monitoring data as temporal embeddings.
We cluster these embeddings according to sensor type and utilize them to estimate the change in connectivity between various sensors to construct a new weighted graph.
arXiv Detail & Related papers (2023-02-24T22:14:39Z)
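A hedged sketch of the SVDD component of such a method: embeddings are pulled toward a center, and anomaly scores are distances to that center. The Dense stub stands in for the paper's transformer/graph encoder, which is not reproduced here.

```python
# Hedged sketch of the (deep) SVDD objective on precomputed embeddings.
import numpy as np
import tensorflow as tf

emb_dim = 16
encoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(emb_dim,)),
    tf.keras.layers.Dense(8),  # stand-in for the paper's encoder
])

X = np.random.rand(512, emb_dim).astype("float32")
center = tf.reduce_mean(encoder(X), axis=0)  # common SVDD initialization

opt = tf.keras.optimizers.Adam(1e-3)
for _ in range(50):
    with tf.GradientTape() as tape:
        z = encoder(X)
        # SVDD loss: mean squared distance of embeddings to the center.
        loss = tf.reduce_mean(tf.reduce_sum((z - center) ** 2, axis=1))
    grads = tape.gradient(loss, encoder.trainable_variables)
    opt.apply_gradients(zip(grads, encoder.trainable_variables))

scores = tf.reduce_sum((encoder(X) - center) ** 2, axis=1)  # anomaly scores
print(float(tf.reduce_max(scores)))
```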
- CUTS: Neural Causal Discovery from Irregular Time-Series Data [27.06531262632836]
Causal discovery from time-series data has been a central task in machine learning.
We present CUTS, a neural Granger causal discovery algorithm to jointly impute unobserved data points and build causal graphs.
Our approach constitutes a promising step towards applying causal discovery to real applications with non-ideal observations.
arXiv Detail & Related papers (2023-02-15T04:16:34Z)
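To make the Granger-discovery idea concrete, here is a toy sketch: each series is regressed on lagged values of all series through a learnable adjacency with an L1 penalty, and a crude mean-fill stands in for CUTS's learned imputation. The data, lag order, and penalty weight are invented.

```python
# Hedged sketch of sparse lagged regression as neural Granger discovery;
# CUTS additionally alternates this with learned imputation, not shown.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
T, n = 200, 4
X = rng.normal(size=(T, n)).astype("float32")
X[1:, 1] += 0.8 * X[:-1, 0]                   # ground truth: series 0 -> 1
mask = rng.random((T, n)) > 0.1               # ~10% of points unobserved
X_filled = np.where(mask, X, X.mean()).astype("float32")  # crude mean-fill

A = tf.Variable(tf.zeros((n, n)))             # learnable lag-1 adjacency
opt = tf.keras.optimizers.Adam(1e-2)
past = tf.constant(X_filled[:-1])
future = tf.constant(X_filled[1:])
for _ in range(500):
    with tf.GradientTape() as tape:
        pred = tf.matmul(past, A)             # predict each step from lags
        loss = (tf.reduce_mean((pred - future) ** 2)
                + 0.05 * tf.reduce_sum(tf.abs(A)))  # L1 sparsity on the graph
    opt.apply_gradients([(tape.gradient(loss, A), A)])

print(np.round(A.numpy(), 2))  # entry [0, 1] should dominate
```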
- Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure.
We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z)
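A small NumPy sketch of node-personalized propagation: features propagated for 0..K hops are precomputed and each node reads out the features at its own assigned depth. Using node degree to assign that depth is an illustrative stand-in for the paper's topology-based rule.

```python
# Hedged sketch of personalized propagation order for GNN inference.
import numpy as np

rng = np.random.default_rng(0)
n, d, max_steps = 6, 4, 3
A = (rng.random((n, n)) < 0.4).astype(float)
np.fill_diagonal(A, 1.0)                  # self-loops
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalized propagation

X = rng.random((n, d))
# Precompute features after 0..max_steps propagation hops.
propagated = [X]
for _ in range(max_steps):
    propagated.append(A_hat @ propagated[-1])

# Illustrative personalized order: higher-degree nodes propagate less.
degree = A.sum(axis=1)
steps = np.clip(max_steps - degree.astype(int) // 2, 0, max_steps)

# Each node takes its features from its own propagation depth.
X_out = np.stack([propagated[steps[i]][i] for i in range(n)])
print(steps, X_out.shape)
```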
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAIN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
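A heavily simplified GAIN-style sketch for gridded fields: a convolutional generator fills masked cells and a convolutional discriminator predicts which cells were observed. The grid size, two-layer networks, loss weights, and single training step are assumptions; this is not the Conv-GAIN architecture.

```python
# Hedged, heavily simplified GAIN-style imputation sketch on a 2D grid.
import tensorflow as tf

H = W = 16

def conv_net(in_ch, out_act):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(H, W, in_ch)),
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(1, 3, padding="same", activation=out_act),
    ])

G = conv_net(2, "sigmoid")   # generator sees the masked field + mask
D = conv_net(1, "sigmoid")   # discriminator sees only the completed field
g_opt, d_opt = tf.keras.optimizers.Adam(1e-3), tf.keras.optimizers.Adam(1e-3)
bce = tf.keras.losses.BinaryCrossentropy()

x = tf.random.uniform((8, H, W, 1))                             # toy fields
m = tf.cast(tf.random.uniform((8, H, W, 1)) > 0.3, tf.float32)  # 1 = observed

with tf.GradientTape() as gt, tf.GradientTape() as dt:
    x_hat = G(tf.concat([x * m, m], axis=-1))  # generator imputes the gaps
    x_comp = x * m + x_hat * (1 - m)           # completed field
    d_prob = D(x_comp)                         # per-cell "observed" probability
    d_loss = bce(m, d_prob)                    # D tries to recover the mask
    g_loss = (bce(tf.ones_like(m), d_prob)     # G tries to fool D everywhere
              + 10.0 * tf.reduce_mean(m * (x - x_hat) ** 2))  # fit observed cells
g_opt.apply_gradients(zip(gt.gradient(g_loss, G.trainable_variables),
                          G.trainable_variables))
d_opt.apply_gradients(zip(dt.gradient(d_loss, D.trainable_variables),
                          D.trainable_variables))
print(float(g_loss), float(d_loss))
```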
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
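A small sketch of fan-out neighborhood sampling, the core of the mini-batch GNN training the summary describes: each hop samples at most a fixed number of neighbors per frontier node, bounding the subgraph a batch touches. The toy adjacency list and fan-outs are invented.

```python
# Hedged sketch of fan-out neighborhood sampling for mini-batch GNN training.
import numpy as np

rng = np.random.default_rng(0)
num_nodes = 1000
# Toy adjacency list: each node gets some random neighbors.
adj = {v: rng.integers(0, num_nodes, size=rng.integers(1, 20)).tolist()
       for v in range(num_nodes)}

def sample_blocks(seeds, fanouts):
    """Return the sampled node frontier for each hop, seeds first."""
    layers = [list(seeds)]
    frontier = list(seeds)
    for fanout in fanouts:
        sampled = set()
        for v in frontier:
            neighbors = adj[v]
            k = min(fanout, len(neighbors))
            # Sample at most `fanout` neighbors instead of taking them all.
            sampled.update(rng.choice(neighbors, size=k, replace=False))
        frontier = list(sampled)
        layers.append(frontier)
    return layers

blocks = sample_blocks(seeds=range(32), fanouts=[10, 5])  # 2-hop sampling
print([len(b) for b in blocks])  # sizes stay bounded by seeds * fan-outs
```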
- Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch.
arXiv Detail & Related papers (2021-02-08T05:55:47Z)
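The N:M rule itself is easy to show: in every group of M consecutive weights, keep the N largest in magnitude. The sketch below applies a 2:4 mask once; actual training from scratch re-derives the mask as weights change, which is not shown.

```python
# Hedged sketch of an N:M (here 2:4) structured sparsity mask.
import numpy as np

def nm_mask(w, n=2, m=4):
    flat = w.reshape(-1, m)                  # groups of m consecutive weights
    # Rank magnitudes within each group; keep the top-n per group.
    idx = np.argsort(np.abs(flat), axis=1)[:, -n:]
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, idx, 1.0, axis=1)
    return mask.reshape(w.shape)

w = np.random.randn(8, 8).astype("float32")
mask = nm_mask(w)
w_sparse = w * mask
# Every group of 4 has exactly 2 nonzeros -> 50% structured sparsity.
print(mask.reshape(-1, 4).sum(axis=1))
```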
- Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks [0.17499351967216337]
We provide a machine learning-based method, PerfNetV2, which improves the accuracy of our previous work for modeling the neural network performance on a variety of GPU accelerators.
Given an application, the proposed method can be used to predict the inference time and training time of the convolutional neural networks used in the application.
Our case studies show that PerfNetV2 yields a mean absolute percentage error within 13.1% on LeNet, AlexNet, and VGG16 on NVIDIA GTX-1080Ti, while the error rate of a previous work published in ICBD 2018 could be as large as 200%.
arXiv Detail & Related papers (2020-12-01T01:42:23Z)
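In the PerfNetV2 spirit, a toy sketch of platform-aware performance modeling: regress measured layer times on simple layer/platform features and evaluate with the mean absolute percentage error (MAPE) metric the summary quotes. The features, toy measurements, and small regressor are invented stand-ins for the paper's model.

```python
# Hedged sketch: regress toy layer times on invented features, report MAPE.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
# Features per layer: e.g. FLOPs, input size, filter count, batch size.
X = rng.random((512, 4)).astype("float32")
t = (X @ np.array([3.0, 1.0, 2.0, 0.5], dtype="float32")
     + 0.1 * rng.random(512).astype("float32"))  # toy measured times (ms)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),  # predicted layer time
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, t, epochs=20, batch_size=64, verbose=0)

pred = model.predict(X, verbose=0).ravel()
mape = 100.0 * np.mean(np.abs(pred - t) / np.abs(t))  # the paper's metric
print(f"MAPE: {mape:.1f}%")
```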
This list is automatically generated from the titles and abstracts of the papers on this site.