Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism
- URL: http://arxiv.org/abs/2411.02086v1
- Date: Mon, 04 Nov 2024 13:49:06 GMT
- Title: Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism
- Authors: Fan Wu, Muhammad Bilal, Haolong Xiang, Heng Wang, Jinjun Yu, Xiaolong Xu,
- Abstract summary: An edge-cloud collaborative early-warning system is proposed to enable real-time and downtime-tolerant fault diagnosis of RTMs.
Our ensemble-based fault diagnosis model achieves a remarkable 97.4% accuracy on a real-world dataset collected by Nanjing Metro in Jiangsu Province, China.
- Score: 9.544236630555627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Railway Turnout Machines (RTMs) are mission-critical components of the railway transportation infrastructure, responsible for directing trains onto desired tracks. For safety assurance applications, especially in early-warning scenarios, RTM faults are expected to be detected as early as possible on a continuous 7x24 basis. However, limited emphasis has been placed on distributed model inference frameworks that can meet the inference latency and reliability requirements of such mission critical fault diagnosis systems. In this paper, an edge-cloud collaborative early-warning system is proposed to enable real-time and downtime-tolerant fault diagnosis of RTMs, providing a new paradigm for the deployment of models in safety-critical scenarios. Firstly, a modular fault diagnosis model is designed specifically for distributed deployment, which utilizes a hierarchical architecture consisting of the prior knowledge module, subordinate classifiers, and a fusion layer for enhanced accuracy and parallelism. Then, a cloud-edge collaborative framework leveraging pipeline parallelism, namely CEC-PA, is developed to minimize the overhead resulting from distributed task execution and context exchange by strategically partitioning and offloading model components across cloud and edge. Additionally, an election consensus mechanism is implemented within CEC-PA to ensure system robustness during coordinator node downtime. Comparative experiments and ablation studies are conducted to validate the effectiveness of the proposed distributed fault diagnosis approach. Our ensemble-based fault diagnosis model achieves a remarkable 97.4% accuracy on a real-world dataset collected by Nanjing Metro in Jiangsu Province, China. Meanwhile, CEC-PA demonstrates superior recovery proficiency during node disruptions and speed-up ranging from 1.98x to 7.93x in total inference time compared to its counterparts.
Related papers
- Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting [52.6508222408558]
We introduce Elucidated Rolling Diffusion Models (ERDM)<n>ERDM is the first framework to unify a rolling forecast structure with the principled, performant design of Elucidated Diffusion Models (EDM)<n>On 2D Navier-Stokes simulations and ERA5 global weather forecasting at 1.5circ resolution, ERDM consistently outperforms key diffusion-based baselines.
arXiv Detail & Related papers (2025-06-24T21:44:31Z) - Efficient Beam Selection for ISAC in Cell-Free Massive MIMO via Digital Twin-Assisted Deep Reinforcement Learning [37.540612510652174]
We derive the distribution of joint target detection probabilities across multiple receiving APs under false alarm rate constraints.<n>We then formulate the beam selection procedure as a Markov decision process (MDP)<n>To eliminate the high costs and associated risks of real-time agent-environment interactions, we propose a novel digital twin (DT)-assisted offline DRL approach.
arXiv Detail & Related papers (2025-06-23T12:17:57Z) - AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning [61.28113271728859]
RAG has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs)<n>Standard RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions.<n>In this work, we reinterpret RAG as Retrieval-Augmented Reasoning and identify a central but underexplored problem: textitReasoning Misalignment.
arXiv Detail & Related papers (2025-04-21T04:56:47Z) - Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis [55.13545823385091]
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents.
In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches.
We show that even moderate levels of information sharing can significantly mitigate environment-specific errors.
arXiv Detail & Related papers (2025-03-21T18:06:28Z) - An Optimal Cascade Feature-Level Spatiotemporal Fusion Strategy for Anomaly Detection in CAN Bus [2.8151714475955263]
Intelligent transportation systems (ITS) play a pivotal role in modern infrastructure but face security risks due to the broadcast-based nature of the in-vehicle Controller Area Network (CAN) buses.<n>The proposed framework detects all attack types with 100% accuracy, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2025-01-31T00:36:08Z) - Spatial-Temporal Bearing Fault Detection Using Graph Attention Networks and LSTM [0.7864304771129751]
This paper introduces a novel method that combines Graph Attention Network (GAT) and Long Short-Term Memory (LSTM) networks.
This approach captures both spatial and temporal dependencies within sensor data, improving the accuracy of bearing fault detection.
arXiv Detail & Related papers (2024-10-15T12:55:57Z) - DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in-temporal skeletal data.
It employs the frame attention mechanism to identify the most significant frames and the skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and flops.
arXiv Detail & Related papers (2024-06-05T06:18:03Z) - A Sparse Bayesian Learning for Diagnosis of Nonstationary and Spatially
Correlated Faults with Application to Multistation Assembly Systems [3.4991031406102238]
This article proposes a novel fault diagnosis method: clustering spatially correlated sparse Bayesian learning (CSSBL)
The proposed method's efficacy is provided through numerical and real-world case studies utilizing an actual autobody assembly system.
The generalizability of the proposed method allows the technique to be applied in fault diagnosis in other domains, including communication and healthcare systems.
arXiv Detail & Related papers (2023-10-20T23:56:53Z) - Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z) - Learned Risk Metric Maps for Kinodynamic Systems [54.49871675894546]
We present Learned Risk Metric Maps for real-time estimation of coherent risk metrics of high dimensional dynamical systems.
LRMM models are simple to design and train, requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator.
arXiv Detail & Related papers (2023-02-28T17:51:43Z) - System Resilience through Health Monitoring and Reconfiguration [56.448036299746285]
We demonstrate an end-to-end framework to improve the resilience of man-made systems to unforeseen events.
The framework is based on a physics-based digital twin model and three modules tasked with real-time fault diagnosis, prognostics and reconfiguration.
arXiv Detail & Related papers (2022-08-30T20:16:17Z) - Asynchronous Decentralized Federated Learning for Collaborative Fault
Diagnosis of PV Stations [0.0]
A novel asynchronous decentralized federated learning (ADFL) framework is proposed to train a collaborative fault diagnosis model.
The global model is aggregated distributedly to avoid central node failure.
Both the experiments and numerical simulations are carried out to verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-02-28T08:26:48Z) - PreGAN: Preemptive Migration Prediction Network for Proactive
Fault-Tolerant Edge Computing [12.215537834860699]
We propose PreGAN, a composite AI model using a Generative Adrial Network (GAN) to predict preemptive migration decisions for proactive fault-tolerance.
PreGAN can outperform state-of-the-art baseline methods in fault-detection, diagnosis and classification, thus achieving high quality of service.
arXiv Detail & Related papers (2021-12-04T09:40:50Z) - A Novel Deep Parallel Time-series Relation Network for Fault Diagnosis [7.127292365993219]
A fault diagnosis model named deep parallel time-series relation network(textitDPTRN) has been proposed in this paper.
Our model outperforms other methods on both TE and KDD-CUP99 datasets.
arXiv Detail & Related papers (2021-12-03T08:24:59Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Higher Performance Visual Tracking with Dual-Modal Localization [106.91097443275035]
Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy.
We propose a dual-modal framework for target localization, consisting of robust localization suppressingors via ONR and the accurate localization attending to the target center precisely via OFC.
arXiv Detail & Related papers (2021-03-18T08:47:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.