Transformer-Based Bearing Fault Detection using Temporal Decomposition Attention Mechanism
- URL: http://arxiv.org/abs/2412.11245v1
- Date: Sun, 15 Dec 2024 16:51:31 GMT
- Title: Transformer-Based Bearing Fault Detection using Temporal Decomposition Attention Mechanism
- Authors: Marzieh Mirzaeibonehkhater, Mohammad Ali Labbaf-Khaniki, Mohammad Manthouri
- Abstract summary: Bearing fault detection is a critical task in predictive maintenance, where accurate and timely fault identification can prevent costly downtime and equipment damage. Traditional attention mechanisms in Transformer neural networks often struggle to capture the complex temporal patterns in bearing vibration data, leading to suboptimal performance. We propose a novel attention mechanism, Temporal Decomposition Attention (TDA), which combines temporal bias encoding with seasonal-trend decomposition to capture both long-term dependencies and periodic fluctuations in time series data.
- Score: 0.40964539027092917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bearing fault detection is a critical task in predictive maintenance, where accurate and timely fault identification can prevent costly downtime and equipment damage. Traditional attention mechanisms in Transformer neural networks often struggle to capture the complex temporal patterns in bearing vibration data, leading to suboptimal performance. To address this limitation, we propose a novel attention mechanism, Temporal Decomposition Attention (TDA), which combines temporal bias encoding with seasonal-trend decomposition to capture both long-term dependencies and periodic fluctuations in time series data. Additionally, we incorporate the Hull Exponential Moving Average (HEMA) for feature extraction, enabling the model to effectively capture meaningful characteristics from the data while reducing noise. Our approach integrates TDA into the Transformer architecture, allowing the model to focus separately on the trend and seasonal components of the data. Experimental results on the Case Western Reserve University (CWRU) bearing fault detection dataset demonstrate that our approach outperforms traditional attention mechanisms and achieves state-of-the-art performance in terms of accuracy and interpretability. The HEMA-Transformer-TDA model achieves an accuracy of 98.1%, with exceptional precision, recall, and F1-scores, demonstrating its effectiveness in bearing fault detection and its potential for application in other time series tasks with seasonal patterns or trends.
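The abstract refers to seasonal-trend decomposition without giving details. As a rough illustration of that general idea only, the sketch below splits a series into a moving-average trend and a residual "seasonal" part. It is not the paper's TDA or HEMA; the function names `moving_average` and `decompose`, the edge-padding choice, and the window size are all assumptions made for this example.

```python
import math
from typing import List, Tuple


def moving_average(x: List[float], window: int) -> List[float]:
    """Centered moving average; edges are padded by repeating the end values."""
    if not x:
        return []
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(x))]


def decompose(x: List[float], window: int) -> Tuple[List[float], List[float]]:
    """Split x into (trend, seasonal), where seasonal = x - trend."""
    trend = moving_average(x, window)
    seasonal = [xi - ti for xi, ti in zip(x, trend)]
    return trend, seasonal


if __name__ == "__main__":
    # Synthetic signal (hypothetical data): slow linear drift plus a period-8 oscillation.
    signal = [0.05 * t + math.sin(2 * math.pi * t / 8) for t in range(64)]
    trend, seasonal = decompose(signal, window=9)
    print(len(trend), len(seasonal))
```

In a decomposition-based attention design, the two outputs would typically be fed to separate branches; how the paper does this is not specified in the abstract.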
Related papers
- MEMTS: Internalizing Domain Knowledge via Parameterized Memory for Retrieval-Free Domain Adaptation of Time Series Foundation Models [51.506429027626005]
Memory for Time Series (MEMTS) is a lightweight and plug-and-play method for retrieval-free domain adaptation in time series forecasting. Key component of MEMTS is a Knowledge Persistence Module (KPM), which internalizes domain-specific temporal dynamics. This paradigm shift enables MEMTS to achieve accurate domain adaptation with constant-time inference and near-zero latency.
arXiv Detail & Related papers (2026-02-14T14:00:06Z) - Real-Time Proactive Anomaly Detection via Forward and Backward Forecast Modeling [0.0]
We introduce two proactive anomaly detection frameworks: the Forward Forecasting Model (FFM) and the Backward Reconstruction Model (BRM). FFM forecasts future sequences to anticipate disruptions, while BRM reconstructs recent history from future context to uncover early precursors. Our models support both continuous and discrete multivariate features, enabling robust performance in real-world settings.
arXiv Detail & Related papers (2026-02-12T03:57:41Z) - Transformer-Based Indirect Structural Health Monitoring of Rail Infrastructure with Attention-Driven Detection and Localization of Transient Defects [1.1782896991259]
We introduce an incremental synthetic data benchmark designed to evaluate model robustness against progressively complex challenges. We evaluate several established unsupervised models alongside our proposed Attention-Focused Transformer. Our proposed model achieves accuracy comparable to the state-of-the-art solution while demonstrating better inference speed.
arXiv Detail & Related papers (2025-10-08T23:01:53Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - WDformer: A Wavelet-based Differential Transformer Model for Time Series Forecasting [21.222605948133893]
Time series forecasting has various applications, such as meteorological rainfall prediction, traffic flow analysis, financial forecasting, and operational load monitoring. Due to the sparsity of time series data, relying solely on time-domain or frequency-domain modeling limits the model's ability to fully leverage multi-domain information. We proposed WDformer, a wavelet-based differential Transformer model, to conduct a multi-resolution analysis of time series data.
arXiv Detail & Related papers (2025-09-25T02:43:51Z) - Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay.
Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks.
Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
arXiv Detail & Related papers (2025-02-10T04:42:11Z) - Steam Turbine Anomaly Detection: An Unsupervised Learning Approach Using Enhanced Long Short-Term Memory Variational Autoencoder [3.004788114489393]
We propose an Enhanced Long Short-Term Memory Variational Autoencoder (ELSTMVAE-DAF-GMM) for precise unsupervised anomaly detection in unlabeled datasets.
ELSTMVAE, integrating LSTM with VAE, was used to project high-dimensional time-series data to a low-dimensional phase space.
The novel deep advanced features (DAF) hybridize latent embeddings and reconstruction discrepancies from the LSTMVAE model and provide a more comprehensive data representation.
arXiv Detail & Related papers (2024-11-16T10:11:36Z) - Spatial-Temporal Bearing Fault Detection Using Graph Attention Networks and LSTM [0.7864304771129751]
This paper introduces a novel method that combines Graph Attention Network (GAT) and Long Short-Term Memory (LSTM) networks.
This approach captures both spatial and temporal dependencies within sensor data, improving the accuracy of bearing fault detection.
arXiv Detail & Related papers (2024-10-15T12:55:57Z) - Detecting Anomalies in Dynamic Graphs via Memory enhanced Normality [39.476378833827184]
Anomaly detection in dynamic graphs presents a significant challenge due to the temporal evolution of graph structures and attributes.
We introduce a novel spatial-temporal memories-enhanced graph autoencoder (STRIPE).
STRIPE significantly outperforms existing methods with 5.8% improvement in AUC scores and 4.62X faster in training time.
arXiv Detail & Related papers (2024-03-14T02:26:10Z) - Attention as Robust Representation for Time Series Forecasting [23.292260325891032]
Time series forecasting is essential for many practical applications.
Transformers' key feature, the attention mechanism, dynamically fuses embeddings to enhance data representation, often relegating attention weights to a byproduct role.
Our approach elevates attention weights as the primary representation for time series, capitalizing on the temporal relationships among data points to improve forecasting accuracy.
arXiv Detail & Related papers (2024-02-08T03:00:50Z) - Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z) - Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z) - AnomalyBERT: Self-Supervised Transformer for Time Series Anomaly Detection using Data Degradation Scheme [0.7216399430290167]
Anomaly detection task for time series, especially for unlabeled data, has been a challenging problem.
We address it by applying a suitable data degradation scheme to self-supervised model training.
Inspired by the self-attention mechanism, we design a Transformer-based architecture to recognize the temporal context.
arXiv Detail & Related papers (2023-05-08T05:42:24Z) - Novel features for the detection of bearing faults in railway vehicles [88.89591720652352]
We introduce Mel-Frequency Cepstral Coefficients (MFCCs) and features extracted from the Amplitude Modulation Spectrogram (AMS) as features for the detection of bearing faults.
arXiv Detail & Related papers (2023-04-14T10:09:50Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z) - Remaining Useful Life Prediction Using Temporal Deep Degradation Network for Complex Machinery with Attention-based Feature Extraction [17.831515307314802]
Degradation-related features extracted from the sensor streaming data with neural networks can dramatically improve the accuracy of the RUL prediction.
The Temporal deep degradation network (TDDN) model is proposed to make the RUL prediction with the degradation-related features given by the one-dimensional convolutional neural network (1D CNN).
The results show that the TDDN model can achieve the best RUL prediction accuracy in complex conditions compared to current machine learning models.
arXiv Detail & Related papers (2022-02-21T10:23:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.