Related papers: Towards a more realistic evaluation of machine learning models for bearing fault diagnosis

Towards a more realistic evaluation of machine learning models for bearing fault diagnosis

URL: http://arxiv.org/abs/2509.22267v1
Date: Fri, 26 Sep 2025 12:35:02 GMT
Title: Towards a more realistic evaluation of machine learning models for bearing fault diagnosis
Authors: João Paulo Vieira, Victor Afonso Bauler, Rodrigo Kobashikawa Rosa, Danilo Silva,
Abstract summary: This paper investigates the issue of data leakage in vibration-based bearing fault diagnosis and its impact on model evaluation.<n>We propose a leakage-free evaluation methodology centered on bearing-wise data partitioning, ensuring no overlap between the physical components used for training and testing.<n>We evaluate our methodology on three widely adopted datasets: CWRU, Paderborn University (PU), and University of Ottawa (UORED-VAF)
Score: 0.28873930745906956
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reliable detection of bearing faults is essential for maintaining the safety and operational efficiency of rotating machinery. While recent advances in machine learning (ML), particularly deep learning, have shown strong performance in controlled settings, many studies fail to generalize to real-world applications due to methodological flaws, most notably data leakage. This paper investigates the issue of data leakage in vibration-based bearing fault diagnosis and its impact on model evaluation. We demonstrate that common dataset partitioning strategies, such as segment-wise and condition-wise splits, introduce spurious correlations that inflate performance metrics. To address this, we propose a rigorous, leakage-free evaluation methodology centered on bearing-wise data partitioning, ensuring no overlap between the physical components used for training and testing. Additionally, we reformulate the classification task as a multi-label problem, enabling the detection of co-occurring fault types and the use of prevalence-independent metrics such as Macro AUROC. Beyond preventing leakage, we also examine the effect of dataset diversity on generalization, showing that the number of unique training bearings is a decisive factor for achieving robust performance. We evaluate our methodology on three widely adopted datasets: CWRU, Paderborn University (PU), and University of Ottawa (UORED-VAFCLS). This study highlights the importance of leakage-aware evaluation protocols and provides practical guidelines for dataset partitioning, model selection, and validation, fostering the development of more trustworthy ML systems for industrial fault diagnosis applications.

Related papers

From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
Review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions.<n>Fast adaptation methods including meta-learning and few-shot learning are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z)
Stress-Testing ML Pipelines with Adversarial Data Corruption [11.91482648083998]
Regulators now demand evidence that high-stakes systems can withstand realistic, interdependent errors.<n>We introduce SAVAGE, a framework that formally models data-quality issues through dependency graphs and flexible corruption templates.<n>Savanage employs a bi-level optimization approach to efficiently identify vulnerable data subpopulations and fine-tune corruption severity.
arXiv Detail & Related papers (2025-06-02T00:41:24Z)
The role of data partitioning on the performance of EEG-based deep learning models in supervised cross-subject analysis: a preliminary study [37.69303106863453]
Deep learning is advancing the analysis of electroencephalography (EEG) data by effectively discovering highly nonlinear patterns.<n>No comprehensive guidelines for proper data partitioning and cross-validation exist in the domain.<n>This paper thoroughly investigates the role of data partitioning and cross-validation in evaluating EEG deep learning models.
arXiv Detail & Related papers (2025-05-19T12:05:28Z)
A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.<n>We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.<n>By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
Active Foundational Models for Fault Diagnosis of Electrical Motors [0.5999777817331317]
Fault detection and diagnosis of electrical motors is of utmost importance in ensuring the safe and reliable operation of industrial systems. The existing data-driven deep learning approaches for machine fault diagnosis rely extensively on huge amounts of labeled samples. We propose a foundational model-based Active Learning framework that utilizes less amount of labeled samples.
arXiv Detail & Related papers (2023-11-27T03:25:12Z)
Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism. Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors. To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z)
Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise. We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
Probabilistic Bearing Fault Diagnosis Using Gaussian Process with Tailored Feature Extraction [10.064000794573756]
Rolling bearings are subject to various faults due to its long-time operation under harsh environment. Current deep learning methods perform the bearing fault diagnosis in the form of deterministic classification. We develop a probabilistic fault diagnosis framework that can account for the uncertainty effect in prediction.
arXiv Detail & Related papers (2021-09-19T18:34:29Z)
Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models. This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics. We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
Few-Shot Bearing Fault Diagnosis Based on Model-Agnostic Meta-Learning [3.8015092217142223]
We propose a few-shot learning framework for bearing fault diagnosis based on model-agnostic meta-learning (MAML) Case studies show that the proposed framework achieves an overall accuracy up to 25% higher than a Siamese network-based benchmark study.
arXiv Detail & Related papers (2020-07-25T04:03:18Z)
How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance. We formulate a quality measure for the data set, which we refer to as $rho$-gap. We show how the $rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.