A multi-stage semi-supervised improved deep embedded clustering
(MS-SSIDEC) method for bearing fault diagnosis under the situation of
insufficient labeled samples
- URL: http://arxiv.org/abs/2109.13521v1
- Date: Tue, 28 Sep 2021 06:49:40 GMT
- Title: A multi-stage semi-supervised improved deep embedded clustering
(MS-SSIDEC) method for bearing fault diagnosis under the situation of
insufficient labeled samples
- Authors: Tongda Sun, Gang Yu
- Abstract summary: Labeling data in real industrial processes is labor- and time-intensive, which limits the application of intelligent fault diagnosis methods.
To address this problem, a multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method is proposed.
The method comprises three stages: pre-training, deep clustering, and enhanced supervised learning.
- Score: 20.952315351460527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent data-driven fault diagnosis methods have been widely applied, but
most of them require a large number of high-quality labeled samples. Labeling data in
real industrial processes is labor- and time-intensive, which limits the application of
intelligent fault diagnosis methods. To address this problem, a multi-stage
semi-supervised improved deep embedded clustering (MS-SSIDEC) method is proposed for
bearing fault diagnosis with insufficient labeled samples. The method comprises three
stages: pre-training, deep clustering, and enhanced supervised learning. In the first
stage, a skip-connection-based convolutional auto-encoder (SCCAE) is proposed and
pre-trained to automatically learn low-dimensional representations. In the second
stage, a semi-supervised improved deep embedded clustering (SSIDEC) model that
integrates the pre-trained auto-encoder with a clustering layer is proposed for deep
clustering. Additionally, virtual adversarial training (VAT) is introduced as a
regularization term to mitigate overfitting during training. In the third stage, the
high-quality clustering results obtained in the second stage are assigned to unlabeled
samples as pseudo labels. The labeled dataset is augmented with these pseudo-labeled
samples and used to train a bearing fault discriminative model. The effectiveness of
the method is evaluated on the Case Western Reserve University (CWRU) bearing dataset.
The results show that the method can handle both semi-supervised learning with only a
small number of labeled samples and fully unsupervised learning, and achieves better
results than traditional diagnosis methods. This method offers a new approach to fault
diagnosis with limited labeled samples by effectively exploiting unlabeled data.
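The abstract describes the pipeline only at a high level; the sketch below shows one way the three stages could fit together in PyTorch. It is a minimal illustration assembled solely from the abstract: the class and function names (SCCAE, ClusteringLayer, vat_loss, pseudo_label), layer sizes, the 1024-point input length, and all hyperparameters are assumptions, not the authors' implementation.

```python
# Minimal sketch of the three MS-SSIDEC stages in PyTorch.
# Layer sizes, names, hyperparameters, and the 1024-point input length are
# assumptions made for illustration, not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SCCAE(nn.Module):
    """Stage 1: skip-connection convolutional auto-encoder for 1-D vibration segments."""

    def __init__(self, latent_dim=10):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv1d(1, 16, 9, stride=2, padding=4), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv1d(16, 32, 9, stride=2, padding=4), nn.ReLU())
        self.to_latent = nn.Linear(32 * 256, latent_dim)      # assumes 1024-point inputs
        self.from_latent = nn.Linear(latent_dim, 32 * 256)
        self.dec2 = nn.Sequential(
            nn.ConvTranspose1d(32, 16, 9, stride=2, padding=4, output_padding=1), nn.ReLU())
        self.dec1 = nn.ConvTranspose1d(16, 1, 9, stride=2, padding=4, output_padding=1)

    def forward(self, x):                      # x: (batch, 1, 1024)
        h1 = self.enc1(x)                      # (batch, 16, 512)
        h2 = self.enc2(h1)                     # (batch, 32, 256)
        z = self.to_latent(h2.flatten(1))      # low-dimensional representation
        d2 = self.from_latent(z).view(-1, 32, 256)
        d1 = self.dec2(d2) + h1                # skip connection from the encoder
        return z, self.dec1(d1)                # (representation, reconstruction)


class ClusteringLayer(nn.Module):
    """Stage 2: DEC-style soft assignment with a Student-t kernel."""

    def __init__(self, n_clusters, latent_dim=10, alpha=1.0):
        super().__init__()
        self.alpha = alpha
        self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))

    def forward(self, z):
        dist_sq = torch.cdist(z, self.centroids).pow(2)
        q = (1.0 + dist_sq / self.alpha).pow(-(self.alpha + 1) / 2)
        return q / q.sum(dim=1, keepdim=True)  # soft cluster assignments


def target_distribution(q):
    """Sharpened auxiliary distribution used as the self-training target (as in DEC)."""
    weight = q.pow(2) / q.sum(dim=0)
    return (weight.t() / weight.sum(dim=1)).t()


def vat_loss(encoder, cluster_layer, x, xi=1e-6, eps=2.0, n_power=1):
    """Simplified virtual adversarial training penalty on the soft assignments."""
    with torch.no_grad():
        q = cluster_layer(encoder(x)[0])
    d = torch.randn_like(x)
    for _ in range(n_power):                   # power iteration for the adversarial direction
        d = (xi * F.normalize(d.flatten(1), dim=1).view_as(x)).requires_grad_()
        q_adv = cluster_layer(encoder(x + d)[0])
        dist = F.kl_div(q_adv.log(), q, reduction="batchmean")
        d = torch.autograd.grad(dist, d)[0].detach()
    r_adv = eps * F.normalize(d.flatten(1), dim=1).view_as(x)
    q_adv = cluster_layer(encoder(x + r_adv)[0])
    return F.kl_div(q_adv.log(), q, reduction="batchmean")


@torch.no_grad()
def pseudo_label(encoder, cluster_layer, x_unlabeled, threshold=0.9):
    """Stage 3: keep confident cluster assignments as pseudo labels for unlabeled
    samples; a supervised fault classifier is then trained on the labeled set
    augmented with these pseudo-labeled samples (classifier omitted here)."""
    q = cluster_layer(encoder(x_unlabeled)[0])
    confidence, labels = q.max(dim=1)
    keep = confidence > threshold
    return x_unlabeled[keep], labels[keep]
```

In stage 2 the overall objective would typically combine the reconstruction loss, the KL clustering loss against target_distribution(q), a supervised term on the few labeled samples, and the VAT penalty; the abstract does not specify the loss weighting, so any coefficients here are left unstated.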
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in standard semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance can carry multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- LayerMatch: Do Pseudo-labels Benefit All Layers? [77.59625180366115]
Semi-supervised learning offers a promising solution to mitigate the dependency on labeled data.
We develop two layer-specific pseudo-label strategies, termed Grad-ReLU and Avg-Clustering.
Our approach consistently demonstrates exceptional performance on standard semi-supervised learning benchmarks.
arXiv Detail & Related papers (2024-06-20T11:25:50Z)
- Active Foundational Models for Fault Diagnosis of Electrical Motors [0.5999777817331317]
Fault detection and diagnosis of electrical motors is of utmost importance in ensuring the safe and reliable operation of industrial systems.
Existing data-driven deep learning approaches for machine fault diagnosis rely heavily on large amounts of labeled samples.
We propose a foundational-model-based active learning framework that requires fewer labeled samples.
arXiv Detail & Related papers (2023-11-27T03:25:12Z)
- Semi-supervised binary classification with latent distance learning [0.0]
We propose a new representation learning method that solves the binary classification problem with few labels via a random k-pair cross-distance learning mechanism.
With few labels and without any data augmentation techniques, the proposed method outperformed state-of-the-art semi-supervised and self-supervised learning methods.
arXiv Detail & Related papers (2022-11-28T09:05:26Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limited performance is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Hierarchical Semi-Supervised Contrastive Learning for
Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to learn sample representations that enable effective discrimination of anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z)
- Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
Main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
- Deep Learning in current Neuroimaging: a multivariate approach with
power and type I error control but arguable generalization ability [0.158310730488265]
A non-parametric framework is proposed that estimates the statistical significance of classifications using deep learning architectures.
A label permutation test is proposed in both studies using cross-validation (CV) and resubstitution with upper bound correction (RUB) as validation methods.
We found in the permutation test that CV and RUB methods offer a false positive rate close to the significance level and an acceptable statistical power.
arXiv Detail & Related papers (2021-03-30T21:15:39Z)
- Semi-supervised and Unsupervised Methods for Heart Sounds Classification
in Restricted Data Environments [4.712158833534046]
This study uses various supervised, semi-supervised and unsupervised approaches on the PhysioNet/CinC 2016 Challenge dataset.
A GAN-based semi-supervised method is proposed, which uses unlabelled data samples to improve learning of the data distribution.
In particular, unsupervised feature extraction using a 1D CNN Autoencoder coupled with a one-class SVM achieves good performance without any data labelling (see the sketch after this list).
arXiv Detail & Related papers (2020-06-04T02:07:35Z)
- SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier
Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up large-scale heterogeneous outlier detection.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)
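One of the entries above (heart sounds classification in restricted data environments) mentions unsupervised feature extraction with a 1D CNN autoencoder coupled with a one-class SVM. The snippet below is a generic sketch of that kind of pipeline, with an assumed 256-sample input length, layer sizes, and hyperparameters; it is not the implementation from that paper.

```python
# Generic sketch: 1-D CNN auto-encoder features + one-class SVM for
# unsupervised anomaly detection. Shapes, names, and hyperparameters are
# assumptions; this is not the implementation of any paper listed above.
import torch
import torch.nn as nn
from sklearn.svm import OneClassSVM


class Conv1dAutoencoder(nn.Module):
    def __init__(self, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 8, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(8, 16, 7, stride=2, padding=3), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 64, latent_dim))     # assumes 256-sample inputs
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 64), nn.Unflatten(1, (16, 64)),
            nn.ConvTranspose1d(16, 8, 7, stride=2, padding=3, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 1, 7, stride=2, padding=3, output_padding=1))

    def forward(self, x):                      # x: (batch, 1, 256)
        z = self.encoder(x)                    # learned features
        return z, self.decoder(z)              # (features, reconstruction)


# After training the auto-encoder on a reconstruction loss (not shown),
# fit a one-class SVM on the learned features of normal recordings.
model = Conv1dAutoencoder().eval()
with torch.no_grad():
    normal_batch = torch.randn(32, 1, 256)     # placeholder for normal data
    features, _ = model(normal_batch)
ocsvm = OneClassSVM(kernel="rbf", nu=0.05).fit(features.numpy())
predictions = ocsvm.predict(features.numpy())  # +1 = normal, -1 = anomalous
```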
This list is automatically generated from the titles and abstracts of the papers on this site.