SCLIFD: Supervised Contrastive Knowledge Distillation for Incremental Fault Diagnosis under Limited Fault Data
- URL: http://arxiv.org/abs/2302.05929v1
- Date: Sun, 12 Feb 2023 14:50:12 GMT
- Title: SCLIFD: Supervised Contrastive Knowledge Distillation for Incremental Fault Diagnosis under Limited Fault Data
- Authors: Peng Peng, Hanrong Zhang, Mengxuan Li, Gongzhuang Peng, Hongwei Wang,
Weiming Shen
- Abstract summary: It is difficult to extract discriminative features from limited fault data.
A well-trained model must be retrained from scratch to classify the samples from new classes.
The model decision is biased toward the new classes due to the class imbalance.
- Score: 8.354404360859263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent fault diagnosis has advanced considerably in recent years.
Nonetheless, few works tackle class-incremental learning for fault diagnosis
under limited fault data, i.e., imbalanced and long-tailed fault diagnosis,
which raises several notable challenges. First, it is difficult to extract
discriminative features from limited fault data. Second, a well-trained model
must be retrained from scratch to classify samples from new classes, incurring
a high computational burden and time cost. Third, the model may suffer from
catastrophic forgetting when trained incrementally. Finally, the model's
decisions are biased toward the new classes due to class imbalance. These
problems can consequently degrade the performance of fault diagnosis models.
Accordingly, we introduce a supervised contrastive knowledge distillation for
incremental fault diagnosis under limited fault data (SCLIFD) framework to
address these issues, which extends the classical incremental classifier and
representation learning (iCaRL) framework in three respects. First, we adopt
supervised contrastive knowledge distillation (KD) to enhance representation
learning under limited fault data. Second, we propose a novel prioritized
exemplar selection method, adaptive herding (AdaHerding), to restrict the
growth of the computational burden; combined with KD, it also alleviates
catastrophic forgetting. Third, we adopt a cosine classifier to mitigate the
adverse impact of class imbalance. We conduct extensive experiments on
simulated and real-world industrial processes under different imbalance ratios.
Experimental results show that our SCLIFD outperforms existing methods by a
large margin.
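The paper's full method is not reproduced here, but two of its building blocks, a supervised contrastive loss for representation learning and a cosine classifier for imbalance-robust scoring, are standard components that can be sketched. The pure-Python code below is a minimal illustration, not the authors' implementation; the function names, the temperature `tau`, and the `scale` factor are illustrative choices.

```python
import math

def _normalize(v):
    """L2-normalize a vector (plain lists stand in for tensors in this sketch)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss (Khosla et al., 2020): pull together
    normalized embeddings that share a label, push apart all others.
    Averaged over all anchor-positive pairs for simplicity."""
    z = [_normalize(v) for v in embeddings]
    total, pairs = 0.0, 0
    for i in range(len(z)):
        positives = [p for p in range(len(z)) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        denom = sum(math.exp(_dot(z[i], z[a]) / tau)
                    for a in range(len(z)) if a != i)
        for p in positives:
            total -= math.log(math.exp(_dot(z[i], z[p]) / tau) / denom)
            pairs += 1
    return total / pairs

def cosine_logits(class_weights, x, scale=10.0):
    """Cosine classifier: logits are scaled cosine similarities between the
    input embedding and per-class weight vectors, so scores do not depend on
    weight-vector magnitude (which tends to track class frequency)."""
    xn = _normalize(x)
    return [scale * _dot(_normalize(w), xn) for w in class_weights]
```

Because both the embeddings and the class weights are normalized, a majority class cannot dominate the logits through larger weight norms, which is the usual motivation for cosine classifiers under class imbalance.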
Related papers
- Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge [2.0007789979629784]
We present an approach based on Error Detection Rules (EDR) that allow for learning explainable rules about the failure modes of machine learning models.
We show that our approach is effective in detecting machine learning errors and recovering constraints, is noise tolerant, and can function as a source of knowledge for neurosymbolic models on multiple datasets.
arXiv Detail & Related papers (2024-07-21T15:12:19Z)
- Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution adaptive clipping Kullback-Leibler loss as the distillation objective function.
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher and student's probability correlations to be consistent.
arXiv Detail & Related papers (2024-07-14T03:51:49Z)
- Active Foundational Models for Fault Diagnosis of Electrical Motors [0.5999777817331317]
Fault detection and diagnosis of electrical motors is of utmost importance in ensuring the safe and reliable operation of industrial systems.
The existing data-driven deep learning approaches for machine fault diagnosis rely extensively on huge amounts of labeled samples.
We propose a foundational model-based Active Learning framework that requires far fewer labeled samples.
arXiv Detail & Related papers (2023-11-27T03:25:12Z)
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving property of Q-network at training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z)
- Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases [66.27889778566734]
Current intelligent diagnosis systems lack the ability to continually learn to diagnose new diseases once deployed.
In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.
An adapter-based Continual Learning framework called ACL is proposed to help effectively learn a set of new diseases.
arXiv Detail & Related papers (2023-04-18T15:01:45Z)
- Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading [72.45699658852304]
This paper proposes a novel approach to train a generative Diffusion Autoencoder model as an unsupervised feature extractor.
We model fracture grading as a continuous regression, which is more reflective of the smooth progression of fractures.
Importantly, the generative nature of our method allows us to visualize different grades of a given vertebra, providing interpretability and insight into the features that contribute to automated grading.
arXiv Detail & Related papers (2023-03-21T17:16:01Z)
- A New Knowledge Distillation Network for Incremental Few-Shot Surface Defect Detection [20.712532953953808]
This paper proposes a new knowledge distillation network, called the Dual Knowledge Align Network (DKAN).
The proposed DKAN method follows a pretraining-finetuning transfer learning paradigm and a knowledge distillation framework is designed for fine-tuning.
Experiments have been conducted on the incremental Few-shot NEU-DET dataset and results show that DKAN outperforms other methods on various few-shot scenes.
arXiv Detail & Related papers (2022-09-01T15:08:44Z)
- Quadruplet Deep Metric Learning Model for Imbalanced Time-series Fault Diagnosis [0.2538209532048866]
This paper analyzes how to improve the performance of imbalanced classification by adjusting the distance between classes and the distribution within a class.
A novel quadruplet data-pair design that accounts for class imbalance is proposed, with reference to traditional deep metric learning.
The reasonable combination of quadruplet loss and softmax loss function can reduce the impact of imbalance.
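The paper's imbalance-aware quadruplet construction is specific to it, but the classic quadruplet loss it builds on (Chen et al., CVPR 2017) can be sketched as follows; the margins and the squared-Euclidean metric below are conventional defaults, not the paper's settings.

```python
def quadruplet_loss(anchor, pos, neg1, neg2, margin1=1.0, margin2=0.5):
    """Classic quadruplet loss: the first (triplet) term pushes a negative
    away from the anchor; the second term additionally requires the
    anchor-positive distance to be smaller than the distance between two
    negatives from different classes, tightening intra-class distributions."""
    def d2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))
    triplet_term = max(d2(anchor, pos) - d2(anchor, neg1) + margin1, 0.0)
    pair_term = max(d2(anchor, pos) - d2(neg1, neg2) + margin2, 0.0)
    return triplet_term + pair_term
```

When the embedding already separates classes by more than the margins, both hinge terms are zero and the sample contributes no gradient.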
arXiv Detail & Related papers (2021-07-08T11:56:41Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
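MixKD's core idea, distilling the teacher on mixup-interpolated samples to cover more of the input space with limited data, can be sketched with two small functions; the interpolation and the temperature-scaled KL term are standard components, while the training loop and model details are omitted and the defaults below are illustrative.

```python
import math

def mixup(x_i, x_j, lam):
    """Mixup: convex combination of two inputs. MixKD has the student
    match the teacher's predictions on such interpolated samples."""
    return [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student output
    distributions, the usual soft-label distillation objective."""
    def soften(logits):
        m = max(logits)  # subtract max for numerical stability
        exps = [math.exp((l - m) / temperature) for l in logits]
        s = sum(exps)
        return [e / s for e in exps]
    p = soften(teacher_logits)
    q = soften(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In a training step, the student would be penalized with `kd_loss(teacher(mixup(x_i, x_j, lam)), student(mixup(x_i, x_j, lam)))` alongside the usual task loss.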
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.