DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning
- URL: http://arxiv.org/abs/2406.06653v2
- Date: Thu, 20 Jun 2024 22:35:13 GMT
- Title: DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning
- Authors: Ovanes Petrosian, Li Pengyi, He Yulong, Liu Jiarui, Sun Zhaoruikun, Fu Guofeng, Meng Liping
- Abstract summary: This paper proposes DKDL-Net, a lightweight bearing fault diagnosis model that addresses the feature-extraction speed and computational-complexity limits of existing methods.
The model is trained on the CWRU dataset using decoupled knowledge distillation (DKD) and low-rank adaptation (LoRA) fine-tuning.
Experiments show that DKDL-Net achieves 99.48% accuracy on the test set, reducing computational complexity while maintaining model performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rolling bearing fault detection has developed rapidly and occupies an important position in fault diagnosis technology. Deep learning-based bearing fault diagnosis models have achieved significant success. At the same time, continued improvements in signal processing techniques such as the Fourier transform, wavelet transform, and empirical mode decomposition have advanced rolling bearing fault diagnosis into a new research stage. However, most existing methods are limited to varying degrees in industrial settings, mainly by the need for fast feature extraction and by computational complexity. This paper proposes a lightweight bearing fault diagnosis model, DKDL-Net, to address these challenges. The model is trained on the CWRU dataset using decoupled knowledge distillation (DKD) and low-rank adaptation (LoRA) fine-tuning. Specifically, we built and trained a teacher model based on a 6-layer neural network with 69,626 trainable parameters and, on this basis, used DKD and LoRA fine-tuning to train the student model DKDL-Net, which has only 6,838 parameters. Experiments show that DKDL-Net achieves 99.48% accuracy on the test set with reduced computational complexity while maintaining model performance, which is 0.58% higher than the state-of-the-art (SOTA) model, with fewer parameters. Our code is available on GitHub: https://github.com/SPBU-LiPengyi/DKDL-Net.git.
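As a rough illustration of the two techniques named in the abstract, the sketch below implements a generic decoupled knowledge distillation loss (a target-class KL term plus a non-target-class KL term, following the usual DKD formulation) and a LoRA-style linear layer in NumPy. It is not taken from the DKDL-Net repository: the function names, the weights alpha and beta, the temperature, the scale factor, and the shapes used are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kl(p, q, eps=1e-12):
    # KL(p || q) per sample, summed over the last axis.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)


def dkd_loss(student_logits, teacher_logits, target,
             alpha=1.0, beta=8.0, temperature=4.0):
    """Generic DKD loss: alpha * TCKD + beta * NCKD (hypothetical defaults)."""
    n, c = student_logits.shape
    mask = np.zeros((n, c), dtype=bool)
    mask[np.arange(n), target] = True  # True at each sample's target class

    p_s = softmax(student_logits / temperature)
    p_t = softmax(teacher_logits / temperature)

    # TCKD: KL between binary (target vs. all non-target) distributions.
    b_s = np.stack([p_s[mask], 1.0 - p_s[mask]], axis=1)
    b_t = np.stack([p_t[mask], 1.0 - p_t[mask]], axis=1)
    tckd = kl(b_t, b_s).mean()

    # NCKD: KL between distributions renormalised over non-target classes.
    nt_s = softmax(student_logits[~mask].reshape(n, c - 1) / temperature)
    nt_t = softmax(teacher_logits[~mask].reshape(n, c - 1) / temperature)
    nckd = kl(nt_t, nt_s).mean()

    return (alpha * tckd + beta * nckd) * temperature ** 2


def lora_forward(x, W, A, B, scale=1.0):
    """Linear layer with a low-rank update: y = x W^T + scale * x A^T B^T.

    W (d_out, d_in) is frozen; A (r, d_in) and B (d_out, r) are trainable.
    """
    return x @ W.T + scale * (x @ A.T) @ B.T


def lora_param_count(d_in, d_out, r):
    # Number of trainable parameters added by one LoRA adapter.
    return r * (d_in + d_out)
```

With B initialised to zeros, the adapted layer starts out identical to the frozen one, and only r * (d_in + d_out) extra parameters are trained per adapted layer.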
Related papers
- TDANet: A Novel Temporal Denoise Convolutional Neural Network With Attention for Fault Diagnosis [0.5277756703318045]
This paper proposes the Temporal Denoise Convolutional Neural Network With Attention (TDANet) to improve fault diagnosis performance in noise environments.
The TDANet model transforms one-dimensional signals into two-dimensional tensors based on their periodic properties, employing multi-scale 2D convolution kernels to extract signal information both within and across periods.
Evaluation on two datasets, CWRU (single sensor) and Real aircraft sensor fault (multiple sensors), demonstrates that the TDANet model significantly outperforms existing deep learning approaches in terms of diagnostic accuracy under noisy environments.
arXiv Detail & Related papers (2024-03-29T02:54:41Z)
- DDxT: Deep Generative Transformer Models for Differential Diagnosis [51.25660111437394]
We show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark.
The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., DDx, and predicts the actual pathology using a neural network.
arXiv Detail & Related papers (2023-12-02T22:57:25Z)
- EdgeFD: An Edge-Friendly Drift-Aware Fault Diagnosis System for Industrial IoT [0.0]
We propose the Drift-Aware Weight Consolidation (DAWC) to mitigate the challenges posed by frequent data drift in the industrial Internet of Things (IIoT).
DAWC efficiently manages multiple data drift scenarios, minimizing the need for constant model fine-tuning on edge devices.
We have also developed a comprehensive diagnosis and visualization platform.
arXiv Detail & Related papers (2023-10-07T06:48:07Z)
- Towards a robust and reliable deep learning approach for detection of compact binary mergers in gravitational wave data [0.0]
We develop a deep learning model stage-wise and work towards improving its robustness and reliability.
We retrain the model in a novel framework involving a generative adversarial network (GAN).
Although absolute robustness is practically impossible to achieve, we demonstrate some fundamental improvements earned through such training.
arXiv Detail & Related papers (2023-06-20T18:00:05Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Q-DETR: An Efficient Low-Bit Quantized Detection Transformer [50.00784028552792]
We find that the bottlenecks of Q-DETR come from the query information distortion through our empirical analyses.
We formulate our DRD as a bi-level optimization problem, which can be derived by generalizing the information bottleneck (IB) principle to the learning of Q-DETR.
We introduce a new foreground-aware query matching scheme to effectively transfer the teacher information to distillation-desired features to minimize the conditional information entropy.
arXiv Detail & Related papers (2023-04-01T08:05:14Z)
- ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning [10.757663798809144]
We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance sampling-based fault-scenarios.
Our experiments show that importance sampling provides up to 15x higher precision in selecting critical faults than random uniform sampling, reaching such precision in fewer than 100 faults.
arXiv Detail & Related papers (2023-03-14T16:15:28Z)
- DTAAD: Dual Tcn-Attention Networks for Anomaly Detection in Multivariate Time Series Data [0.0]
We propose an anomaly detection and diagnosis model, DTAAD, based on Transformer and Dual Temporal Convolutional Network (TCN).
Scaling methods and feedback mechanisms are introduced to improve prediction accuracy and expand correlation differences.
Our experiments on seven public datasets validate that DTAAD exceeds the majority of currently advanced baseline methods in both detection and diagnostic performance.
arXiv Detail & Related papers (2023-02-17T06:59:45Z)
- Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation [65.62538699160085]
We propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation.
KD-DAGFM achieves the best performance with less than 21.5% FLOPs of the state-of-the-art method on both online and offline experiments.
arXiv Detail & Related papers (2022-11-21T03:09:42Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, that only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.