Efficient Visual Fault Detection for Freight Train Braking System via
Heterogeneous Self Distillation in the Wild
- URL: http://arxiv.org/abs/2307.00701v1
- Date: Mon, 3 Jul 2023 01:27:39 GMT
- Title: Efficient Visual Fault Detection for Freight Train Braking System via
Heterogeneous Self Distillation in the Wild
- Authors: Yang Zhang, Huilin Pan, Yang Zhou, Mingying Li, Guodong Sun
- Abstract summary: This paper proposes a heterogeneous self-distillation framework to ensure detection accuracy and speed.
We employ a novel loss function that encourages the network to concentrate on values near the label, improving learning efficiency.
Our framework can achieve over 37 frames per second and maintain the highest accuracy in comparison with traditional distillation approaches.
- Score: 8.062167870951706
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Efficient visual fault detection of freight trains is a critical part of
ensuring the safe operation of railways under the restricted hardware
environment. Although deep learning-based approaches have excelled in object
detection, the efficiency of freight train fault detection is still
insufficient to apply in real-world engineering. This paper proposes a
heterogeneous self-distillation framework to ensure detection accuracy and
speed while satisfying low resource requirements. Privileged information in the
teacher's output features is transferred to the student model through
distillation to boost performance. We first adopt a lightweight backbone to
extract features and generate a new heterogeneous knowledge neck. This neck
models positional information and long-range dependencies among channels
through parallel encoding to optimize feature extraction capabilities.
Then, we utilize the general distribution to obtain more credible and accurate
bounding box estimates. Finally, we employ a novel loss function that
encourages the network to concentrate on values near the label, improving
learning efficiency. Experiments on four fault datasets reveal that our framework can
achieve over 37 frames per second and maintain the highest accuracy in
comparison with traditional distillation approaches. Moreover, compared to
state-of-the-art methods, our framework demonstrates more competitive
performance with lower memory usage and the smallest model size.
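The abstract combines three technical ingredients: output-feature knowledge transferred from a heterogeneous teacher, a general (discretized) distribution over bounding-box offsets, and a loss that concentrates on values near the label. The authors' code is not reproduced here; the PyTorch sketch below only illustrates those ideas under assumed settings (bin count, temperature, and helper names are hypothetical, not the paper's implementation).

```python
# Illustrative sketch only (not the authors' released code): a box head that
# predicts a general (discretized) distribution over each box offset, plus a
# simple KL-based term that distills the teacher's output knowledge into the
# student. N_BINS and the temperature are assumed values.
import torch
import torch.nn.functional as F

N_BINS = 16  # assumed discretization of each box offset into bins 0..N_BINS-1

def expected_offset(logits):
    """Decode per-bin logits (..., N_BINS) into a continuous offset (inference)."""
    probs = F.softmax(logits, dim=-1)
    bins = torch.arange(N_BINS, dtype=probs.dtype, device=probs.device)
    return (probs * bins).sum(dim=-1)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student box distributions."""
    t = F.softmax(teacher_logits / temperature, dim=-1)
    s = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

def box_regression_loss(student_logits, target_offsets):
    """Distribution-style regression: cross-entropy on the two bins adjacent to
    the continuous target, so probability mass concentrates near the label."""
    left = target_offsets.floor().long().clamp(0, N_BINS - 1)
    right = (left + 1).clamp(0, N_BINS - 1)
    w_right = target_offsets - left.float()
    w_left = 1.0 - w_right
    log_probs = F.log_softmax(student_logits, dim=-1)
    loss = -(w_left * log_probs.gather(-1, left.unsqueeze(-1)).squeeze(-1)
             + w_right * log_probs.gather(-1, right.unsqueeze(-1)).squeeze(-1))
    return loss.mean()
```

In such a scheme, the student would be trained with `box_regression_loss` on ground-truth offsets plus `distillation_loss` against the teacher's box logits, and boxes would be decoded at inference with `expected_offset`.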
Related papers
- Spatial-wise Dynamic Distillation for MLP-like Efficient Visual Fault Detection of Freight Trains [11.13191969085042]
We present a dynamic distillation framework based on multi-layer perceptron (MLP) for fault detection of freight trains.
We propose a dynamic teacher that can effectively eliminate the semantic discrepancy with the student model.
Our approach outperforms the current state-of-the-art detectors and achieves the highest accuracy with real-time detection at a lower computational cost.
arXiv Detail & Related papers (2023-12-10T09:18:24Z)
- Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation [56.053397775016755]
We propose a sequential approach to knowledge distillation that progressively transfers the knowledge of a set of teacher detectors to a given lightweight student.
To the best of our knowledge, we are the first to successfully distill knowledge from Transformer-based teacher detectors to convolution-based students.
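As a rough illustration of the sequential idea only (not the paper's released code), a student could be distilled against one teacher at a time; `detection_loss` and the simple feature-matching term below are hypothetical placeholders.

```python
# Hedged sketch of sequential multi-teacher distillation: the student is trained
# against one teacher at a time, progressing through an ordered list of teachers.
import torch
import torch.nn.functional as F

def progressive_distillation(student, teachers, loader, optimizer,
                             epochs_per_teacher=10):
    for teacher in teachers:                 # e.g. ordered from weakest to strongest
        teacher.eval()
        for _ in range(epochs_per_teacher):
            for images, targets in loader:
                with torch.no_grad():
                    t_out = teacher(images)  # teacher predictions (kept fixed)
                s_out = student(images)
                # detection_loss is a placeholder for the usual detector loss;
                # the MSE term is a simplified stand-in for knowledge transfer.
                loss = detection_loss(s_out, targets) + F.mse_loss(s_out, t_out)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
```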
arXiv Detail & Related papers (2023-08-17T17:17:08Z)
- Self-Knowledge Distillation via Dropout [0.7883397954991659]
We propose a simple and effective self-knowledge distillation method using dropout (SD-Dropout).
Our method does not require any additional trainable modules, does not rely on data, and requires only simple operations.
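A minimal sketch of the dropout-based idea, assuming a plain classification head; the temperature and weighting are assumed values, not the paper's exact formulation.

```python
# Two forward passes of the same network (dropout active) give two "views";
# a temperature-softened KL term pulls their predictions together.
import torch
import torch.nn.functional as F

def sd_dropout_loss(model, x, labels, temperature=2.0, alpha=0.5):
    # model must be in training mode so dropout produces different masks per pass
    logits_a = model(x)
    logits_b = model(x)
    ce = F.cross_entropy(logits_a, labels)
    p_b = F.softmax(logits_b.detach() / temperature, dim=-1)
    log_p_a = F.log_softmax(logits_a / temperature, dim=-1)
    kd = F.kl_div(log_p_a, p_b, reduction="batchmean") * temperature ** 2
    return ce + alpha * kd
```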
arXiv Detail & Related papers (2022-08-11T05:08:55Z)
- Localization Distillation for Object Detection [134.12664548771534]
Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation instead of mimicking the classification logits.
We present a novel localization distillation (LD) method which can efficiently transfer the localization knowledge from the teacher to the student.
We show that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a key reason why logit mimicking has underperformed for years.
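A hedged sketch of the core LD term, assuming a detector whose box edges are predicted as discretized distributions; shapes and temperature are illustrative assumptions rather than the paper's settings.

```python
# Instead of imitating feature maps, the student mimics the teacher's discretized
# bounding-box distributions ("localization logits") via a softened KL term.
import torch.nn.functional as F

def localization_distillation(student_box_logits, teacher_box_logits,
                              temperature=10.0):
    # assumed logits shape: (num_boxes, 4 sides, n_bins), one distribution per edge
    t = F.softmax(teacher_box_logits / temperature, dim=-1)
    s = F.log_softmax(student_box_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2
```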
arXiv Detail & Related papers (2022-04-12T17:14:34Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
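One way such a re-scaling could look, as an assumption-laden sketch (the exact ALR rule is not reproduced here): rescale the predicted novel-class weight vectors so their norms match the mean norm of the pretrained base-class weights.

```python
# Align weight-vector lengths between novel and base classifiers to reduce the
# length inconsistency mentioned in the summary. Simplified illustration only.
import torch

def rescale_novel_weights(novel_w, base_w, eps=1e-8):
    # novel_w: (n_novel, d) predicted novel weights; base_w: (n_base, d) pretrained
    base_norm = base_w.norm(dim=1).mean()
    novel_norm = novel_w.norm(dim=1, keepdim=True)
    return novel_w / (novel_norm + eps) * base_norm
```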
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning [3.1153758106426603]
Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples.
We propose a novel transfer-based method which aims at processing the feature vectors so that they become closer to Gaussian-like distributions.
In the case of transductive few-shot learning, where unlabelled test samples are available during training, we also introduce an optimal-transport-inspired algorithm to further boost the achieved performance.
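A minimal sketch of the Gaussian-shaping step, assuming a simple power (Tukey-style) transform on non-negative backbone features; the exponent is an assumed hyperparameter, and the optimal-transport step for the transductive setting is omitted.

```python
# Push backbone feature vectors toward a more Gaussian-like shape before the
# few-shot classifier by applying an element-wise power transform.
import torch

def power_transform(features, beta=0.5, eps=1e-6):
    # features assumed non-negative (e.g. post-ReLU backbone activations)
    return (features + eps).pow(beta)
```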
arXiv Detail & Related papers (2021-10-18T16:29:17Z)
- Automatic Detection of Rail Components via A Deep Convolutional Transformer Network [7.557470133155959]
We propose a deep convolutional transformer network based method to detect multi-class rail components including the rail, clip, and bolt.
Our proposed method simplifies the detection pipeline by eliminating the need for prior settings such as anchor boxes, aspect ratios, default coordinates, and post-processing.
Results of a comprehensive computational study show that our proposed method outperforms a set of existing state-of-the-art approaches by large margins.
arXiv Detail & Related papers (2021-08-05T07:38:04Z)
- A Unified Light Framework for Real-time Fault Detection of Freight Train Images [16.721758280029302]
Real-time fault detection for freight trains plays a vital role in guaranteeing the security and optimal operation of railway transportation.
Despite the promising results of deep learning-based approaches, the performance of these fault detectors on freight train images is far from satisfactory in both accuracy and efficiency.
This paper proposes a unified light framework to improve detection accuracy while supporting a real-time operation with a low resource requirement.
arXiv Detail & Related papers (2021-01-31T05:10:20Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Distilling Object Detectors with Task Adaptive Regularization [97.52935611385179]
Current state-of-the-art object detectors come at the expense of high computational costs and are hard to deploy to low-end devices.
Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
arXiv Detail & Related papers (2020-06-23T15:58:22Z)
- Circumventing Outliers of AutoAugment with Knowledge Distillation [102.25991455094832]
AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks.
This paper delves deep into the working mechanism, and reveals that AutoAugment may remove part of discriminative information from the training image.
To relieve the inaccuracy of supervision, we make use of knowledge distillation that refers to the output of a teacher model to guide network training.
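A hedged sketch of the general recipe (soft teacher targets mixed with hard labels on the same augmented image); the weighting and temperature are assumed, not the paper's settings.

```python
# When aggressive augmentation may have removed the evidence for the hard label,
# soften the supervision by blending the ground-truth loss with a teacher-guided
# distillation loss computed on the same augmented input.
import torch
import torch.nn.functional as F

def kd_with_augmentation(student, teacher, aug_images, labels,
                         temperature=4.0, alpha=0.7):
    s_logits = student(aug_images)
    with torch.no_grad():
        t_logits = teacher(aug_images)   # teacher sees the same augmented image
    ce = F.cross_entropy(s_logits, labels)
    kd = F.kl_div(F.log_softmax(s_logits / temperature, dim=-1),
                  F.softmax(t_logits / temperature, dim=-1),
                  reduction="batchmean") * temperature ** 2
    return alpha * kd + (1 - alpha) * ce
```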
arXiv Detail & Related papers (2020-03-25T11:51:41Z)