Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm
- URL: http://arxiv.org/abs/2410.12259v1
- Date: Wed, 16 Oct 2024 05:58:08 GMT
- Title: Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm
- Authors: Guanming Huang, Aoran Shen, Yuxiang Hu, Junliang Du, Jiacheng Hu, Yingbin Liang
- Abstract summary: This paper explores the application of knowledge distillation technology in target detection tasks.
Using YOLOv5l as the teacher network and the smaller YOLOv5s as the student network, we found that the student's detection accuracy gradually improved as the distillation temperature increased.
- Score: 37.37311465537091
- License:
- Abstract: This paper explores the application of knowledge distillation in target detection tasks, focusing on how different distillation temperatures affect the performance of the student model. Using YOLOv5l as the teacher network and the smaller YOLOv5s as the student network, we found that the student's detection accuracy gradually improved as the distillation temperature increased, and at a specific temperature the student surpassed the original YOLOv5s model on both the mAP50 and mAP50-95 metrics. The experimental results show that an appropriate knowledge distillation strategy not only improves model accuracy but also helps improve the model's reliability and stability in practical applications. The paper also records the accuracy and loss curves during training in detail, showing that the model converges to a stable state after 150 training epochs. These findings provide a theoretical basis and technical reference for further optimization of target detection algorithms.
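The mechanism the abstract describes is classic temperature-scaled soft-label distillation. Below is a minimal PyTorch-style sketch of such a loss, assuming per-class logits are available for matched teacher (YOLOv5l) and student (YOLOv5s) predictions; the function name, the default temperature, and the weighting factor `alpha` are illustrative assumptions rather than the authors' exact implementation, which would also include YOLOv5's box and objectness terms.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      hard_target_loss: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Temperature-scaled knowledge distillation loss (illustrative sketch).

    student_logits / teacher_logits: (N, num_classes) class logits for
    matched detections from the student (YOLOv5s) and teacher (YOLOv5l).
    hard_target_loss: the usual detection loss against ground-truth labels.
    """
    # Soften both distributions with the distillation temperature T.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between the softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * (temperature ** 2)

    # Blend the soft (teacher) and hard (ground-truth) objectives.
    return alpha * soft_loss + (1.0 - alpha) * hard_target_loss
```

Sweeping the temperature over a range of values and tracking mAP50 / mAP50-95 on a validation split would reproduce the kind of temperature study the abstract describes.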
Related papers
- Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation [84.38105530043741]
We propose Warmup-Distill, which aligns the student's distribution with the teacher's before distillation begins.
Experiments on seven benchmarks demonstrate that Warmup-Distill provides a warmed-up student that is more suitable for distillation.
arXiv Detail & Related papers (2025-02-17T12:58:12Z) - Temperature-Free Loss Function for Contrastive Learning [7.229820415732795]
We propose a novel method for deploying the InfoNCE loss without a temperature parameter.
Specifically, we replace temperature scaling with the inverse hyperbolic tangent function, resulting in a modified InfoNCE loss.
The proposed method was validated on five contrastive learning benchmarks, yielding satisfactory results without temperature tuning (a minimal sketch of this substitution appears after the list below).
arXiv Detail & Related papers (2025-01-29T14:43:21Z) - Innovative Deep Learning Techniques for Obstacle Recognition: A Comparative Study of Modern Detection Algorithms [0.0]
This study explores a comprehensive approach to obstacle detection using advanced YOLO models, specifically YOLOv8, YOLOv7, YOLOv6, and YOLOv5.
The findings demonstrate that YOLOv8 achieves the highest accuracy with improved precision-recall metrics.
arXiv Detail & Related papers (2024-10-14T02:28:03Z) - Temperature Balancing, Layer-wise Weight Analysis, and Neural Network
Training [58.20089993899729]
This paper proposes TempBalance, a straightforward yet effective layerwise learning rate method.
We show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization.
We also show that TempBalance outperforms a number of state-of-the-art metrics and schedulers.
arXiv Detail & Related papers (2023-12-01T05:38:17Z) - Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval [49.01637233471453]
Previous Knowledge Distillation based efficient image retrieval methods employ a lightweight network as the student model for fast inference.
We propose a Capacity Dynamic Distillation framework, which constructs a student model with editable representation capacity.
Our method achieves superior inference speed and accuracy, e.g., on the VeRi-776 dataset with ResNet101 as the teacher.
arXiv Detail & Related papers (2023-03-16T11:09:22Z) - Channel Pruned YOLOv5-based Deep Learning Approach for Rapid and Accurate Outdoor Obstacles Detection [6.703770367794502]
One-stage algorithms have been widely used in target detection systems that need to be trained with massive data.
Due to their convolutional structure, they require more computing power and consume more memory.
We apply a pruning strategy to target detection networks to reduce the number of parameters and the model size.
arXiv Detail & Related papers (2022-04-27T21:06:04Z) - LTD: Low Temperature Distillation for Robust Adversarial Training [1.3300217947936062]
Adversarial training has been widely used to enhance the robustness of neural network models against adversarial attacks.
Despite the popularity of neural network models, a significant gap exists between the natural and robust accuracy of these models.
We propose a novel method called Low Temperature Distillation (LTD) that generates soft labels using the modified knowledge distillation framework.
arXiv Detail & Related papers (2021-11-03T16:26:00Z) - Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones [40.33419553042038]
We propose to improve existing baseline networks via knowledge distillation from large, powerful, off-the-shelf pre-trained models.
Our solution performs distillation only by driving the student model's predictions to be consistent with those of the teacher model.
We empirically find that such a simple distillation setting is extremely effective; for example, the top-1 accuracy of MobileNetV3-large and ResNet50-D on the ImageNet-1k validation set can be significantly improved.
arXiv Detail & Related papers (2021-03-10T09:32:44Z) - Autoregressive Knowledge Distillation through Imitation Learning [70.12862707908769]
We develop a compression technique for autoregressive models driven by an imitation learning perspective on knowledge distillation.
Our method consistently outperforms other distillation algorithms, such as sequence-level knowledge distillation.
Student models trained with our method attain 1.4 to 4.8 BLEU/ROUGE points higher than those trained from scratch, while increasing inference speed by up to 14 times in comparison to the teacher model.
arXiv Detail & Related papers (2020-09-15T17:43:02Z) - Distilling Object Detectors with Task Adaptive Regularization [97.52935611385179]
Current state-of-the-art object detectors come at the expense of high computational costs and are hard to deploy on low-end devices.
Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
arXiv Detail & Related papers (2020-06-23T15:58:22Z)
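Returning to the "Temperature-Free Loss Function for Contrastive Learning" entry above: its summary states that the temperature scaling in InfoNCE is replaced by the inverse hyperbolic tangent. The sketch below illustrates that substitution under the assumption that the similarity is cosine similarity (so artanh maps (-1, 1) onto the real line); the function name, the clamping epsilon, and the batch construction are hypothetical details, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def temperature_free_infonce(z1: torch.Tensor, z2: torch.Tensor,
                             eps: float = 1e-6) -> torch.Tensor:
    """InfoNCE-style loss with artanh replacing the usual 1/T scaling.

    z1, z2: (N, D) embeddings of two views of the same N samples;
    positives are matching rows, negatives are all other rows.
    """
    # L2-normalize so pairwise dot products are cosine similarities in (-1, 1).
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.t()                      # (N, N) cosine similarity matrix

    # Replace sim / temperature with artanh(sim); clamp to keep artanh finite.
    logits = torch.atanh(sim.clamp(-1 + eps, 1 - eps))

    # The positive for row i is column i; standard cross-entropy over each row.
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```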