Improvement and Enhancement of YOLOv5 Small Target Recognition Based on
Multi-module Optimization
- URL: http://arxiv.org/abs/2310.01806v1
- Date: Tue, 3 Oct 2023 05:39:36 GMT
- Title: Improvement and Enhancement of YOLOv5 Small Target Recognition Based on
Multi-module Optimization
- Authors: Qingyang Li and Yuchen Li and Hongyi Duan and JiaLiang Kang and Jianan
Zhang and Xueqian Gan and Ruotong Xu
- Abstract summary: The performance of the model is successfully enhanced by introducing a GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer attention mechanisms, and a loss function improved using NWD.
The improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests.
- Score: 9.125818713673366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, the limitations of the YOLOv5s model on the small target
detection task are studied in depth and addressed. The performance of the model is
successfully enhanced by introducing a GhostNet-based convolutional module,
RepGFPN-based Neck module optimization, CA and Transformer attention
mechanisms, and a loss function improved using NWD. The experimental results
validate the positive impact of these improvement strategies on model
precision, recall and mAP. In particular, the improved model shows significant
superiority in dealing with complex backgrounds and tiny targets in real-world
application tests. This study provides an effective optimization strategy for
the YOLOv5s model on small target detection, and lays a solid foundation for
future related research and applications.
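The abstract names NWD without defining it. As a rough illustration only, the sketch below assumes NWD refers to the Normalized Wasserstein Distance as commonly formulated for tiny-object detection: each box (cx, cy, w, h) is modeled as a 2D Gaussian, and similarity decays exponentially with the Wasserstein distance between the two Gaussians. The function names, the default constant `c`, and the `1 - NWD` loss form are assumptions for illustration, not the authors' implementation; how the paper combines NWD with the existing YOLOv5s box loss is not stated in this listing.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes given as (cx, cy, w, h).

    Each box is treated as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). `c` is a dataset-dependent normalizing constant;
    the default here is an assumed placeholder, not a value from the paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two axis-aligned Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Hypothetical loss form: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, this similarity stays non-zero and keeps changing smoothly when two small boxes do not overlap at all, which is the usual motivation for using it on tiny targets.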
Related papers
- Spatial Transformer Network YOLO Model for Agricultural Object Detection [0.3124884279860061]
We propose a new method that integrates spatial transformer networks (STNs) into YOLO to improve performance.
The proposed STN-YOLO aims to enhance the model's effectiveness by focusing on important areas of the image.
We apply the STN-YOLO on benchmark datasets for Agricultural object detection as well as a new dataset from a state-of-the-art plant phenotyping greenhouse facility.
arXiv Detail & Related papers (2024-07-31T14:53:41Z)
- YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision [0.6662800021628277]
This paper traces the evolution of the YOLO (You Only Look Once) object detection algorithm, focusing on YOLOv5, YOLOv8, and YOLOv10.
We analyze the architectural advancements, performance improvements, and suitability for edge deployment across these versions.
arXiv Detail & Related papers (2024-07-03T10:40:20Z)
- Precision and Adaptability of YOLOv5 and YOLOv8 in Dynamic Robotic Environments [0.0]
This study provides a comparative analysis of YOLOv5 and YOLOv8 models.
Contrary to initial expectations, YOLOv5 models demonstrated comparable, and in some cases superior, precision in object detection tasks.
arXiv Detail & Related papers (2024-06-01T06:17:43Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
- YOLOv5s-GTB: light-weighted and improved YOLOv5s for bridge crack detection [0.0]
This study proposes a lightweight, high-precision, deep learning-based bridge apparent crack recognition model that can be deployed on mobile devices.
YOLOv5 is identified as the basic framework for the lightweight crack detection model through comparison and validation experiments.
The improved model not only has 42% fewer parameters and faster inference, but also significantly outperforms the original model in terms of accuracy and mAP.
arXiv Detail & Related papers (2022-06-03T10:52:59Z)
- MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks [37.529217646431825]
In goal-oriented reinforcement learning, relabeling the raw goals in past experience to provide agents with hindsight ability is a major solution to the reward sparsity problem.
We develop FGI (Foresight Goal Inference), a new relabeling strategy that relabels the goals by looking into the future with a learned dynamics model.
To improve sample efficiency, we propose to use the dynamics model to generate simulated trajectories for policy training.
arXiv Detail & Related papers (2021-05-13T15:07:23Z)
- Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.