YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification
- URL: http://arxiv.org/abs/2305.04170v6
- Date: Sun, 13 Oct 2024 12:58:36 GMT
- Title: YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification
- Authors: Lin Huang, Weisheng Li, Yujuan Tan, Linlin Shen, Jing Yu, Haojie Fu
- Abstract summary: We introduce two innovative modules for backbone and head networks: the Dense Channel Compression for Feature Spatial Solidification Structure (DCFS) and the Asymmetric Multi-Level Compression Decoupled Head (ADH).
When integrated into the YOLOv5 model, these two modules demonstrate exceptional performance, resulting in a modified model referred to as YOLOCS.
- Score: 38.49525419649799
- Abstract: In this study, we examine the associations between channel features and convolutional kernels during the processes of feature purification and gradient backpropagation, with a focus on the forward and backward propagation within the network. Consequently, we propose a method called Dense Channel Compression for Feature Spatial Solidification. Drawing upon the central concept of this method, we introduce two innovative modules for backbone and head networks: the Dense Channel Compression for Feature Spatial Solidification Structure (DCFS) and the Asymmetric Multi-Level Compression Decoupled Head (ADH). When integrated into the YOLOv5 model, these two modules demonstrate exceptional performance, resulting in a modified model referred to as YOLOCS. Evaluated on the MSCOCO dataset, the large, medium, and small YOLOCS models yield AP of 50.1%, 47.6%, and 42.5%, respectively. Maintaining inference speeds remarkably similar to those of the YOLOv5 model, the large, medium, and small YOLOCS models surpass the YOLOv5 model's AP by 1.1%, 2.3%, and 5.2%, respectively.
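The abstract does not give implementation details, but the core contrast behind dense channel compression can be sketched: compressing channels with a 3x3 kernel aggregates a spatial neighborhood into each compressed output value, whereas a 1x1 pointwise compression mixes channels at a single location only. The following numpy sketch illustrates that contrast; the naive `conv2d` helper and all shapes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def conv2d(x, w):
    """Naive 'same'-padded 2-D cross-correlation.
    x: (Cin, H, W); w: (Cout, Cin, K, K) -> (Cout, H, W)."""
    Cout, Cin, K, _ = w.shape
    _, H, W = x.shape
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((Cout, H, W))
    for o in range(Cout):
        for i in range(H):
            for j in range(W):
                # each output value sums over all input channels
                # and a K x K spatial window
                out[o, i, j] = (xp[:, i:i + K, j:j + K] * w[o]).sum()
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 6))       # 8-channel feature map
w1 = rng.standard_normal((4, 8, 1, 1))   # pointwise compression: 8 -> 4 channels
w3 = rng.standard_normal((4, 8, 3, 3))   # "dense" compression: 8 -> 4 channels,
                                         # fusing a 3x3 spatial neighborhood
y1 = conv2d(x, w1)
y3 = conv2d(x, w3)
```

Both paths halve the channel count, but only the 3x3 path folds spatial context into the compressed features, which is the intuition the paper builds its DCFS structure around.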
Related papers
- Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule [50.260693393896716]
Diffusion models are cutting-edge generative models adept at producing diverse, high-quality images.
Recent techniques have been employed to automatically search for faster generation processes.
We introduce Flexiffusion, a novel training-free NAS paradigm designed to accelerate diffusion models.
arXiv Detail & Related papers (2024-09-26T06:28:05Z)
- A Simple and Generalist Approach for Panoptic Segmentation [57.94892855772925]
Generalist vision models aim for one and the same architecture for a variety of vision tasks.
While such a shared architecture may seem attractive, generalist models tend to be outperformed by their bespoke counterparts.
We address this problem by introducing two key contributions, without compromising the desirable properties of generalist models.
arXiv Detail & Related papers (2024-08-29T13:02:12Z)
- FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules [0.6047429555885261]
This paper introduces an efficient Fine-grained Multi-scale Dynamic Selection Module (FMDS Module) and an Adaptive Gated Multi-branch Focus Fusion Module (AGMF Module).
FMDS Module applies a more effective dynamic feature selection and fusion method on fine-grained multi-scale feature maps.
AGMF Module utilizes multiple parallel branches to perform complementary fusion of various features captured by the gated unit branch, FMDS Module branch, and Triplet branch.
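As a rough illustration of gated multi-branch fusion in general (not FA-YOLO's actual AGMF Module; the gating scheme and shapes below are assumptions), parallel branch outputs can be weighted by learned sigmoid gates and summed:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(branches, gate_w):
    """Fuse parallel branch outputs with normalized sigmoid gates.

    branches: list of B feature maps, each (C, H, W)
    gate_w:   (B, C) gating weights, one row per branch
    """
    stacked = np.stack(branches)                     # (B, C, H, W)
    pooled = stacked.mean(axis=(2, 3))               # global avg pool: (B, C)
    gates = sigmoid((gate_w * pooled).sum(axis=1))   # one scalar gate per branch
    gates = gates / gates.sum()                      # contributions sum to 1
    return (gates[:, None, None, None] * stacked).sum(axis=0)

# demo: two branches with zero gating weights -> an equal 0.5 / 0.5 mix
fused = gated_fusion([np.ones((4, 8, 8)), 3 * np.ones((4, 8, 8))],
                     np.zeros((2, 4)))
```

The gates let the network learn, per input, how much each branch should contribute to the fused feature map, which is the complementary-fusion idea the summary describes.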
arXiv Detail & Related papers (2024-08-29T07:22:16Z)
- CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation [60.61972883059688]
CriDiff is a two-stage feature injecting framework with a Crisscross Injection Strategy (CIS) and a Generative Pre-train (GP) approach for prostate segmentation.
To effectively learn multiple levels of edge and non-edge features, we propose two parallel conditioners in the CIS.
The GP approach eases the inconsistency between the image features and the diffusion model without adding additional parameters.
arXiv Detail & Related papers (2024-06-20T10:46:50Z)
- Fostc3net: A Lightweight YOLOv5 Based On the Network Structure Optimization [11.969138981034247]
This paper presents an enhanced lightweight YOLOv5 technique customized for mobile devices.
The proposed model achieves a 1% increase in detection accuracy, a 13% reduction in FLOPs, and a 26% decrease in model parameters compared to the existing YOLOv5.
arXiv Detail & Related papers (2024-03-20T16:07:04Z)
- YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 [19.388112026410045]
YOLO-TLA is an advanced object detection model building on YOLOv5.
We first introduce an additional detection layer for small objects in the neck network pyramid architecture.
The model also employs sliding-window feature extraction, which effectively minimizes both computational demand and the number of parameters.
arXiv Detail & Related papers (2024-02-22T05:55:17Z)
- ASF-YOLO: A Novel YOLO Model with Attentional Scale Sequence Fusion for Cell Instance Segmentation [6.502259209532815]
We propose an Attentional Scale Sequence Fusion based You Only Look Once (YOLO) framework (ASF-YOLO).
It combines spatial and scale features for accurate and fast cell instance segmentation.
It achieves a box mAP of 0.91, mask mAP of 0.887, and an inference speed of 47.3 FPS on the 2018 Data Science Bowl dataset.
arXiv Detail & Related papers (2023-12-11T15:47:12Z)
- HIC-YOLOv5: Improved YOLOv5 For Small Object Detection [2.4780916008623834]
An improved YOLOv5 model, HIC-YOLOv5, is proposed to address small object detection problems.
An involution block is adopted between the backbone and neck to increase channel information of the feature map.
Results show that HIC-YOLOv5 improves mAP@[.5:.95] by 6.42% and mAP@0.5 by 9.38% on the VisDrone-2019-DET dataset.
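Involution (Li et al., CVPR 2021) generates a small kernel at each spatial location from that location's own feature vector and shares it across channel groups, which is how it enriches channel information cheaply. A naive numpy sketch of the operation follows; the linear kernel generator `w_gen` is a deliberate simplification of the real bottleneck-style generator.

```python
import numpy as np

def involution(x, w_gen, K=3, groups=1):
    """Toy involution: per-pixel kernels generated from the input itself.

    x:      (C, H, W) feature map
    w_gen:  (groups * K * K, C) linear map producing a K x K kernel
            per group from each pixel's channel vector
    """
    C, H, W = x.shape
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    cg = C // groups                      # channels per group share a kernel
    for i in range(H):
        for j in range(W):
            # kernel for this location, shaped (groups, K, K)
            kern = (w_gen @ x[:, i, j]).reshape(groups, K, K)
            patch = xp[:, i:i + K, j:j + K]            # (C, K, K)
            for g in range(groups):
                out[g * cg:(g + 1) * cg, i, j] = (
                    patch[g * cg:(g + 1) * cg] * kern[g]
                ).sum(axis=(1, 2))
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 5, 5))
w_gen = rng.standard_normal((1 * 9, 4)) * 0.1
y = involution(x, w_gen, K=3, groups=1)
```

Unlike convolution, the kernel here varies across spatial positions but is shared across channels within a group, inverting convolution's usual sharing pattern.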
arXiv Detail & Related papers (2023-09-28T12:40:36Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
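DIA's key idea is sharing one attention module's parameters across all layers; the paper uses an LSTM, but the sharing pattern itself can be shown with a simpler squeeze-and-excitation-style module. This substitution, the class, and all shapes below are illustrative assumptions, not DIA's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SharedChannelAttention:
    """One SE-style attention module whose weights are reused at every layer."""
    def __init__(self, channels, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((channels, channels)) * 0.1

    def __call__(self, x):
        s = x.mean(axis=(1, 2))           # squeeze: per-channel statistics (C,)
        a = sigmoid(self.w @ s)           # excite: per-channel weights in (0, 1)
        return x * a[:, None, None]       # recalibrate the feature map

att = SharedChannelAttention(channels=8)
layer_outputs = [np.random.default_rng(i).standard_normal((8, 4, 4))
                 for i in range(3)]
# the SAME module instance modulates every layer's features,
# instead of each layer owning a private attention module
recalibrated = [att(x) for x in layer_outputs]
```

Sharing one module this way is what makes the per-layer attention maps coupled, matching the strong cross-layer correlation the paper observes.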
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.