Related papers: FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules

URL: http://arxiv.org/abs/2408.16313v1
Date: Thu, 29 Aug 2024 07:22:16 GMT
Title: FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules
Authors: Yukang Huo, Mingyuan Yao, Qingbin Tian, Tonghao Wang, Ruifeng Wang, Haihua Wang
Abstract summary: This paper introduces an efficient Fine-grained Multi-scale Dynamic Selection Module (FMDS Module) and an Adaptive Gated Multi-branch Focus Fusion Module (AGMF Module). The FMDS Module applies a more effective dynamic feature selection and fusion method on fine-grained multi-scale feature maps. The AGMF Module uses multiple parallel branches to perform complementary fusion of the features captured by the gated unit branch, FMDS Module branch, and TripletAttention branch.
Score: 0.6047429555885261
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Over the past few years, the YOLO series of models has emerged as one of the dominant methodologies in the realm of object detection. Many studies have advanced these baseline models by modifying their architectures, enhancing data quality, and developing new loss functions. However, current models still exhibit deficiencies in processing feature maps, such as overlooking the fusion of cross-scale features and relying on a static fusion approach that lacks the capability for dynamic feature adjustment. To address these issues, this paper introduces an efficient Fine-grained Multi-scale Dynamic Selection Module (FMDS Module), which applies a more effective dynamic feature selection and fusion method on fine-grained multi-scale feature maps, significantly enhancing the detection accuracy of small, medium, and large-sized targets in complex environments. Furthermore, this paper proposes an Adaptive Gated Multi-branch Focus Fusion Module (AGMF Module), which utilizes multiple parallel branches to perform complementary fusion of various features captured by the gated unit branch, FMDS Module branch, and TripletAttention branch. This approach further enhances the comprehensiveness, diversity, and integrity of feature fusion. This paper integrates the FMDS Module and the AGMF Module into YOLOv9 to develop a novel object detection model named FA-YOLO. Extensive experimental results show that under identical experimental conditions, FA-YOLO achieves an outstanding 66.1% mean Average Precision (mAP) on the PASCAL VOC 2007 dataset, representing a 1.0% improvement over YOLOv9's 65.1%. Additionally, the detection accuracies of FA-YOLO for small, medium, and large targets are 44.1%, 54.6%, and 70.8%, respectively, showing improvements of 2.0%, 3.1%, and 0.9% compared to YOLOv9's 42.1%, 51.5%, and 69.9%.
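
The improvements quoted in the abstract appear to be absolute differences in percentage points rather than relative gains. A minimal sketch that checks this reading, using only the scores stated in the abstract (the names `reported`, `gains`, and `summary` are illustrative and not from the paper):

```python
# Reported scores (percent) taken from the abstract, as (FA-YOLO, YOLOv9).
reported = {
    "mAP": (66.1, 65.1),
    "small": (44.1, 42.1),
    "medium": (54.6, 51.5),
    "large": (70.8, 69.9),
}


def gains(new, base):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(new - base, 1)
    relative = round(100.0 * (new - base) / base, 2)
    return absolute, relative


# Map each metric name to its (absolute, relative) gain.
summary = {name: gains(new, base) for name, (new, base) in reported.items()}

for name, (absolute, relative) in summary.items():
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

Under this reading, the absolute gains match the 1.0, 2.0, 3.1, and 0.9 figures in the abstract, while the relative gains are somewhat larger because each baseline is below 100%.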

Related papers

SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery [5.639904484784127]
Experimental results demonstrate that SOD-YOLO significantly improves detection performance. SOD-YOLO is a practical and efficient solution for small object detection in UAV imagery.
arXiv Detail & Related papers (2025-07-17T02:04:54Z)
Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
Efficient Federated Learning with Timely Update Dissemination [54.668309196009204]
Federated Learning (FL) has emerged as a compelling methodology for the management of distributed data. We propose an efficient FL approach that capitalizes on additional downlink bandwidth resources to ensure timely update dissemination.
arXiv Detail & Related papers (2025-07-08T14:34:32Z)
YOLOv11-RGBT: Towards a Comprehensive Single-Stage Multispectral Object Detection Framework [23.587408383983163]
We present YOLOv11-RGBT, a new comprehensive multimodal object detection framework. We designed six multispectral fusion modes and successfully applied them to models from YOLOv3 to YOLOv12 and RT-DETR. Experiments show our framework excels on three major open-source multispectral object detection datasets, such as LLVIP and FLIR.
arXiv Detail & Related papers (2025-06-17T16:37:00Z)
CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection [65.96556073745197]
DiverSified File selection algorithm (DiSF) is proposed to select the most decorrelated text files in the feature space. DiSF saves 98.5% of 590M training files in SlimPajama, outperforming the full-data pre-training within a 50B training budget.
arXiv Detail & Related papers (2025-04-29T11:13:18Z)
MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View [0.0]
We propose a novel object detection network, Multi-scale Context Aggregation and Scale-adaptive Fusion YOLO (MASF-YOLO). To tackle the difficulty of detecting small objects in UAV images, we design a Multi-scale Feature Aggregation Module (MFAM), which significantly improves the detection accuracy of small objects. Thirdly, we introduce a Dimension-Aware Selective Integration Module (DASI), which further enhances multi-scale feature fusion capabilities.
arXiv Detail & Related papers (2025-04-25T07:43:33Z)
YOLO-RS: Remote Sensing Enhanced Crop Detection Methods [0.32985979395737786]
Existing target detection methods show poor performance when dealing with small targets in remote sensing images. YOLO-RS is based on the latest YOLOv11, which significantly enhances the detection of small targets. Experiments validate the effectiveness and application potential of YOLO-RS in the task of detecting small targets in remote sensing images.
arXiv Detail & Related papers (2025-04-15T13:13:22Z)
Enhancing and Accelerating Diffusion-Based Inverse Problem Solving through Measurements Optimization [66.17291150498276]
We introduce Measurements Optimization (MO), a more efficient plug-and-play module for integrating measurement information at each step of the inverse problem-solving process. By using MO, we establish state-of-the-art (SOTA) performance across multiple tasks, with key advantages.
arXiv Detail & Related papers (2024-12-05T07:44:18Z)
Building Math Agents with Multi-Turn Iterative Preference Learning [56.71330214021884]
This paper studies the complementary direct preference learning approach to further improve model performance. Existing direct preference learning algorithms were originally designed for the single-turn chat task. We introduce a multi-turn direct preference learning framework, tailored for this context.
arXiv Detail & Related papers (2024-09-04T02:41:04Z)
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection [3.7793767915135295]
We propose a new model named MAF-YOLO in this paper. It is a novel object detection framework with a versatile neck named Multi-Branch Auxiliary FPN (MAFPN). Taking the nano version of MAF-YOLO as an example, it achieves 42.4% AP on COCO with only 3.76M learnable parameters and 10.51G FLOPs, and outperforms YOLOv8n by about 5.1%.
arXiv Detail & Related papers (2024-07-05T09:35:30Z)
SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection [59.868772767818975]
We propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++. Specifically, we observe that objects in aerial images usually exhibit arbitrary orientations, small scales, and aggregation. Extensive experiments conducted on various multi-oriented object datasets under various labeled settings demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-01T07:03:51Z)
YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 [19.388112026410045]
YOLO-TLA is an advanced object detection model building on YOLOv5. We first introduce an additional detection layer for small objects in the neck network pyramid architecture. This module uses sliding window feature extraction, which effectively minimizes both computational demand and the number of parameters.
arXiv Detail & Related papers (2024-02-22T05:55:17Z)
ASF-YOLO: A Novel YOLO Model with Attentional Scale Sequence Fusion for Cell Instance Segmentation [6.502259209532815]
We propose an Attentional Scale Sequence Fusion based You Only Look Once (YOLO) framework (ASF-YOLO). It combines spatial and scale features for accurate and fast cell instance segmentation. It achieves a box mAP of 0.91, mask mAP of 0.887, and an inference speed of 47.3 FPS on the 2018 Data Science Bowl dataset.
arXiv Detail & Related papers (2023-12-11T15:47:12Z)
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS. We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets. Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection [54.52102265418295]
We propose a novel and effective Multi-Level Fusion network, named MLF-DET, for high-performance cross-modal 3D object DETection. For the feature-level fusion, we present the Multi-scale Voxel Image fusion (MVI) module, which densely aligns multi-scale voxel features with image features. For the decision-level fusion, we propose the lightweight Feature-cued Confidence Rectification (FCR) module, which exploits image semantics to rectify the confidence of detection candidates.
arXiv Detail & Related papers (2023-07-18T11:26:02Z)
YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification [38.49525419649799]
We introduce two innovative modules for backbone and head networks: the Dense Channel Compression for Feature Spatial Solidification Structure (DCFS) and the Asymmetric Multi-Level Compression Decoupled Head (ADH). When integrated into the YOLOv5 model, these two modules demonstrate exceptional performance, resulting in a modified model referred to as YOLOCS.
arXiv Detail & Related papers (2023-05-07T03:00:06Z)
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm [111.17100512647619]
This paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA). We propose a novel pyramid EATFormer backbone that only contains the proposed EA-based transformer (EAT) block. Massive quantitative and qualitative experiments on image classification, downstream tasks, and explanatory experiments demonstrate the effectiveness and superiority of our approach.
arXiv Detail & Related papers (2022-06-19T04:49:35Z)
LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray Image [7.970559381165446]
We propose a weld defect detection method based on a convolutional neural network (CNN), namely Lighter and Faster YOLO (LF-YOLO). To improve the performance of the detection network, we propose an efficient feature extraction (EFE) module. Experimental results show that our weld defect network achieves a satisfactory balance between performance and consumption, and reaches 92.9 mAP50 with 61.5 FPS.
arXiv Detail & Related papers (2021-10-28T12:19:32Z)
Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
The main purpose of RGB-D salient object detection (SOD) is to better integrate and utilize cross-modal fusion information. In this paper, we explore these issues from a new perspective. We implement a more flexible and efficient kind of multi-scale cross-modal feature processing.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.