mAPm: multi-scale Attention Pyramid module for Enhanced scale-variation
in RLD detection
- URL: http://arxiv.org/abs/2402.16291v1
- Date: Mon, 26 Feb 2024 04:18:42 GMT
- Title: mAPm: multi-scale Attention Pyramid module for Enhanced scale-variation
in RLD detection
- Authors: Yunusa Haruna, Shiyin Qin, Abdulrahman Hamman Adama Chukkol, Isah
Bello, Adamu Lawan
- Abstract summary: mAPm is a novel approach that integrates dilated convolutions into the Feature Pyramid Network (FPN) to enhance multi-scale information extraction.
We evaluate mAPm on YOLOv7 using the MRLD and COCO datasets.
- Score: 0.3499870393443268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting objects across various scales remains a significant challenge in
computer vision, particularly in tasks such as Rice Leaf Disease (RLD)
detection, where objects exhibit considerable scale variations. Traditional
object detection methods often struggle to address these variations, resulting
in missed detections or reduced accuracy. In this study, we propose the
multi-scale Attention Pyramid module (mAPm), a novel approach that integrates
dilated convolutions into the Feature Pyramid Network (FPN) to enhance
multi-scale information extraction. Additionally, we incorporate a global
Multi-Head Self-Attention (MHSA) mechanism and a deconvolutional layer to
refine the up-sampling process. We evaluate mAPm on YOLOv7 using the MRLD and
COCO datasets. Compared to vanilla FPN, BiFPN, NAS-FPN, PANET, and ACFPN, mAPm
achieved a significant improvement in Average Precision (AP), with a +2.61%
increase on the MRLD dataset compared to the baseline FPN method in YOLOv7.
This demonstrates its effectiveness in handling scale variations. Furthermore,
the versatility of mAPm allows its integration into various FPN-based object
detection models, showcasing its potential to advance object detection
techniques.
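
The abstract names two building blocks whose arithmetic is easy to check by hand: dilated convolutions, which widen the receptive field without adding weights, and a deconvolutional (transposed convolution) layer used for up-sampling. The following is a minimal pure-Python sketch of that arithmetic only, not the authors' implementation. The kernel sizes, dilation rates, strides, and function names are illustrative assumptions; the abstract does not specify them, and the MHSA component is not covered here.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input positions) covered by one dilated kernel along an axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated cross-correlation with no padding and stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` positions apart in the input.
        out.append(sum(w * signal[start + i * dilation] for i, w in enumerate(kernel)))
    return out


def transposed_conv_output_size(size, kernel_size, stride, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding


if __name__ == "__main__":
    # Hypothetical dilation rates, chosen for illustration only.
    for d in (1, 2, 3):
        print("dilation", d, "-> span", effective_kernel_size(3, d))
    print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], dilation=2))
    # A 4-wide kernel with stride 2 and padding 1 doubles the spatial size.
    print(transposed_conv_output_size(20, kernel_size=4, stride=2, padding=1))
```

With a 3-tap kernel, dilation rates 1, 2, and 3 cover spans of 3, 5, and 7 input positions using the same three weights, which is why dilation is a cheap way to mix context at several scales. The transposed-convolution setting shown maps a length of 20 to 40, the 2x up-sampling step an FPN top-down path needs.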
Related papers
- PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model [76.95536611263356]
PolSAR data presents unique challenges due to its rich and complex characteristics.
Existing data representations, such as complex-valued data, polarimetric features, and amplitude images, are widely used.
Most feature extraction networks for PolSAR are small, limiting their ability to capture features effectively.
We propose the Polarimetric Scattering Mechanism-Informed SAM (PolSAM), an enhanced Segment Anything Model (SAM) that integrates domain-specific scattering characteristics and a novel prompt generation strategy.
arXiv Detail & Related papers (2024-12-17T09:59:53Z)
- Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection [3.7793767915135295]
We propose a new model named MAF-YOLO in this paper.
It is a novel object detection framework with a versatile neck named Multi-Branch Auxiliary FPN (MAFPN).
Taking the nano version of MAF-YOLO as an example, it achieves 42.4% AP on COCO with only 3.76M learnable parameters and 10.51G FLOPs, outperforming YOLOv8n by about 5.1%.
arXiv Detail & Related papers (2024-07-05T09:35:30Z)
- Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing [5.3598912592106345]
Deep learning has led to significant advances in bearing fault diagnosis (FD).
We propose a novel FD model by integrating multiscale quaternion convolutional neural network (MQCNN), bidirectional gated recurrent unit (BiGRU), and cross self-attention feature fusion (CSAFF).
arXiv Detail & Related papers (2024-05-25T07:55:02Z)
- MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection [54.545054873239295]
Deepfakes have recently raised significant trust issues and security concerns among the public.
ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance.
This work introduces Mixture-of-Experts modules for Face Forgery Detection (MoE-FFD), a generalized yet parameter-efficient ViT-based approach.
arXiv Detail & Related papers (2024-04-12T13:02:08Z)
- Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects [69.39099029406248]
We propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network.
JAFFNet mainly incorporates a joint attention-guided feature fusion (JAFF) module into decoding stages to adaptively fuse low-level and high-level features.
Experiments conducted on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance in comparison with other state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T08:10:16Z)
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray Image [7.970559381165446]
We propose a weld defect detection method based on a convolutional neural network (CNN), namely Lighter and Faster YOLO (LF-YOLO).
To improve the performance of the detection network, we propose an efficient feature extraction (EFE) module.
Experimental results show that our weld defect network achieves a satisfactory balance between performance and consumption, reaching 92.9 mAP50 at 61.5 FPS.
arXiv Detail & Related papers (2021-10-28T12:19:32Z)
- Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
The main purpose of RGB-D salient object detection (SOD) is to better integrate and utilize cross-modal fusion information.
In this paper, we explore these issues from a new perspective.
We implement more flexible and efficient multi-scale cross-modal feature processing.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)
- Random Partitioning Forest for Point-Wise and Collective Anomaly Detection -- Application to Intrusion Detection [9.74672460306765]
DiFF-RF is an ensemble approach composed of random partitioning binary trees to detect anomalies.
Our experiments show that DiFF-RF almost systematically outperforms the isolation forest (IF) algorithm.
Our experience shows that DiFF-RF can work well in the presence of small-scale learning data.
arXiv Detail & Related papers (2020-06-29T10:44:08Z)
- Salient Object Detection Combining a Self-attention Module and a Feature Pyramid Network [10.81245352773775]
We propose a novel pyramid self-attention module (PSAM) and the adoption of an independent feature-complementing strategy.
In PSAM, self-attention layers are equipped after multi-scale pyramid features to capture richer high-level features and bring larger receptive fields to the model.
arXiv Detail & Related papers (2020-04-30T03:08:34Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.