Related papers: Real-Time Dynamic Scale-Aware Fusion Detection Network: Take Road Damage Detection as an example

URL: http://arxiv.org/abs/2409.02546v1
Date: Wed, 4 Sep 2024 09:03:47 GMT
Title: Real-Time Dynamic Scale-Aware Fusion Detection Network: Take Road Damage Detection as an example
Authors: Weichao Pan, Xu Wang, Wenqing Huan
Abstract summary: Road Damage Detection (RDD) is important for daily maintenance and safety in cities. Current UAV-based RDD research still faces many challenges. We design a multi-scale, adaptive road damage detection model with the ability to automatically remove background interference.
Score: 3.334973867478745
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Unmanned Aerial Vehicle (UAV)-based Road Damage Detection (RDD) is important for daily maintenance and safety in cities, especially because it significantly reduces labor costs. However, current UAV-based RDD research still faces many challenges. For example, damage of irregular size and direction, the masking of damage by the background, and the difficulty of distinguishing damage from the background significantly affect the ability of UAVs to detect road damage in daily inspection. To solve these problems and improve the performance of UAVs in real-time road damage detection, we design and propose three corresponding modules: a feature extraction module that flexibly adapts to shape and background; a module that fuses multiscale perception and adapts to shape and background; and an efficient downsampling module. Based on these modules, we designed a multi-scale, adaptive road damage detection model with the ability to automatically remove background interference, called Dynamic Scale-Aware Fusion Detection Model (RT-DSAFDet). Experimental results on the UAV-PDD2023 public dataset show that our model RT-DSAFDet achieves a mAP50 of 54.2%, which is 11.1% higher than that of YOLOv10-m, an efficient variant of the latest real-time object detection model YOLOv10, while the number of parameters is reduced to 1.8M and FLOPs to 4.6G, reductions of 88% and 93%, respectively. Furthermore, results on the large general-purpose public object detection dataset MS COCO2017 also show the superiority of our model: its mAP50-95 matches that of YOLOv9-t, with 0.5% higher mAP50, 10% fewer parameters, and 40% fewer FLOPs.

Related papers

MRS-YOLO Railroad Transmission Line Foreign Object Detection Based on Improved YOLO11 and Channel Pruning [2.6795746856835785]
We propose an improved algorithm MRS-YOLO based on YOLO11. The mAP50 and mAP50:95 of the MRS-YOLO algorithm are improved to 94.8% and 86.4%, respectively.
arXiv Detail & Related papers (2025-10-12T11:38:09Z)
YOLOv11-Litchi: Efficient Litchi Fruit Detection based on UAV-Captured Agricultural Imagery in Complex Orchard Environments [6.862722449907841]
This paper introduces YOLOv11-Litchi, a lightweight and robust detection model specifically designed for UAV-based litchi detection. YOLOv11-Litchi achieves a parameter size of 6.35 MB, 32.5% smaller than the YOLOv11 baseline. The model achieves a frame rate of 57.2 FPS, meeting real-time detection requirements.
arXiv Detail & Related papers (2025-10-11T09:44:00Z)
Enhancing Vehicle Detection under Adverse Weather Conditions with Contrastive Learning [4.675616844059]
We propose a sideload-CL-adaptation framework to improve vehicle detection using lightweight models. Our proposed sideload-CL-adaptation model improves the detection performance by 3.8% to 9.5% in terms of mAP50 on the NVD dataset.
arXiv Detail & Related papers (2025-09-26T05:55:41Z)
YOLO-ROC: A High-Precision and Ultra-Lightweight Model for Real-Time Road Damage Detection [0.0]
Road damage detection is a critical task for ensuring traffic safety and maintaining infrastructure integrity. This paper proposes a high-precision and lightweight model, YOLO - Road Orthogonal Compact (YOLO-ROC).
arXiv Detail & Related papers (2025-07-31T03:35:19Z)
SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery [5.639904484784127]
Experimental results demonstrate that SOD-YOLO significantly improves detection performance. SOD-YOLO is a practical and efficient solution for small object detection in UAV imagery.
arXiv Detail & Related papers (2025-07-17T02:04:54Z)
Practical Manipulation Model for Robust Deepfake Detection [55.2480439325792]
We develop a more real-world degradation model in the area of image super-resolution. We extend the space of pseudo-fakes by using Poisson blending, more diverse masks, generator artifacts, and distractors. We show clear increases of 3.51% and 6.21% AUC on the DFDC and DFDCP datasets, respectively.
arXiv Detail & Related papers (2025-06-05T15:06:16Z)
MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View [0.0]
We propose a novel object detection network, Multi-scale Context Aggregation and Scale-adaptive Fusion YOLO (MASF-YOLO). To tackle the difficulty of detecting small objects in UAV images, we design a Multi-scale Feature Aggregation Module (MFAM), which significantly improves the detection accuracy of small objects. We also introduce a Dimension-Aware Selective Integration Module (DASI), which further enhances multi-scale feature fusion capabilities.
arXiv Detail & Related papers (2025-04-25T07:43:33Z)
RemDet: Rethinking Efficient Model Design for UAV Object Detection [12.652666443395528]
Object detection in Unmanned Aerial Vehicle (UAV) images has emerged as a focal area of research. Current real-time object detectors are not optimized for UAV images. We propose a novel detector, RemDet, to address these challenges.
arXiv Detail & Related papers (2024-12-13T11:00:57Z)
SL-YOLO: A Stronger and Lighter Drone Target Detection Model [0.0]
This paper proposes a revolutionary model SL-YOLO (Stronger and Lighter YOLO) that aims to break the bottleneck of small target detection. We propose a pioneering cross-scale feature fusion method that can ensure unparalleled detection accuracy even in the most challenging environments. Our experimental results on the VisDrone 2019 dataset reveal a significant improvement in performance, with mAP@0.5 jumping from 43.0% to 46.9%. The model parameters are reduced from 11.1M to 9.6M, and the FPS can reach 132, making it an ideal solution for real-time small object detection in resource-constrained environments.
arXiv Detail & Related papers (2024-11-18T11:26:11Z)
Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector [97.92369017531038]
We build a new laRge-scale Adversarial images dataset with Diverse hArmful Responses (RADAR). We then develop a novel iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector distilled from the hidden states of Visual Language Models (VLMs) to distinguish adversarial images from benign ones in the input.
arXiv Detail & Related papers (2024-10-30T10:33:10Z)
YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection [0.0]
Existing detection methods for insulator defect identification from unmanned aerial vehicles struggle with complex background scenes and small objects. This paper proposes a new attention-based foundation architecture, YOLO-ELA, to address this issue. Experimental results on high-resolution UAV images show that our method achieved a state-of-the-art performance of 96.9% mAP0.5 and a real-time detection speed of 74.63 frames per second.
arXiv Detail & Related papers (2024-10-15T16:00:01Z)
Optimizing YOLO Architectures for Optimal Road Damage Detection and Classification: A Comparative Study from YOLOv7 to YOLOv10 [0.0]
This paper presents a comprehensive workflow for road damage detection using deep learning models. To accommodate hardware limitations, large images are cropped, and lightweight models are utilized. The proposed approach employs multiple model architectures, including a custom YOLOv7 model with Coordinate Attention layers and a Tiny YOLOv7 model.
arXiv Detail & Related papers (2024-10-10T22:55:12Z)
DAPONet: A Dual Attention and Partially Overparameterized Network for Real-Time Road Damage Detection [4.185368042845483]
We propose DAPONet to enhance real-time road damage detection using street view image data (SVRDD). DAPONet achieves a mAP50 of 70.1% on the SVRDD dataset, outperforming YOLOv10n by 10.4%, while reducing parameters to 1.6M and FLOPs to 1.7G, representing reductions of 41% and 80%, respectively. On the MS COCO 2017 val dataset, DAPONet achieves an mAP50-95 of 33.4%, 0.8% higher than EfficientDet-D1, with a 74% reduction in both parameters and FLOPs.
arXiv Detail & Related papers (2024-09-03T04:53:32Z)
From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution [4.107182710549721]
We present an innovative approach that combines super-resolution and an adapted lightweight YOLOv5 architecture. Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects.
arXiv Detail & Related papers (2024-01-26T05:50:58Z)
Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely-used scene representations for visual perception in Autonomous Vehicles (AVs). Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV. Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head. The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement. This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heights [50.52704854147297]
We present a new vision transformer (ViT) model optimized with a classification (discrete) and a continuous loss function. This model achieves better accuracy than previously used convolutional based approaches (ConvNets) optimized with only a continuous loss function.
arXiv Detail & Related papers (2023-04-22T22:39:03Z)
Pushing the Limits of Fewshot Anomaly Detection in Industry Vision: Graphcore [71.09522172098733]
We utilize graph representation in FSAD and provide a novel visual invariant feature (VIIF) as anomaly measurement feature. VIIF can robustly improve the anomaly discriminating ability and can further reduce the size of redundant features stored in M. Besides, we provide a novel model GraphCore via VIIFs that can fast implement unsupervised FSAD training and can improve the performance of anomaly detection.
arXiv Detail & Related papers (2023-01-28T03:58:32Z)
Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars. They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR).
arXiv Detail & Related papers (2021-07-14T21:10:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.