YOLO-ROC: A High-Precision and Ultra-Lightweight Model for Real-Time Road Damage Detection
- URL: http://arxiv.org/abs/2507.23225v1
- Date: Thu, 31 Jul 2025 03:35:19 GMT
- Title: YOLO-ROC: A High-Precision and Ultra-Lightweight Model for Real-Time Road Damage Detection
- Authors: Zicheng Lin, Weichao Pan,
- Abstract summary: Road damage detection is a critical task for ensuring traffic safety and maintaining infrastructure integrity. This paper proposes a high-precision and lightweight model, YOLO - Road Orthogonal Compact (YOLO-ROC).
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Road damage detection is a critical task for ensuring traffic safety and maintaining infrastructure integrity. While deep learning-based detection methods are now widely adopted, they still face two core challenges: first, the inadequate multi-scale feature extraction capabilities of existing networks for diverse targets like cracks and potholes, leading to high miss rates for small-scale damage; and second, the substantial parameter counts and computational demands of mainstream models, which hinder their deployment for efficient, real-time detection in practical applications. To address these issues, this paper proposes a high-precision and lightweight model, YOLO - Road Orthogonal Compact (YOLO-ROC). We designed a Bidirectional Multi-scale Spatial Pyramid Pooling Fast (BMS-SPPF) module to enhance multi-scale feature extraction and implemented a hierarchical channel compression strategy to reduce computational complexity. The BMS-SPPF module leverages a bidirectional spatial-channel attention mechanism to improve the detection of small targets. Concurrently, the channel compression strategy reduces the parameter count from 3.01M to 0.89M and GFLOPs from 8.1 to 2.6. Experiments on the RDD2022_China_Drone dataset demonstrate that YOLO-ROC achieves a mAP50 of 67.6%, surpassing the baseline YOLOv8n by 2.11%. Notably, the mAP50 for the small-target D40 category improved by 16.8%, and the final model size is only 2.0 MB. Furthermore, the model exhibits excellent generalization performance on the RDD2022_China_Motorbike dataset.
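The abstract quotes absolute figures for the channel compression strategy (parameters 3.01M to 0.89M, GFLOPs 8.1 to 2.6) but not the relative savings. The short sketch below works out those percentages from the quoted numbers only. The helper name `relative_reduction` is introduced here for illustration and does not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Figures quoted in the abstract: (baseline YOLOv8n, YOLO-ROC).
params_m = (3.01, 0.89)  # parameter count, in millions
gflops = (8.1, 2.6)      # computational cost, in GFLOPs

param_cut = relative_reduction(*params_m)
gflops_cut = relative_reduction(*gflops)

print(f"parameters: {param_cut:.1%} fewer")  # about 70.4%
print(f"GFLOPs:     {gflops_cut:.1%} fewer")  # about 67.9%
```

In other words, the quoted figures correspond to roughly a 70% cut in parameters and a 68% cut in GFLOPs relative to the baseline.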
Related papers
- Geminet: Learning the Duality-based Iterative Process for Lightweight Traffic Engineering in Changing Topologies [53.38648279089736]
Geminet is a lightweight and scalable ML-based TE framework that can handle changing topologies. Its neural network size is only 0.04% to 7% of existing schemes. When trained on large-scale topologies, Geminet consumes under 10 GiB of memory, more than eight times less than the 80-plus GiB required by HARP.
arXiv Detail & Related papers (2025-06-30T09:09:50Z) - A lightweight model FDM-YOLO for small target improvement based on YOLOv8 [0.0]
Small targets are difficult to detect due to their low pixel count, complex backgrounds, and varying shooting angles. This paper focuses on small target detection and explores methods for object detection under low computational constraints.
arXiv Detail & Related papers (2025-03-06T14:06:35Z) - YOLO-MST: Multiscale deep learning method for infrared small target detection based on super-resolution and YOLO [0.18641315013048293]
This paper proposes a deep-learning infrared small target detection method that combines image super-resolution technology with multi-scale observation. The mAP@0.5 detection rates of this method on two public datasets, SIRST and IRIS, reached 96.4% and 99.5% respectively.
arXiv Detail & Related papers (2024-12-27T18:43:56Z) - DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units [1.4447019135112429]
This paper proposes an adaptive tiling method for lightweight and energy-efficient object detection networks, including YOLO-based models and the popular FOMO network.
The proposed tiling enables object detection on low-power MCUs with no compromise on accuracy compared to large-scale detection models.
arXiv Detail & Related papers (2024-10-22T07:37:47Z) - DAPONet: A Dual Attention and Partially Overparameterized Network for Real-Time Road Damage Detection [4.185368042845483]
We propose DAPONet to enhance real-time road damage detection using street view image data (SVRDD).
DAPONet achieves a mAP50 of 70.1% on the SVRDD dataset, outperforming YOLOv10n by 10.4%, while reducing parameters to 1.6M and FLOPs to 1.7G, representing reductions of 41% and 80%, respectively.
On the MS COCO 2017 val dataset, DAPONet achieves an mAP50-95 of 33.4%, 0.8% higher than EfficientDet-D1, with a 74% reduction in both parameters and FLOPs.
arXiv Detail & Related papers (2024-09-03T04:53:32Z) - SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes [1.3812010983144802]
Small Object Detection YOLOv8 (SOD-YOLOv8) is designed for scenarios involving numerous small objects.
SOD-YOLOv8 significantly improves small object detection, surpassing widely used models in various metrics.
In dynamic real-world traffic scenes, SOD-YOLOv8 demonstrated notable improvements in diverse conditions.
arXiv Detail & Related papers (2024-08-08T23:05:25Z) - LeYOLO, New Embedded Architecture for Object Detection [0.0]
We introduce two key contributions to object detection models using MSCOCO as a base validation set. First, we propose LeNeck, a general-purpose detection framework that maintains inference speed comparable to SSDLite. Second, we present LeYOLO, an efficient object detection model designed to enhance computational efficiency in YOLO-based architectures.
arXiv Detail & Related papers (2024-06-20T12:08:24Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Rethinking Mobile Block for Efficient Attention-based Models [60.0312591342016]
This paper focuses on developing modern, efficient, lightweight models for dense predictions while trading off parameters, FLOPs, and performance.
Inverted Residual Block (IRB) serves as the infrastructure for lightweight CNNs, but no counterpart has been recognized by attention-based studies.
We extend the CNN-based IRB to attention-based models and abstract a one-residual Meta Mobile Block (MMB) for lightweight model design.
arXiv Detail & Related papers (2023-01-03T15:11:41Z) - A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z) - FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often occupy a large number of parameters and require heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)