A Real-time Concrete Crack Detection and Segmentation Model Based on YOLOv11
- URL: http://arxiv.org/abs/2508.11517v1
- Date: Fri, 15 Aug 2025 14:57:00 GMT
- Title: A Real-time Concrete Crack Detection and Segmentation Model Based on YOLOv11
- Authors: Shaoze Huang, Qi Liu, Chao Chen, Yuhang Chen,
- Abstract summary: This paper proposes YOLOv11-KW-TA-FP, a multi-task concrete crack detection and segmentation model based on the YOLOv11n architecture. Experimental validation demonstrates that the enhanced model achieves significant performance improvements over the baseline.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accelerated aging of transportation infrastructure in the rapidly developing Yangtze River Delta region necessitates efficient concrete crack detection, as crack deterioration critically compromises structural integrity and regional economic growth. To overcome the limitations of inefficient manual inspection and the suboptimal performance of existing deep learning models, particularly for small-target crack detection within complex backgrounds, this paper proposes YOLOv11-KW-TA-FP, a multi-task concrete crack detection and segmentation model based on the YOLOv11n architecture. The proposed model integrates a three-stage optimization framework: (1) Embedding dynamic KernelWarehouse convolution (KWConv) within the backbone network to enhance feature representation through a dynamic kernel sharing mechanism; (2) Incorporating a triple attention mechanism (TA) into the feature pyramid to strengthen channel-spatial interaction modeling; and (3) Designing an FP-IoU loss function to facilitate adaptive bounding box regression penalization. Experimental validation demonstrates that the enhanced model achieves significant performance improvements over the baseline, attaining 91.3% precision, 76.6% recall, and 86.4% mAP@50. Ablation studies confirm the synergistic efficacy of the proposed modules. Furthermore, robustness tests indicate stable performance under conditions of data scarcity and noise interference. This research delivers an efficient computer vision solution for automated infrastructure inspection, exhibiting substantial practical engineering value.
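The abstract's mAP@50 metric and its IoU-based regression loss both rest on the intersection over union (IoU) of a predicted and a ground-truth box. The following is a minimal sketch of that underlying quantity, assuming boxes in (x1, y1, x2, y2) corner format. The function names `box_iou`, `iou_loss`, and `matches_at_50` are illustrative and not from the paper, and the loss shown is the plain 1 - IoU form, not the authors' FP-IoU, whose definition the abstract does not give.

```python
from typing import Tuple

# Assumption: boxes are (x1, y1, x2, y2) corners with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap width and height, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate (zero-area) inputs give a zero union; report no overlap.
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Plain IoU loss (1 - IoU). This is NOT the paper's FP-IoU loss."""
    return 1.0 - box_iou(pred, target)


def matches_at_50(pred: Box, target: Box) -> bool:
    """True when the prediction overlaps the target at the 0.5 IoU threshold."""
    return box_iou(pred, target) >= 0.5


if __name__ == "__main__":
    pred = (0.0, 0.0, 10.0, 10.0)
    gt = (0.0, 0.0, 10.0, 6.0)
    print(box_iou(pred, gt), iou_loss(pred, gt), matches_at_50(pred, gt))
```

The "@50" in mAP@50 refers to this 0.5 threshold: a detection is matched to a ground-truth box only if their IoU reaches it. A full mAP computation additionally ranks detections by confidence and averages precision over recall levels and classes, which this sketch leaves out.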
Related papers
- Modified TSception for Analyzing Driver Drowsiness and Mental Workload from EEG [6.767263284839525]
Driver drowsiness remains a primary cause of traffic accidents, necessitating the development of real-time, reliable detection systems.
This study presents a Modified TSception architecture designed for the robust assessment of driver fatigue using Electroencephalography (EEG).
The architecture's generalizability is validated on the STEW mental workload dataset.
arXiv Detail & Related papers (2025-12-25T17:48:11Z)
- Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail [85.47497935739936]
Alpamayo-R1 (AR1) is a vision-language-action model that integrates Chain of Causation reasoning with trajectory planning.
We show AR1 achieves 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline.
We plan to release AR1 models and a subset of the CoC in a future update.
arXiv Detail & Related papers (2025-10-30T01:25:34Z)
- Transformer-Based Indirect Structural Health Monitoring of Rail Infrastructure with Attention-Driven Detection and Localization of Transient Defects [1.1782896991259]
We introduce an incremental synthetic data benchmark designed to evaluate model robustness against progressively complex challenges.
We evaluate several established unsupervised models alongside our proposed Attention-Focused Transformer.
Our proposed model achieves accuracy comparable to the state-of-the-art solution while demonstrating better inference speed.
arXiv Detail & Related papers (2025-10-08T23:01:53Z)
- TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making [75.29820290660065]
This paper proposes Thought-Centric Preference Optimization (TCPO) for effective embodied decision-making.
It emphasizes the alignment of the model's intermediate reasoning process, mitigating the problem of model degradation.
Experiments in the ALFWorld environment demonstrate an average success rate of 26.67%, achieving a 6% improvement over RL4VLM.
arXiv Detail & Related papers (2025-09-10T11:16:21Z)
- YOLO11-CR: a Lightweight Convolution-and-Attention Framework for Accurate Fatigue Driving Detection [0.0]
This paper introduces YOLO11-CR, a lightweight and efficient object detection model tailored for real-time fatigue monitoring.
YOLO11-CR introduces two key modules: the Convolution-and-Attention Fusion Module (CAFM) and the Rectangular Module (RCM).
Experiments on the DSM dataset demonstrated that YOLO11-CR achieves a precision of 87.17%, recall of 83.86%, mAP@50 of 88.09%, and mAP@50-95 of 55.93%.
arXiv Detail & Related papers (2025-08-16T07:19:04Z)
- YOLO-APD: Enhancing YOLOv8 for Robust Pedestrian Detection on Complex Road Geometries [0.0]
This paper introduces YOLO-APD, a novel deep learning architecture enhancing the YOLOv8 framework specifically for this challenge.
YOLO-APD achieves state-of-the-art detection accuracy, reaching 77.7% mAP@0.5:0.95 and exceptional pedestrian recall exceeding 96%.
It maintains real-time processing capabilities at 100 FPS, showcasing a superior balance between accuracy and efficiency.
arXiv Detail & Related papers (2025-07-07T18:03:40Z)
- High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data.
Recent approaches address this by introducing additional features along rigid geometric structures.
We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR).
arXiv Detail & Related papers (2025-06-07T16:45:17Z)
- Crack Detection in Infrastructure Using Transfer Learning, Spatial Attention, and Genetic Algorithm Optimization [3.1687473999848836]
Crack detection plays a pivotal role in the maintenance and safety of infrastructure, including roads, bridges, and buildings.
Traditionally, manual inspection has been the norm, but it is labor-intensive, subjective, and hazardous.
This paper introduces an advanced approach for crack detection in infrastructure using deep learning, leveraging transfer learning, spatial attention mechanisms, and genetic algorithm (GA) optimization.
arXiv Detail & Related papers (2024-11-26T06:12:56Z)
- EfficientCrackNet: A Lightweight Model for Crack Segmentation [1.3689715712707347]
Crack detection is crucial for maintaining the structural integrity of buildings, pavements, and bridges.
Existing lightweight methods often face challenges including computational inefficiency, complex crack patterns, and difficult backgrounds.
We propose EfficientCrackNet, a lightweight hybrid model combining Convolutional Neural Networks (CNNs) and transformers for precise crack segmentation.
arXiv Detail & Related papers (2024-09-26T17:44:20Z)
- Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure [52.2025114590481]
We introduce Hybrid-Segmentor, an encoder-decoder based approach that is capable of extracting both fine-grained local and global crack features.
This allows the model to improve its generalization capabilities in distinguishing various types of crack shapes, surfaces, and sizes.
The proposed model outperforms existing benchmark models across 5 quantitative metrics (accuracy 0.971, precision 0.804, recall 0.744, F1-score 0.770, and IoU score 0.630), achieving state-of-the-art status.
arXiv Detail & Related papers (2024-09-04T16:47:16Z)
- YOLO9tr: A Lightweight Model for Pavement Damage Detection Utilizing a Generalized Efficient Layer Aggregation Network and Attention Mechanism [0.0]
This paper proposes YOLO9tr, a novel lightweight object detection model for pavement damage detection.
YOLO9tr is based on the YOLOv9 architecture, incorporating a partial attention block that enhances feature extraction and attention mechanisms.
The model achieves a high frame rate of up to 136 FPS, making it suitable for real-time applications such as video surveillance and automated inspection systems.
arXiv Detail & Related papers (2024-06-17T06:31:43Z)
- Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving [55.93813178692077]
We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms.
We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction.
Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
arXiv Detail & Related papers (2024-05-27T17:59:39Z)
- InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling [66.3072381478251]
Reward hacking, also termed reward overoptimization, remains a critical challenge.
We propose a framework for reward modeling, namely InfoRM, by introducing a variational information bottleneck objective.
We show that InfoRM's overoptimization detection mechanism is not only effective but also robust across a broad range of datasets.
arXiv Detail & Related papers (2024-02-14T17:49:07Z)
- The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)