RHA-Net: An Encoder-Decoder Network with Residual Blocks and Hybrid
Attention Mechanisms for Pavement Crack Segmentation
- URL: http://arxiv.org/abs/2207.14166v1
- Date: Thu, 28 Jul 2022 15:26:01 GMT
- Title: RHA-Net: An Encoder-Decoder Network with Residual Blocks and Hybrid
Attention Mechanisms for Pavement Crack Segmentation
- Authors: Guijie Zhu, Zhun Fan, Jiacheng Liu, Duan Yuan, Peili Ma, Meihua Wang,
Weihua Sheng, Kelvin C. P. Wang
- Abstract summary: RHA-Net is built by integrating residual blocks (ResBlocks) and hybrid attention blocks into the encoder-decoder architecture.
The developed system can segment pavement cracks in real time on an embedded Jetson TX2 device (25 FPS).
- Score: 7.972704288200679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The acquisition and evaluation of pavement surface data play an essential
role in pavement condition evaluation. In this paper, an efficient and
effective end-to-end network for automatic pavement crack segmentation, called
RHA-Net, is proposed to improve the pavement crack segmentation accuracy. The
RHA-Net is built by integrating residual blocks (ResBlocks) and hybrid
attention blocks into the encoder-decoder architecture. The ResBlocks are used
to improve the ability of RHA-Net to extract high-level abstract features. The
hybrid attention blocks are designed to fuse both low-level features and
high-level features to help the model focus on correct channels and areas of
cracks, thereby improving the feature presentation ability of RHA-Net. An image
data set containing 789 pavement crack images collected by a self-designed
mobile robot is constructed and used for training and evaluating the proposed
model. Compared with other state-of-the-art networks, the proposed model
achieves better performance and the functionalities of adding residual blocks
and hybrid attention mechanisms are validated in a comprehensive ablation
study. Additionally, a lightweight version of the model, generated by
introducing depthwise separable convolution, achieves better performance and a
much faster processing speed with 1/30 of the number of U-Net parameters. The
developed system can segment pavement cracks in real time on an embedded device,
the Jetson TX2 (25 FPS). The video taken in real-time experiments is released at
https://youtu.be/3XIogk0fiG4.
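
The abstract attributes the smaller model to depthwise separable convolution. As a minimal sketch of why that substitution cuts parameters, the snippet below compares weight counts for one standard convolution and its depthwise separable counterpart. The layer size (64 input channels, 128 output channels, 3x3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper, and the snippet does not reproduce the reported 1/30 ratio against U-Net, which depends on the whole architecture.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, used only for illustration (not from the paper).
standard = conv_params(64, 128)
separable = depthwise_separable_params(64, 128)
print(f"standard: {standard}, separable: {separable}, "
      f"reduction: {standard / separable:.2f}x")
```

The per-layer saving grows with the number of output channels and the kernel size, so the overall reduction for a full network depends on its layer configuration.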
Related papers
- Any Image Restoration via Efficient Spatial-Frequency Degradation Adaptation [158.37640586809187]
Restoring any degraded image efficiently via just one model has become increasingly significant.
Our approach, termed AnyIR, takes a unified path that leverages inherent similarity across various degradations.
To fuse the degradation awareness and the contextualized attention, a spatial-frequency parallel fusion strategy is proposed.
arXiv Detail & Related papers (2025-04-19T09:54:46Z)
- Low-Level Matters: An Efficient Hybrid Architecture for Robust Multi-frame Infrared Small Target Detection [5.048364655933007]
Multi-frame infrared small target detection plays a crucial role in low-altitude and maritime surveillance.
The hybrid architecture combining CNNs and Transformers shows great promise for enhancing multi-frame IRSTD.
We propose LVNet, a simple yet powerful hybrid architecture that redefines low-level feature learning in hybrid frameworks.
arXiv Detail & Related papers (2025-03-04T02:53:25Z)
- Context-CrackNet: A Context-Aware Framework for Precise Segmentation of Tiny Cracks in Pavement images [3.9599054392856483]
This study proposes Context-CrackNet, a novel encoder-decoder architecture featuring the Region-Focused Enhancement Module (RFEM) and Context-Aware Global Module (CAGM).
The model consistently outperformed 9 state-of-the-art segmentation frameworks, achieving superior performance metrics such as mIoU and Dice score.
The model's balance of precision and computational efficiency highlights its potential for real-time deployment in large-scale pavement monitoring systems.
arXiv Detail & Related papers (2025-01-24T11:28:17Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure [52.2025114590481]
We introduce Hybrid-Segmentor, an encoder-decoder based approach that is capable of extracting both fine-grained local and global crack features.
This allows the model to improve its generalization capabilities in distinguishing various types of crack shapes, surfaces and sizes.
The proposed model outperforms existing benchmark models across 5 quantitative metrics (accuracy 0.971, precision 0.804, recall 0.744, F1-score 0.770, and IoU score 0.630), achieving state-of-the-art status.
arXiv Detail & Related papers (2024-09-04T16:47:16Z)
- Any Image Restoration with Efficient Automatic Degradation Adaptation [132.81912195537433]
We propose a unified manner to achieve joint embedding by leveraging the inherent similarities across various degradations for efficient and comprehensive restoration.
Our network sets new SOTA records while reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs.
arXiv Detail & Related papers (2024-07-18T10:26:53Z)
- HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution [6.7341750484636975]
Transformer-based networks can only use input information from a limited spatial range.
A novel Hybrid Multi-Axis Aggregation network (HMA) is proposed in this paper to exploit feature potential information better.
The experimental results show that HMA outperforms the state-of-the-art methods on the benchmark dataset.
arXiv Detail & Related papers (2024-05-08T12:14:34Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
- DANet: Enhancing Small Object Detection through an Efficient Deformable Attention Network [0.0]
We propose a comprehensive strategy by synergizing Faster R-CNN with cutting-edge methods.
By combining Faster R-CNN with Feature Pyramid Network, we enable the model to handle multi-scale features intrinsic to manufacturing environments.
Deformable Net is used that contorts and conforms to the geometric variations of defects, bringing precision in detecting even the minuscule and complex features.
arXiv Detail & Related papers (2023-10-09T14:54:37Z)
- Real-time High-Resolution Neural Network with Semantic Guidance for Crack Segmentation [4.651261550392625]
This paper describes HrSegNet, a high-resolution network with semantic guidance specifically designed for crack segmentation.
HrSegNet guarantees real-time inference speed while preserving crack details.
This approach demonstrates that there is a trade-off between high-resolution modeling and real-time detection.
arXiv Detail & Related papers (2023-07-01T08:38:18Z)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable to prune efficient networks adaptively for various classification subtasks, enhancing handy deployment and usage of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
- Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring [39.63844562890704]
Real-time deblurring still remains a challenging task due to the complexity of spatially and temporally varying blur itself.
We adopt residual dense blocks into RNN cells, so as to efficiently extract the spatial features of the current frame.
We contribute a novel dataset (BSD) to the community, by collecting paired/sharp video clips using a co-axis beam splitter acquisition system.
arXiv Detail & Related papers (2021-06-30T12:53:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.