Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of
Multiclass Defect Segmentation
- URL: http://arxiv.org/abs/2312.14053v1
- Date: Thu, 21 Dec 2023 17:23:49 GMT
- Title: Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of
Multiclass Defect Segmentation
- Authors: Rasha Alshawi, Md Tamjidul Hoque, Md Meftahul Ferdaus, Mahdi
Abdelguerfi, Kendall Niles, Ken Prathak, Joe Tom, Jordan Klein, Murtada
Mousa, and Johny Javier Lopez
- Abstract summary: The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation.
DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization.
Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities.
- Score: 1.487252325779766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI
Net), addresses challenges in semantic segmentation, particularly on multiclass
imbalanced datasets with limited samples. DAU-FI Net integrates multiscale
spatial-channel attention mechanisms and feature injection to enhance precision
in object localization. The core employs a multiscale depth-separable
convolution block, capturing localized patterns across scales. This block is
complemented by a spatial-channel squeeze and excitation (scSE) attention unit,
modeling inter-dependencies between channels and spatial regions in feature
maps. Additionally, additive attention gates refine segmentation by connecting
encoder-decoder pathways.
To augment the model, engineered features (Gabor filters for textural
analysis, and Sobel and Canny filters for edge detection) are injected, guided by
semantic masks, to expand the feature space strategically. Comprehensive
experiments on a challenging sewer pipe and culvert defect dataset and a
benchmark dataset validate DAU-FI Net's capabilities. Ablation studies
highlight incremental benefits from attention blocks and feature injection.
DAU-FI Net achieves state-of-the-art mean Intersection over Union (IoU) of
95.6% on the defect test set and 98.8% on the benchmark, surpassing
prior methods by 8.9% and 12.6%, respectively. The proposed
architecture provides a robust solution, advancing semantic segmentation for
multiclass problems with limited training data. Our sewer-culvert defects
dataset, featuring pixel-level annotations, opens avenues for further research
in this crucial domain. Overall, this work delivers key innovations in
architecture, attention, and feature engineering to elevate semantic
segmentation efficacy.
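The headline results above are reported as mean Intersection over Union. As a minimal sketch of how that metric is commonly computed for multiclass label maps, the following pure-Python example averages per-class IoU over the classes that occur in either the prediction or the ground truth. The function name `mean_iou`, the toy label maps, and the choice to skip absent classes are assumptions for illustration; the abstract does not state the paper's exact evaluation protocol (for example, whether background is included or whether averaging is per image or per dataset).

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean of per-class IoU over classes present in pred or target.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel is in both the predicted and true set of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel is in the predicted set of p and the true set of t.
                union[p] += 1
                union[t] += 1
    # Skip classes that never occur (assumed convention; avoids 0/0).
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 3x3 label maps with three classes (made-up values).
pred = [[0, 0, 1],
        [0, 2, 2],
        [1, 1, 2]]
target = [[0, 0, 1],
          [0, 1, 2],
          [1, 1, 2]]

score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU = {score:.4f}")
```

In this toy case one pixel of class 1 is mislabeled as class 2, so classes 1 and 2 each lose some IoU while class 0 stays perfect. On imbalanced data, a rare class with few pixels can shift the mean noticeably, which is one reason mean IoU is preferred over pixel accuracy for this kind of task.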
Related papers
- Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Learning Spatial-Semantic Features for Robust Video Object Segmentation [108.045326229865]
We propose a robust video object segmentation framework equipped with spatial-semantic features and discriminative object queries.
We show that the proposed method sets a new state of the art on multiple datasets.
arXiv Detail & Related papers (2024-07-10T15:36:00Z)
- SNE-RoadSegV2: Advancing Heterogeneous Feature Fusion and Fallibility Awareness for Freespace Detection [29.348921424716057]
This paper presents a novel heterogeneous feature fusion block, comprising a holistic attention module, a heterogeneous feature contrast descriptor, and an affinity-weighted feature recalibrator.
It incorporates both inter-scale and intra-scale skip connections into the decoder architecture while eliminating redundant ones, leading to both improved accuracy and computational efficiency.
It introduces two fallibility-aware loss functions that separately focus on semantic-transition and depth-inconsistent regions, collectively contributing to greater supervision during model training.
arXiv Detail & Related papers (2024-02-29T07:20:02Z)
- Accurate and lightweight dehazing via multi-receptive-field non-local network and novel contrastive regularization [9.90146712189936]
This paper presents a multi-receptive-field non-local network (MRFNLN) for image dehazing.
It is built from a multi-stream feature attention block (MSFAB) and a cross non-local block (CNLB).
It outperforms recent state-of-the-art dehazing methods with fewer than 1.5 million parameters.
arXiv Detail & Related papers (2023-09-28T14:59:16Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- TC-Net: Triple Context Network for Automated Stroke Lesion Segmentation [0.5482532589225552]
We propose a new network, Triple Context Network (TC-Net), with the capture of spatial contextual information as the core.
Our network is evaluated on the open dataset ATLAS, achieving the highest score of 0.594, Hausdorff distance of 27.005 mm, and average symmetry surface distance of 7.137 mm.
arXiv Detail & Related papers (2022-02-28T11:12:16Z)
- An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery [26.362854938949923]
We propose a novel convolutional neural network architecture, named attention-fused network (AFNet).
We achieve state-of-the-art performance with an overall accuracy of 91.7% and a mean F1 score of 90.96% on the ISPRS Vaihingen 2D dataset and the ISPRS Potsdam 2D dataset.
arXiv Detail & Related papers (2021-05-10T06:23:27Z)
- Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images [10.835342317692884]
The accuracy of semantic segmentation in remote sensing images has been increased significantly by deep convolutional neural networks.
This paper proposes a Multi-Attention-Network (MANet) to address these issues.
A novel attention mechanism of kernel attention with linear complexity is proposed to alleviate the large computational demand in attention.
arXiv Detail & Related papers (2020-09-03T09:08:02Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.