Multi-Scale Feature Fusion: Learning Better Semantic Segmentation for
Road Pothole Detection
- URL: http://arxiv.org/abs/2112.13082v1
- Date: Fri, 24 Dec 2021 15:07:47 GMT
- Title: Multi-Scale Feature Fusion: Learning Better Semantic Segmentation for
Road Pothole Detection
- Authors: Jiahe Fan, Mohammud J. Bocus, Brett Hosking, Rigen Wu, Yanan Liu,
Sergey Vityazev, Rui Fan
- Abstract summary: This paper presents a novel pothole detection approach based on single-modal semantic segmentation.
It first extracts visual features from input images using a convolutional neural network.
A channel attention module then reweighs the channel features to enhance the consistency of different feature maps.
- Score: 9.356003255288417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel pothole detection approach based on single-modal
semantic segmentation. It first extracts visual features from input images
using a convolutional neural network. A channel attention module then reweighs
the channel features to enhance the consistency of different feature maps.
Subsequently, we employ an atrous spatial pyramid pooling module (comprising
atrous convolutions in series, with progressive rates of dilation) to integrate
the spatial context information. This helps better distinguish between potholes
and undamaged road areas. Finally, the feature maps in the adjacent layers are
fused using our proposed multi-scale feature fusion module. This further
reduces the semantic gap between different feature channel layers. Extensive
experiments were carried out on the Pothole-600 dataset to demonstrate the
effectiveness of our proposed method. The quantitative comparisons suggest that
our method achieves state-of-the-art (SoTA) performance on both RGB images
and transformed disparity images, outperforming three SoTA single-modal
semantic segmentation networks.
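The two mechanisms named in the abstract, channel-wise reweighting and atrous (dilated) convolution, can be sketched in minimal pure Python. This is an illustrative toy under stated assumptions, not the authors' implementation: the function names, the sigmoid gate on the pooled channel statistic (squeeze-and-excitation style), and the 1-D convolution are simplifications for clarity.

```python
import math

def channel_attention(features):
    """Reweight each channel map by a sigmoid gate on its global average
    (squeeze-and-excitation-style sketch; names and gating are assumptions)."""
    # features: list of C channel maps, each an H x W list of lists
    out = []
    for ch in features:
        pooled = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))  # global average pool
        gate = 1.0 / (1.0 + math.exp(-pooled))                          # sigmoid gate in (0, 1)
        out.append([[gate * v for v in row] for row in ch])
    return out

def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution: kernel taps are sampled `rate`
    apart, enlarging the receptive field without extra parameters."""
    span = (len(kernel) - 1) * rate + 1  # effective receptive field
    return [
        sum(kernel[j] * signal[i + j * rate] for j in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]

# A 3-tap kernel at dilation rate 2 spans 5 input samples.
smoothed = atrous_conv1d([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                         [1.0, 1.0, 1.0], rate=2)
# -> [6.0, 9.0, 12.0, 15.0]
```

Stacking several such convolutions with progressively larger rates (as in an atrous spatial pyramid pooling module) aggregates context at multiple scales from the same feature map.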
Related papers
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers [5.177947445379688]
We propose a new segmentation model that combines convolutional neural networks with deep transformers.
Our results demonstrate that the proposed methodology improves segmentation accuracy compared to state-of-the-art techniques.
arXiv Detail & Related papers (2022-06-20T12:03:54Z)
- FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation [30.736361776703568]
Scene understanding based on LiDAR point cloud is an essential task for autonomous cars to drive safely.
Most existing methods simply stack different point attributes/modalities as image channels to increase information capacity.
We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation.
arXiv Detail & Related papers (2021-03-01T04:08:28Z)
- Bidirectional Multi-scale Attention Networks for Semantic Segmentation of Oblique UAV Imagery [30.524771772192757]
We propose the novel bidirectional multi-scale attention networks, which fuse features from multiple scales bidirectionally for more adaptive and effective feature extraction.
Our model achieved the state-of-the-art (SOTA) result with a mean intersection over union (mIoU) score of 70.80%.
arXiv Detail & Related papers (2021-02-05T11:02:15Z)
- Dual Attention GANs for Semantic Image Synthesis [101.36015877815537]
We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images.
We also propose two novel modules, i.e., a position-wise Spatial Attention Module (SAM) and a scale-wise Channel Attention Module (CAM).
DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
arXiv Detail & Related papers (2020-08-29T17:49:01Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Deep Convolutional Neural Network for Identifying Seam-Carving Forgery [10.324492319976798]
We propose a convolutional neural network (CNN)-based approach to classifying images manipulated by seam carving for reduction and expansion.
Our work exhibits state-of-the-art performance in terms of three-class classification (original, seam inserted, and seam removed).
arXiv Detail & Related papers (2020-07-05T17:20:51Z)
- Adaptive feature recombination and recalibration for semantic segmentation with Fully Convolutional Networks [57.64866581615309]
We propose recombination of features and a spatially adaptive recalibration block that is adapted for semantic segmentation with Fully Convolutional Networks.
Results indicate that Recombination and Recalibration improve the results of a competitive baseline, and generalize across three different problems.
arXiv Detail & Related papers (2020-06-19T15:45:03Z)
- Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification [58.20132466198622]
We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor.
Our proposed method is simple yet effective, easy to implement and can boost the baseline significantly.
arXiv Detail & Related papers (2020-03-29T15:01:05Z)
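Several of the entries above rely on attention-guided selection; the Attentive CutMix summary, for instance, describes choosing the most descriptive regions from intermediate attention maps. A hedged, purely illustrative sketch of that idea follows: the cell granularity, function names, and direct pasting are assumptions for clarity, not the authors' code.

```python
def top_k_cells(attention, k):
    """Return (row, col) indices of the k highest-scoring cells in a 2-D grid."""
    cells = [(r, c) for r in range(len(attention)) for c in range(len(attention[0]))]
    cells.sort(key=lambda rc: attention[rc[0]][rc[1]], reverse=True)
    return cells[:k]

def attentive_cutmix(target, source, attention, k):
    """Paste the k most attended cells of `source` into a copy of `target`."""
    mixed = [row[:] for row in target]
    for r, c in top_k_cells(attention, k):
        mixed[r][c] = source[r][c]
    return mixed

# The two most attended cells of `source` replace the corresponding cells of `target`.
mixed = attentive_cutmix(target=[[0, 0], [0, 0]],
                         source=[[1, 1], [1, 1]],
                         attention=[[0.9, 0.1], [0.2, 0.8]], k=2)
# -> [[1, 0], [0, 1]]
```

In the actual method the attention map comes from a pretrained feature extractor and the mixed labels are interpolated accordingly; this sketch only shows the region-selection step.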
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.