Related papers: Graph Attention Layer Evolves Semantic Segmentation for Road Pothole Detection: A Benchmark and Algorithms

URL: http://arxiv.org/abs/2109.02711v1
Date: Mon, 6 Sep 2021 19:44:50 GMT
Title: Graph Attention Layer Evolves Semantic Segmentation for Road Pothole Detection: A Benchmark and Algorithms
Authors: Rui Fan, Hengli Wang, Yuan Wang, Ming Liu, Ioannis Pitas
Abstract summary: Existing road pothole detection approaches can be classified as computer vision-based or machine learning-based. The latter approaches generally address road pothole detection using convolutional neural networks (CNNs) in an end-to-end manner. We propose a novel CNN layer, referred to as graph attention layer (GAL), which can be easily deployed in any existing CNN to optimize image feature representations for semantic segmentation.
Score: 34.80667966432126
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing road pothole detection approaches can be classified as computer vision-based or machine learning-based. The former approaches typically employ 2-D image analysis/understanding or 3-D point cloud modeling and segmentation algorithms to detect road potholes from vision sensor data. The latter approaches generally address road pothole detection using convolutional neural networks (CNNs) in an end-to-end manner. However, road potholes are not necessarily ubiquitous and it is challenging to prepare a large well-annotated dataset for CNN training. In this regard, while computer vision-based methods were the mainstream research trend in the past decade, machine learning-based methods were rarely discussed. Recently, we published the first stereo vision-based road pothole detection dataset and a novel disparity transformation algorithm, whereby the damaged and undamaged road areas can be highly distinguished. However, there are no benchmarks currently available for state-of-the-art (SoTA) CNNs trained using either disparity images or transformed disparity images. Therefore, in this paper, we first discuss the SoTA CNNs designed for semantic segmentation and evaluate their performance for road pothole detection with extensive experiments. Additionally, inspired by graph neural networks (GNNs), we propose a novel CNN layer, referred to as graph attention layer (GAL), which can be easily deployed in any existing CNN to optimize image feature representations for semantic segmentation. Our experiments compare GAL-DeepLabv3+, our best-performing implementation, with nine SoTA CNNs on three modalities of training data: RGB images, disparity images, and transformed disparity images. The experimental results suggest that our proposed GAL-DeepLabv3+ achieves the best overall pothole detection accuracy on all training data modalities.

Related papers

Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition. Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models. Experimental results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion. In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning. Our method achieves state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network [3.5665681694253903]
Detection of the hand portion has become a challenging task in the computer vision and pattern recognition communities. Deep learning algorithms such as convolutional neural networks (CNNs) have become a very popular choice for classification tasks. In this paper, an ensemble of CNN-based approaches is presented to overcome problems such as high variance during prediction, overfitting, and prediction errors.
arXiv Detail & Related papers (2022-02-25T06:46:58Z)
Segmentation of Roads in Satellite Images using specially modified U-Net CNNs [0.0]
The aim of this paper is to build an image classifier for satellite images of urban scenes that identifies the portions of the images in which a road is located. Unlike conventional computer vision algorithms, convolutional neural networks (CNNs) provide accurate and reliable results on this task.
arXiv Detail & Related papers (2021-09-29T19:08:32Z)
OSLO: On-the-Sphere Learning for Omnidirectional images and its application to 360-degree image compression [59.58879331876508]
We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images. Our proposed on-the-sphere solution leads to a better compression gain that can save 13.7% of the bit rate compared to similar learned models applied to equirectangular images.
arXiv Detail & Related papers (2021-07-19T22:14:30Z)
Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution. We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected-version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
Learning Collision-Free Space Detection from Stereo Images: Homography Matrix Brings Better Data Augmentation [16.99302954185652]
It remains an open challenge to train deep convolutional neural networks (DCNNs) using only a small quantity of training samples. This paper explores an effective training data augmentation approach that can be employed to improve the overall DCNN performance.
arXiv Detail & Related papers (2020-12-14T19:14:35Z)
Towards Dense People Detection with Deep Learning and Depth images [9.376814409561726]
This paper proposes a DNN-based system that detects multiple people from a single depth image. Our neural network processes a depth image and outputs a likelihood map in image coordinates. We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training.
arXiv Detail & Related papers (2020-07-14T16:43:02Z)
DiResNet: Direction-aware Residual Network for Road Extraction in VHR Remote Sensing Images [12.081877372552606]
We present a direction-aware residual network (DiResNet) that includes three main contributions. The proposed method has advantages in both overall accuracy and F1-score.
arXiv Detail & Related papers (2020-05-14T19:33:21Z)
Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [51.17232267143098]
We propose a novel system named Disp R-CNN for 3D object detection from stereo images. We use a statistical shape model to generate dense disparity pseudo-ground-truth without the need for LiDAR point clouds. Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision.
arXiv Detail & Related papers (2020-04-07T17:48:45Z)
Saliency Enhancement using Gradient Domain Edges Merging [65.90255950853674]
We develop a method to merge the edges with the saliency maps to improve the performance of the saliency. This leads to our proposed saliency enhancement using edges (SEE), with an average improvement of at least 3.4 times on the DUT-OMRON dataset. The SEE algorithm is split into 2 parts, SEE-Pre for preprocessing and SEE-Post for postprocessing.
arXiv Detail & Related papers (2020-02-11T14:04:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.