Evaluation and Comparison of Deep Learning Methods for Pavement Crack
Identification with Visual Images
- URL: http://arxiv.org/abs/2112.10390v1
- Date: Mon, 20 Dec 2021 08:23:43 GMT
- Title: Evaluation and Comparison of Deep Learning Methods for Pavement Crack
Identification with Visual Images
- Authors: Kai-Liang Lu
- Abstract summary: pavement crack identification with visual images via deep learning algorithms has the advantages of not being limited by the material of object to be detected.
In the aspect of patch sample classification, the fine-tuned TL models can be equivalent to or even slightly better than the ED models in accuracy.
In the aspect of accurate crack location, both ED and GAN algorithms can achieve pixel-level segmentation and is expected to be detected in real time on low computing power platform.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Compared with contact detection techniques, pavement crack identification
with visual images via deep learning algorithms has the advantages of not being
limited by the material of object to be detected, fast speed and low cost. The
fundamental frameworks and typical model architectures of transfer learning
(TL), encoder-decoder (ED), generative adversarial networks (GAN), and their
common modules were first reviewed, and then the evolution of convolutional
neural network (CNN) backbone models and GAN models were summarized. The crack
classification, segmentation performance, and effect were tested on the
SDNET2018 and CFD public data sets. In the aspect of patch sample
classification, the fine-tuned TL models can be equivalent to or even slightly
better than the ED models in accuracy, and the predicting time is faster; In
the aspect of accurate crack location, both ED and GAN algorithms can achieve
pixel-level segmentation and is expected to be detected in real time on low
computing power platform. Furthermore, a weakly supervised learning framework
of combined TL-SSGAN and its performance enhancement measures are proposed,
which can maintain comparable crack identification performance with that of the
supervised learning, while greatly reducing the number of labeled samples
required.
Related papers
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in-temporal skeletal data.
It employs the frame attention mechanism to identify the most significant frames and the skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and flops.
arXiv Detail & Related papers (2024-06-05T06:18:03Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to $17.0%$ AUROC improvement over state-of-the-arts and it could serve as simple yet strong baselines in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Hybridization of Capsule and LSTM Networks for unsupervised anomaly
detection on multivariate data [0.0]
This paper introduces a novel NN architecture which hybridises the Long-Short-Term-Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the issues with finding large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z) - Cross-modal Knowledge Distillation for Vision-to-Sensor Action
Recognition [12.682984063354748]
This study introduces an end-to-end Vision-to-Sensor Knowledge Distillation (VSKD) framework.
In this VSKD framework, only time-series data, i.e., accelerometer data, is needed from wearable devices during the testing phase.
This framework will not only reduce the computational demands on edge devices, but also produce a learning model that closely matches the performance of the computational expensive multi-modal approach.
arXiv Detail & Related papers (2021-10-08T15:06:38Z) - Robust Pooling through the Data Mode [5.7564383437854625]
This paper proposes a novel deep learning solution that includes a novel robust pooling layer.
The proposed pooling layer looks for data a mode/cluster using two methods, RANSAC, and histogram, as clusters are indicative of models.
We tested the pooling layer into frameworks such as Point-based and graph-based neural networks, and the tests showed enhanced robustness as compared to robust state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T04:35:24Z) - Automatic Visual Inspection of Rare Defects: A Framework based on
GP-WGAN and Enhanced Faster R-CNN [0.0]
This paper proposes a two-staged fault diagnosis framework for Automatic Visual Inspection (AVI) systems.
In the first stage, a generation model is designed to synthesize new samples based on real samples.
The proposed augmentation algorithm extracts objects from the real samples and blends them randomly, to generate new samples and enhance the performance of the image processor.
arXiv Detail & Related papers (2021-05-02T11:34:59Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - Sparse Signal Models for Data Augmentation in Deep Learning ATR [0.8999056386710496]
We propose a data augmentation approach to incorporate domain knowledge and improve the generalization power of a data-intensive learning algorithm.
We exploit the sparsity of the scattering centers in the spatial domain and the smoothly-varying structure of the scattering coefficients in the azimuthal domain to solve the ill-posed problem of over-parametrized model fitting.
arXiv Detail & Related papers (2020-12-16T21:46:33Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.