CNN Model & Tuning for Global Road Damage Detection
- URL: http://arxiv.org/abs/2103.09512v1
- Date: Wed, 17 Mar 2021 09:01:23 GMT
- Title: CNN Model & Tuning for Global Road Damage Detection
- Authors: Rahul Vishwakarma and Ravigopal Vennelakanti (Hitachi America Ltd. R&D)
- Abstract summary: We assess single and multi-stage network architectures for object detection.
Data preparation for the provided Road Damage training dataset, captured using smartphone cameras in the Czech Republic, India and Japan, is discussed.
We show a mean F1 score of 0.542 on Test2 and 0.536 on Test1 datasets using a multi-stage Faster R-CNN model, with Resnet-50 and Resnet-101 backbones respectively.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper provides a report on our solution including model selection,
tuning strategy and results obtained for the Global Road Damage Detection
Challenge. This Big Data Cup Challenge was held as part of the IEEE International
Conference on Big Data 2020. We assess single and multi-stage network
architectures for object detection and provide a benchmark using popular
state-of-the-art open-source PyTorch frameworks like Detectron2 and Yolov5.
Data preparation for the provided Road Damage training dataset, captured using
smartphone cameras in the Czech Republic, India and Japan, is discussed. We studied the
effect of training on a per country basis with respect to a single
generalizable model. We briefly describe the tuning strategy for the
experiments conducted on two-stage Faster R-CNN with Deep Residual Network
(Resnet) and Feature Pyramid Network (FPN) backbone. Additionally, we compare
this to a one-stage Yolov5 model with Cross Stage Partial Network (CSPNet)
backbone. We show a mean F1 score of 0.542 on Test2 and 0.536 on Test1 datasets
using a multi-stage Faster R-CNN model, with Resnet-50 and Resnet-101 backbones
respectively. This shows the generalizability of the Resnet-50 model when
compared to its more complex counterparts. Experiments were conducted using
Google Colab with a K80 GPU and a Linux PC with a 1080Ti, an NVIDIA consumer-grade GPU.
PyTorch-based Detectron2 code to preprocess, train, test and submit results for the Avg
F1 score is made available at https://github.com/vishwakarmarhl/rdd2020
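The F1 metric quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code or the challenge's official scorer. It assumes a prediction counts as a true positive when its class label matches a still-unmatched ground-truth box and their IoU is at least 0.5. The function names, the greedy matching order, and the sample labels and boxes are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, Box]              # (class label, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(preds: List[Detection], truths: List[Detection],
             iou_threshold: float = 0.5) -> float:
    """F1 from greedy one-to-one matching of predictions to ground truth.

    Assumption: preds are already ordered by descending confidence.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p_label, p_box in preds:
        best_j, best_iou = -1, iou_threshold
        for j, (t_label, t_box) in enumerate(truths):
            if j in matched or t_label != p_label:
                continue
            overlap = iou(p_box, t_box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no valid match
    fn = len(truths) - tp  # ground-truth boxes never matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical sample: one correct detection, one wrong label, one misplaced box.
truths = [("D00", (0, 0, 10, 10)), ("D40", (20, 20, 30, 30))]
preds = [("D00", (1, 1, 10, 10)), ("D20", (20, 20, 30, 30)), ("D40", (50, 50, 60, 60))]
print(f1_score(preds, truths))  # 1 TP, 2 FP, 1 FN
```

In this sample, precision is 1/3 and recall is 1/2, so F1 = 2·TP / (2·TP + FP + FN) = 2/5. A per-dataset score such as those reported above would aggregate the counts over all test images before applying the same formula.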
Related papers
- Distributionally Robust Classification on a Data Budget [26.69877485937123]
We show that standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples.
This is the first result showing (near) state-of-the-art distributional robustness on limited data budgets.
arXiv Detail & Related papers (2023-08-07T15:30:02Z)
- InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions [95.94629864981091]
This work presents a new large-scale CNN-based foundation model, termed InternImage, which can obtain the gain from increasing parameters and training data like ViTs.
The proposed InternImage reduces the strict inductive bias of traditional CNNs and makes it possible to learn stronger and more robust patterns with large-scale parameters from massive data like ViTs.
arXiv Detail & Related papers (2022-11-10T18:59:04Z)
- EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
- Machine Learning Models in Stock Market Prediction [0.0]
The paper focuses on predicting the Nifty 50 Index by using 8 Supervised Machine Learning Models.
Experiments are based on historical data of Nifty 50 Index of Indian Stock Market from 22nd April, 1996 to 16th April, 2021.
arXiv Detail & Related papers (2022-02-06T10:33:42Z)
- Pixel Difference Networks for Efficient Edge Detection [71.03915957914532]
We propose a lightweight yet effective architecture named Pixel Difference Network (PiDiNet) for efficient edge detection.
Extensive experiments on BSDS500, NYUD, and Multicue datasets are provided to demonstrate its effectiveness.
A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
arXiv Detail & Related papers (2021-08-16T10:42:59Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Road Damage Detection and Classification with Detectron2 and Faster R-CNN [0.0]
We evaluate Detectron2's implementation of Faster R-CNN using different base models and configurations.
We also experiment with these approaches using the dataset from the Global Road Damage Detection Challenge 2020, a track in the IEEE Big Data 2020 Big Data Cup Challenge.
arXiv Detail & Related papers (2020-10-28T14:53:17Z)
- A CNN-LSTM Architecture for Detection of Intracranial Hemorrhage on CT scans [0.3670422696827525]
We propose a novel method that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) mechanism for accurate prediction of intracranial hemorrhage.
The CNN plays the role of a slice-wise feature extractor while the LSTM is responsible for linking the features across slices.
We validate the method on the recent RSNA Intracranial Hemorrhage Detection challenge and on the CQ500 dataset.
arXiv Detail & Related papers (2020-05-22T04:00:04Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.