Related papers: CNN Model & Tuning for Global Road Damage Detection

URL: http://arxiv.org/abs/2103.09512v1
Date: Wed, 17 Mar 2021 09:01:23 GMT
Title: CNN Model & Tuning for Global Road Damage Detection
Authors: Rahul Vishwakarma and Ravigopal Vennelakanti (Hitachi America Ltd. R&D)
Abstract summary: We assess single and multi-stage network architectures for object detection. Data preparation for the provided Road Damage training dataset, captured with smartphone cameras in the Czech Republic, India and Japan, is discussed. We show a mean F1 score of 0.542 on Test2 and 0.536 on Test1 datasets using a multi-stage Faster R-CNN model, with Resnet-50 and Resnet-101 backbones respectively.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper provides a report on our solution, including model selection, tuning strategy and results obtained for the Global Road Damage Detection Challenge. This Big Data Cup Challenge was held as part of the IEEE International Conference on Big Data 2020. We assess single and multi-stage network architectures for object detection and provide a benchmark using popular state-of-the-art open-source PyTorch frameworks like Detectron2 and Yolov5. Data preparation for the provided Road Damage training dataset, captured with smartphone cameras in the Czech Republic, India and Japan, is discussed. We studied the effect of training on a per-country basis with respect to a single generalizable model. We briefly describe the tuning strategy for the experiments conducted on two-stage Faster R-CNN with Deep Residual Network (Resnet) and Feature Pyramid Network (FPN) backbone. Additionally, we compare this to a one-stage Yolov5 model with Cross Stage Partial Network (CSPNet) backbone. We show a mean F1 score of 0.542 on Test2 and 0.536 on Test1 datasets using a multi-stage Faster R-CNN model, with Resnet-50 and Resnet-101 backbones respectively. This shows the generalizability of the Resnet-50 model when compared to its more complex counterparts. Experiments were conducted using Google Colab with a K80 GPU and a Linux PC with a 1080Ti, an NVIDIA consumer-grade GPU. PyTorch-based Detectron2 code to preprocess, train, test and submit results for the Avg F1 score is made available at https://github.com/vishwakarmarhl/rdd2020
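The abstract reports results as an F1 score over detections. As a rough illustration only, the sketch below computes F1 for one image under an assumed matching rule: a prediction counts as a true positive when its class label matches an unmatched ground-truth box and their IoU is at least 0.5. The function names, the greedy matching, the threshold and the example labels are assumptions for illustration and are not taken from the paper or its repository.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, Box]              # (class label, bounding box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_f1(preds: List[Detection], truths: List[Detection],
                 iou_thresh: float = 0.5) -> float:
    """F1 with greedy one-to-one matching (assumed rule, see lead-in)."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for label, box in preds:
        for i, (t_label, t_box) in enumerate(truths):
            if i not in matched and label == t_label and iou(box, t_box) >= iou_thresh:
                matched.add(i)
                tp += 1
                break
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: one correct detection, one false positive, one miss.
example_preds = [("D00", (0, 0, 10, 10)), ("D10", (20, 20, 30, 30))]
example_truths = [("D00", (0, 0, 10, 10)), ("D20", (50, 50, 60, 60))]
example_f1 = detection_f1(example_preds, example_truths)
```

With one true positive, one false positive and one false negative, the example evaluates to an F1 of 0.5. A mean F1 like the one reported would additionally require aggregating over a full test set, which this sketch does not attempt.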

Related papers

Distributionally Robust Classification on a Data Budget [26.69877485937123]
We show that standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples. This is the first result showing (near) state-of-the-art distributional robustness on limited data budgets.
arXiv Detail & Related papers (2023-08-07T15:30:02Z)
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions [95.94629864981091]
This work presents a new large-scale CNN-based foundation model, termed InternImage, which can obtain the gain from increasing parameters and training data like ViTs. The proposed InternImage reduces the strict inductive bias of traditional CNNs and makes it possible to learn stronger and more robust patterns with large-scale parameters from massive data like ViTs.
arXiv Detail & Related papers (2022-11-10T18:59:04Z)
EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
The EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days. We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution. In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on the COCO test-dev set.
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
Machine Learning Models in Stock Market Prediction [0.0]
The paper focuses on predicting the Nifty 50 Index by using 8 Supervised Machine Learning Models. Experiments are based on historical data of Nifty 50 Index of Indian Stock Market from 22nd April, 1996 to 16th April, 2021.
arXiv Detail & Related papers (2022-02-06T10:33:42Z)
Pixel Difference Networks for Efficient Edge Detection [71.03915957914532]
We propose a lightweight yet effective architecture named Pixel Difference Network (PiDiNet) for efficient edge detection. Extensive experiments on BSDS500, NYUD, and Multicue datasets are provided to demonstrate its effectiveness. A faster version of PiDiNet with less than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
arXiv Detail & Related papers (2021-08-16T10:42:59Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
Road Damage Detection and Classification with Detectron2 and Faster R-CNN [0.0]
We evaluate Detectron2's implementation of Faster R-CNN using different base models and configurations. We also experiment with these approaches using the dataset of the Global Road Damage Detection Challenge 2020, a track in the IEEE Big Data 2020 Big Data Cup Challenge.
arXiv Detail & Related papers (2020-10-28T14:53:17Z)
A CNN-LSTM Architecture for Detection of Intracranial Hemorrhage on CT scans [0.3670422696827525]
We propose a novel method that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) mechanism for accurate prediction of intracranial hemorrhage. The CNN plays the role of a slice-wise feature extractor while the LSTM is responsible for linking the features across slices. We validate the method on the recent RSNA Intracranial Hemorrhage Detection challenge and on the CQ500 dataset.
arXiv Detail & Related papers (2020-05-22T04:00:04Z)
Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture. We show consistent improvements in accuracy and learning convergence over the baseline. Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection. Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.