Exploration of an End-to-End Automatic Number-plate Recognition neural
network for Indian datasets
- URL: http://arxiv.org/abs/2207.06657v1
- Date: Thu, 14 Jul 2022 05:05:18 GMT
- Title: Exploration of an End-to-End Automatic Number-plate Recognition neural
network for Indian datasets
- Authors: Sai Sirisha Nadiminti, Pranav Kant Gaur, Abhilash Bhardwaj
- Abstract summary: We release an expanding dataset presently consisting of 1.5k images and a scalable and reproducible procedure for enhancing this dataset towards the development of an ANPR solution for Indian conditions.
We report the hindrances in direct reusability of the model provided by the authors of CCPD because of the extreme diversity in Indian number plates and differences in distribution with respect to the CCPD dataset.
An improvement of 42.86% was observed in license plate (LP) detection after aligning the characteristics of the Indian dataset with those of the Chinese dataset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Indian vehicle number plates vary widely in size, font, script
and shape. Development of Automatic Number Plate Recognition (ANPR) solutions
is therefore challenging, necessitating a diverse dataset to serve as a
collection of examples. However, a comprehensive dataset for the Indian
scenario is missing, thereby hampering progress towards publicly available and
reproducible ANPR solutions. Many countries have invested efforts to develop
comprehensive ANPR datasets, such as the Chinese City Parking Dataset (CCPD)
for China and the Application-oriented License Plate (AOLP) dataset for the US.
In this work, we release an expanding dataset presently consisting of 1.5k
images and a scalable and reproducible procedure for enhancing this dataset
towards the development of an ANPR solution for Indian conditions. We have
leveraged this dataset to explore an End-to-End (E2E) ANPR architecture for the
Indian scenario, which was originally proposed for Chinese vehicle number-plate
recognition based on the CCPD dataset. As we customized the architecture for
our dataset, we came across insights, which we discuss in this paper. We report
the hindrances in direct reusability of the model provided by the authors of
CCPD because of the extreme diversity in Indian number plates and differences
in distribution with respect to the CCPD dataset. An improvement of 42.86% was
observed in license plate (LP) detection after aligning the characteristics of
the Indian dataset with those of the Chinese dataset. In this work, we have
also compared the performance of the E2E number-plate detection model with a
YOLOv5 model, pre-trained on the COCO dataset and fine-tuned on Indian vehicle
images. Given that the number of Indian vehicle images used for fine-tuning the
detection module and YOLOv5 was the same, we concluded that it is more sample
efficient to develop an ANPR solution for Indian conditions based on the COCO
dataset rather than the CCPD dataset.
Related papers
- RedPajama: an Open Dataset for Training Large Language Models [80.74772646989423]
We identify three core data-related challenges that must be addressed to advance open-source language models.
These include (1) transparency in model development, including the data curation process, (2) access to large quantities of high-quality data, and (3) availability of artifacts and metadata for dataset curation and analysis.
We release RedPajama-V1, an open reproduction of the LLaMA training dataset, and RedPajama-V2, a massive web-only dataset consisting of raw, unfiltered text data together with quality signals and metadata.
arXiv Detail & Related papers (2024-11-19T09:35:28Z)
- UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction [93.77809355002591]
We introduce UniTraj, a comprehensive framework that unifies various datasets, models, and evaluation criteria.
We conduct extensive experiments and find that model performance significantly drops when transferred to other datasets.
We provide insights into dataset characteristics to explain these findings.
arXiv Detail & Related papers (2024-03-22T10:36:50Z)
- ANNA: A Deep Learning Based Dataset in Heterogeneous Traffic for Autonomous Vehicles [2.932123507260722]
This study discusses a custom-built dataset that includes some unidentified vehicles in the perspective of Bangladesh.
A dataset validity check was performed by evaluating models using the Intersection Over Union (IOU) metric.
The results demonstrated that the model trained on our custom dataset was more precise and efficient than the models trained on the KITTI or COCO dataset concerning Bangladeshi traffic.
arXiv Detail & Related papers (2024-01-21T01:14:04Z)
- D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning [70.98091101459421]
Coreset selection seeks to select a subset of the training data so as to maximize the performance of models trained on this subset, also referred to as coreset.
We propose a novel pruning algorithm, D2 Pruning, that uses forward and reverse message passing over this dataset graph for coreset selection.
Results show that D2 Pruning improves coreset selection over previous state-of-the-art methods for up to 70% pruning rates.
arXiv Detail & Related papers (2023-10-11T23:01:29Z)
- IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes [79.18349050238413]
Preparation and training of deploy-able deep learning architectures require the models to be suited to different traffic scenarios.
An unstructured and complex driving layout found in several developing countries such as India poses a challenge to these models.
We build a new dataset, IDD-3D, which consists of multi-modal data from multiple cameras and LiDAR sensors with 12k annotated driving LiDAR frames.
arXiv Detail & Related papers (2022-10-23T23:03:17Z)
- A scalable pipeline for COVID-19: the case study of Germany, Czechia and Poland [7.753854979677439]
We have built an operational data store (ODS) using to consolidate datasets from multiple data sources.
The ODS has been built not only to store COVID-19 data from Germany, Czechia, and Poland but also other areas.
The data can then support not only forecasting using a version-controlled ArimaHolt model and other analyses to support decision making, but also risk calculator and apps.
arXiv Detail & Related papers (2022-08-27T05:14:01Z)
- On the data requirements of probing [20.965328323152608]
We present a novel method to estimate the required number of data samples for probing datasets.
Our framework helps to systematically construct probing datasets to diagnose neural NLP models.
arXiv Detail & Related papers (2022-02-25T16:27:06Z)
- Indian Licence Plate Dataset in the wild [0.5156484100374058]
We present a benchmark model that uses semantic segmentation to solve number plate detection.
We propose a two-stage approach in which the first stage is for localizing the plate, and the second stage is to read the text in cropped plate image.
We tested benchmark object detection and semantic segmentation models; for the second stage, we used LPRNet-based OCR.
arXiv Detail & Related papers (2021-11-11T05:04:10Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.