Text recognition on images using pre-trained CNN
- URL: http://arxiv.org/abs/2302.05105v1
- Date: Fri, 10 Feb 2023 08:09:51 GMT
- Title: Text recognition on images using pre-trained CNN
- Authors: Afgani Fajar Rizky, Novanto Yudistira, Edy Santoso
- Abstract summary: The recognition model is trained on the Chars74K dataset, and the best model is then tested on some samples of the IIIT-5K-Dataset.
The research model has an accuracy of 97.94% for validation data, 98.16% for test data, and 95.62% for the test data from IIIT-5K-Dataset.
- Score: 2.191505742658975
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Text in an image often stores important information and directly carries
high-level semantics, which makes it an important source of information and a
very active research topic. Many studies have shown that CNN-based
neural networks are effective and accurate for image classification, which
is the basis of text recognition. Accuracy can be improved further by using
transfer learning, with a model pre-trained on the ImageNet dataset providing
the initial weights. In this research, the recognition model is trained on the
Chars74K dataset, and the best model is then tested on some samples of the
IIIT-5K-Dataset. The results show that the best accuracy is achieved by the
model trained using the VGG-16 architecture with image transformations
of 15° rotation, an image scale of 0.9, and a Gaussian blur
effect. This model has an accuracy of 97.94% on validation data,
98.16% on test data, and 95.62% on the test data from IIIT-5K-Dataset. Based
on these results, it can be concluded that a pre-trained CNN can produce good
accuracy for text recognition, and the model architecture used in this
study can serve as reference material in the development of text detection
systems in the future.
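The augmentation settings named in the abstract (a 15° rotation, a scale of 0.9, and a Gaussian blur) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names, the 64x64 image size, the blur `sigma` and `radius`, the rotation sign convention, and the edge-clamping rule are all assumptions made for illustration, since the abstract does not state them.

```python
import math


def affine_matrix(angle_deg, scale, center):
    """Return a 2x3 matrix that rotates by angle_deg and scales about center.

    Assumption: positive angles rotate counter-clockwise in a y-up frame;
    the abstract does not say which convention the paper used.
    """
    cx, cy = center
    theta = math.radians(angle_deg)
    a = scale * math.cos(theta)
    b = scale * math.sin(theta)
    # The translation terms are chosen so that the centre maps to itself.
    return [
        [a, -b, cx - a * cx + b * cy],
        [b, a, cy - b * cx - a * cy],
    ]


def apply_affine(matrix, point):
    """Map one (x, y) coordinate through the 2x3 affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


def gaussian_kernel_1d(sigma, radius):
    """Return normalised 1D Gaussian weights of length 2 * radius + 1."""
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Convolve one row of pixel values with the kernel, clamping at the edges."""
    radius = len(kernel) // 2
    last = len(row) - 1
    blurred = []
    for i in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index to the row
            acc += weight * row[j]
        blurred.append(acc)
    return blurred


# Hypothetical 64x64 image: rotate 15 degrees and scale by 0.9 about the centre.
matrix = affine_matrix(15.0, 0.9, (32.0, 32.0))
corner = apply_affine(matrix, (0.0, 0.0))

# Hypothetical blur settings (sigma=1.0, radius=2); not taken from the paper.
kernel = gaussian_kernel_1d(1.0, 2)
smoothed = blur_row([0.0, 0.0, 1.0, 0.0, 0.0], kernel)
```

A separable Gaussian blur applies `blur_row` to every row and then to every column. In practice an image library would perform the resampling and filtering; the sketch only shows the arithmetic behind the three transformations.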
Related papers
- NCT-CRC-HE: Not All Histopathological Datasets Are Equally Useful [15.10324445908774]
In this paper, we analyze a popular NCT-CRC-HE-100K colorectal cancer dataset used in numerous prior works.
We show that both this dataset and the obtained results may be affected by data-specific biases.
We show that even the simplest model using only 3 features per image can demonstrate over 50% accuracy on this 9-class dataset.
arXiv Detail & Related papers (2024-09-17T20:36:03Z)
- T-ADAF: Adaptive Data Augmentation Framework for Image Classification Network based on Tensor T-product Operator [0.0]
This paper proposes an Adaptive Data Augmentation Framework based on the tensor T-product Operator.
It triples each training image and combines the results from all three images, with less than a 0.1% increase in the number of parameters.
Numerical experiments show that our data augmentation framework can improve the performance of original neural network model by 2%.
arXiv Detail & Related papers (2023-06-07T08:30:44Z)
- Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement [68.44100784364987]
We propose a strategy to improve a dataset once such that the accuracy of any model architecture trained on the reinforced dataset is improved at no additional training cost for users.
We create a reinforced version of the ImageNet training dataset, called ImageNet+, as well as reinforced datasets CIFAR-100+, Flowers-102+, and Food-101+.
Models trained with ImageNet+ are more accurate, robust, and calibrated, and transfer well to downstream tasks.
arXiv Detail & Related papers (2023-03-15T23:10:17Z)
- DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z)
- CoV-TI-Net: Transferred Initialization with Modified End Layer for COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with fewer computations.
In this research, the PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset.
The proposed model is developed and verified in the Kaggle notebook, and it reached the outstanding accuracy of 99.77% without taking a huge computational time.
arXiv Detail & Related papers (2022-09-20T08:52:52Z)
- Portuguese Man-of-War Image Classification with Convolutional Neural Networks [58.720142291102135]
Portuguese man-of-war (PMW) is a gelatinous organism with long tentacles capable of causing severe burns.
This paper reports on the use of convolutional neural networks for recognizing PMW images from the Instagram social media platform.
arXiv Detail & Related papers (2022-07-04T03:06:45Z)
- Classification of EEG Motor Imagery Using Deep Learning for Brain-Computer Interface Systems [79.58173794910631]
A trained T1 class Convolutional Neural Network (CNN) model will be used to examine its ability to successfully identify motor imagery.
In theory, and if the model has been trained accurately, it should be able to identify a class and label it accordingly.
The CNN model will then be restored and used to try and identify the same class of motor imagery data using much smaller sampled data.
arXiv Detail & Related papers (2022-05-31T17:09:46Z)
- Deep Learning Based Classification System For Recognizing Local Spinach [0.0]
A Deep learning method has been used that can automatically identify spinach.
Four Convolutional Neural Network (CNN) models were used to classify our spinach.
Among those models, VGG16 achieved the highest accuracy of 99.79%.
arXiv Detail & Related papers (2022-01-06T15:10:41Z)
- Efficient sign language recognition system and dataset creation method based on deep learning and image processing [0.0]
This work investigates techniques of digital image processing and machine learning that can be used to create a sign language dataset effectively.
Different datasets were created to test the hypotheses, containing 14 words used daily and recorded by different smartphones in the RGB color system.
We achieved an accuracy of 96.38% on the test set and 81.36% on the validation set containing more challenging conditions.
arXiv Detail & Related papers (2021-03-22T23:36:49Z)
- Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset.
We develop an algorithm for shape-texture debiased learning.
Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z)
- Radioactive data: tracing through training [130.2266320167683]
We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark.
Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value).
Our method is robust to data augmentation and the stochasticity of deep network optimization.
arXiv Detail & Related papers (2020-02-03T18:41:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.