Related papers: Text recognition on images using pre-trained CNN

URL: http://arxiv.org/abs/2302.05105v1
Date: Fri, 10 Feb 2023 08:09:51 GMT
Title: Text recognition on images using pre-trained CNN
Authors: Afgani Fajar Rizky, Novanto Yudistira, Edy Santoso
Abstract summary: The recognition model is trained on the Chars74K dataset, and the best model is then tested on some samples of the IIIT-5K-Dataset. The model achieves an accuracy of 97.94% on validation data, 98.16% on test data, and 95.62% on the test data from the IIIT-5K-Dataset.
Score: 2.191505742658975
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Text in an image often stores important information and directly carries high-level semantics, which makes it an important source of information and a very active research topic. Many studies have shown that CNN-based neural networks are effective and accurate for image classification, which is the basis of text recognition. Accuracy can be further improved by transfer learning, using a model pre-trained on the ImageNet dataset as the initial weights. In this research, the recognition model is trained on the Chars74K dataset, and the best model is then tested on some samples of the IIIT-5K-Dataset. The results show that the best accuracy is obtained by the model trained with the VGG-16 architecture and image transformations consisting of a 15° rotation, an image scale of 0.9, and a Gaussian blur effect. The model achieves an accuracy of 97.94% on validation data, 98.16% on test data, and 95.62% on the test data from the IIIT-5K-Dataset. Based on these results, it can be concluded that a pre-trained CNN can produce good accuracy for text recognition, and that the model architecture used in this study can serve as reference material for the future development of text detection systems.
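The image transformations named in the abstract (a 15° rotation, a scale of 0.9, and a Gaussian blur) can be illustrated with a small standard-library sketch. This is not the authors' pipeline: the function names, the choice of rotating about the image centre, and the blur radius and sigma are assumptions made only for illustration, since the abstract does not state them.

```python
import math


def affine_matrix(width, height, angle_deg=15.0, scale=0.9):
    """Return a 2x3 matrix that rotates by angle_deg and scales by `scale`
    about the image centre (hypothetical helper, not from the paper)."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(angle_deg)
    a = scale * math.cos(theta)
    b = scale * math.sin(theta)
    # x' = a*(x - cx) - b*(y - cy) + cx
    # y' = b*(x - cx) + a*(y - cy) + cy
    return [
        [a, -b, cx - a * cx + b * cy],
        [b, a, cy - b * cx - a * cy],
    ]


def apply_affine(m, x, y):
    """Map a single pixel coordinate (x, y) through the 2x3 matrix m."""
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


def gaussian_kernel_1d(radius=2, sigma=1.0):
    """Normalised 1D Gaussian weights; radius and sigma are assumed values."""
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    m = affine_matrix(64, 64)
    print(apply_affine(m, 0, 0))
    print(gaussian_kernel_1d())
```

A Gaussian blur is separable, so applying the 1D kernel along rows and then along columns is equivalent to a 2D Gaussian blur. In practice an image library would perform both steps; the sketch only shows the arithmetic behind the three transformations.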

Related papers

Optical Character Recognition using Convolutional Neural Networks for Ashokan Brahmi Inscriptions [0.13194391758295113]
The study mainly focuses on three pre-trained CNNs, namely LeNet, VGG-16, and MobileNet. The findings reveal that MobileNet outperforms the other two models in terms of accuracy, achieving a validation accuracy of 95.94% and validation loss of 0.129.
arXiv Detail & Related papers (2024-12-29T09:56:03Z)
NCT-CRC-HE: Not All Histopathological Datasets Are Equally Useful [15.10324445908774]
In this paper, we analyze a popular NCT-CRC-HE-100K colorectal cancer dataset used in numerous prior works. We show that both this dataset and the obtained results may be affected by data-specific biases. We show that even the simplest model using only 3 features per image can demonstrate over 50% accuracy on this 9-class dataset.
arXiv Detail & Related papers (2024-09-17T20:36:03Z)
T-ADAF: Adaptive Data Augmentation Framework for Image Classification Network based on Tensor T-product Operator [0.0]
This paper proposes an Adaptive Data Augmentation Framework based on the tensor T-product operator. It triples each training image and combines the results from all three images, with less than a 0.1% increase in the number of parameters. Numerical experiments show that the data augmentation framework can improve the performance of the original neural network model by 2%.
arXiv Detail & Related papers (2023-06-07T08:30:44Z)
Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement [68.44100784364987]
We propose a strategy to improve a dataset once such that the accuracy of any model architecture trained on the reinforced dataset is improved at no additional training cost for users. We create a reinforced version of the ImageNet training dataset, called ImageNet+, as well as reinforced datasets CIFAR-100+, Flowers-102+, and Food-101+. Models trained with ImageNet+ are more accurate, robust, and calibrated, and transfer well to downstream tasks.
arXiv Detail & Related papers (2023-03-15T23:10:17Z)
DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models. We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features. We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z)
CoV-TI-Net: Transferred Initialization with Modified End Layer for COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with less computation. In this research, the PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset. The proposed model is developed and verified in a Kaggle notebook, and it reached an accuracy of 99.77% without requiring substantial computation time.
arXiv Detail & Related papers (2022-09-20T08:52:52Z)
Portuguese Man-of-War Image Classification with Convolutional Neural Networks [58.720142291102135]
The Portuguese man-of-war (PMW) is a gelatinous organism with long tentacles capable of causing severe burns. This paper reports on the use of convolutional neural networks for recognizing PMW images from the Instagram social media platform.
arXiv Detail & Related papers (2022-07-04T03:06:45Z)
Classification of EEG Motor Imagery Using Deep Learning for Brain-Computer Interface Systems [79.58173794910631]
A trained T1 class Convolutional Neural Network (CNN) model will be used to examine its ability to successfully identify motor imagery. In theory, and if the model has been trained accurately, it should be able to identify a class and label it accordingly. The CNN model will then be restored and used to try and identify the same class of motor imagery data using much smaller sampled data.
arXiv Detail & Related papers (2022-05-31T17:09:46Z)
Deep Learning Based Classification System For Recognizing Local Spinach [0.0]
A Deep learning method has been used that can automatically identify spinach. Four Convolutional Neural Network (CNN) models were used to classify our spinach. Among those models, VGG16 achieved the highest accuracy of 99.79%.
arXiv Detail & Related papers (2022-01-06T15:10:41Z)
Efficient sign language recognition system and dataset creation method based on deep learning and image processing [0.0]
This work investigates techniques of digital image processing and machine learning that can be used to create a sign language dataset effectively. Different datasets were created to test the hypotheses, containing 14 words used daily and recorded by different smartphones in the RGB color system. We achieved an accuracy of 96.38% on the test set and 81.36% on the validation set containing more challenging conditions.
arXiv Detail & Related papers (2021-03-22T23:36:49Z)
Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset. We develop an algorithm for shape-texture debiased learning. Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z)
Radioactive data: tracing through training [130.2266320167683]
We propose a new technique, radioactive data, that makes imperceptible changes to a dataset such that any model trained on it will bear an identifiable mark. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Our method is robust to data augmentation and the stochasticity of deep network optimization.
arXiv Detail & Related papers (2020-02-03T18:41:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.