Classification of Handwritten Names of Cities and Handwritten Text
Recognition using Various Deep Learning Models
- URL: http://arxiv.org/abs/2102.04816v1
- Date: Tue, 9 Feb 2021 13:34:16 GMT
- Title: Classification of Handwritten Names of Cities and Handwritten Text
Recognition using Various Deep Learning Models
- Authors: Daniyar Nurseitov, Kairat Bostanbekov, Maksat Kanatov, Anel Alimova,
Abdelrahman Abdallah, Galymzhan Abdimanap
- Abstract summary: We describe various approaches and achievements of recent years in the development of handwriting recognition models.
The first model uses deep convolutional neural networks (CNNs) for feature extraction and a fully connected multilayer perceptron neural network (MLP) for word classification.
The second model, called SimpleHTR, uses CNN and recurrent neural network (RNN) layers to extract information from images.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article discusses the problem of handwriting recognition in the
Kazakh and Russian languages. The area is poorly studied: the literature
contains almost no work in this direction. We describe various approaches and
achievements of recent years in the development of handwriting recognition
models for Cyrillic script. The first model uses deep convolutional neural
networks (CNNs) for feature extraction and a fully connected multilayer
perceptron (MLP) for word classification. The second model, called SimpleHTR,
uses CNN and recurrent neural network (RNN) layers to extract information from
images. We also use the Bluche and Puigcerver models to compare the results.
Because open datasets in Russian and Kazakh are lacking, we collected data
comprising handwritten names of countries and cities: 42 different Cyrillic
words, written more than 500 times in different handwriting. We also used the
handwritten Kazakh and Russian (HKR) database, a new database of Cyrillic words
(not only countries and cities) for the Russian and Kazakh languages, created
by the authors of this work.
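The CNN and RNN pipeline described for the second model emits one score vector per horizontal image slice, which must then be turned into a character string. The abstract does not say how; SimpleHTR-style models are commonly trained and decoded with connectionist temporal classification (CTC), so the following is a minimal sketch of greedy (best-path) CTC decoding under that assumption. The toy alphabet, the blank index, and the helper names `greedy_ctc_decode` and `one_hot` are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# Assumption of this sketch: index 0 is reserved for the CTC blank symbol.
BLANK = 0

# Hypothetical toy alphabet; a real model would cover the full Cyrillic set.
ALPHABET = ["<blank>", "А", "Л", "М", "Т", "Ы"]


def greedy_ctc_decode(frame_scores, alphabet=ALPHABET, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring symbol index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive runs of the same index into a single index.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Remove blanks and map the remaining indices to characters.
    return "".join(alphabet[i] for i in collapsed if i != blank)


def one_hot(index, size=len(ALPHABET)):
    """Build a score vector with all mass on one symbol (illustration only)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Simulated per-frame output for the city name "АЛМАТЫ":
# repeated frames are merged, and blanks separate or pad characters.
frames = [one_hot(i) for i in [1, 1, 0, 2, 3, 3, 1, 0, 4, 5, 5]]
decoded = greedy_ctc_decode(frames)
```

A blank between two identical indices keeps them as two characters, which is how a doubled letter survives the merge step. Greedy decoding is the simplest option; beam search or a word-level dictionary are alternatives, and nothing here is taken from the authors' implementation.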
Related papers
- Bukva: Russian Sign Language Alphabet [75.42794328290088]
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- Multichannel Attention Networks with Ensembled Transfer Learning to Recognize Bangla Handwritten Charecter [1.5236380958983642]
The study employed a convolutional neural network (CNN) with ensemble transfer learning and a multichannel attention network.
We evaluated the proposed model on the CMATERdb 3.1.2 dataset and achieved 92% accuracy on the raw dataset and 98.00% on the preprocessed dataset.
arXiv Detail & Related papers (2024-08-20T15:51:01Z)
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
- New Results for the Text Recognition of Arabic Maghribī Manuscripts -- Managing an Under-resourced Script [0.0]
We introduce and assess a new modus operandi for developing and fine-tuning HTR models dedicated to the Arabic Maghribī scripts.
The comparison between several state-of-the-art HTR models demonstrates the relevance of a word-based neural approach specialized for Arabic.
The results open new perspectives for processing Arabic scripts and, more generally, under-resourced languages.
arXiv Detail & Related papers (2022-11-29T12:21:41Z)
- Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
The model reached 96% accuracy in testing and 97% in training.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
- RuMedBench: A Russian Medical Language Understanding Benchmark [58.99199480170909]
The paper describes the open Russian medical language understanding benchmark covering several task types.
We prepare the unified format labeling, data split, and evaluation metrics for new tasks.
A single-number metric expresses a model's ability to cope with the benchmark.
arXiv Detail & Related papers (2022-01-17T16:23:33Z)
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
- Attention-based Fully Gated CNN-BGRU for Russian Handwritten Text [0.5371337604556311]
This research approaches the task of handwritten text recognition with attention encoder-decoder networks trained on the Kazakh and Russian languages.
We developed a novel deep neural network model based on Fully Gated CNN, supported by Multiple bidirectional GRU and Attention mechanisms.
Our research is the first work on the HKR dataset and demonstrates state-of-the-art results compared with most other existing models.
arXiv Detail & Related papers (2020-08-12T15:14:47Z)
- Soft Gazetteers for Low-Resource Named Entity Recognition [78.00856159473393]
We propose a method of "soft gazetteers" that incorporates ubiquitously available information from English knowledge bases into neural named entity recognition models.
Our experiments on four low-resource languages show an average improvement of 4 points in F1 score.
arXiv Detail & Related papers (2020-05-04T21:58:02Z)