Structural analysis of Hindi online handwritten characters for character
recognition
- URL: http://arxiv.org/abs/2310.08222v1
- Date: Thu, 12 Oct 2023 11:14:27 GMT
- Title: Structural analysis of Hindi online handwritten characters for character
recognition
- Authors: Anand Sharma (MIET, Meerut), A. G. Ramakrishnan (IISc, Bengaluru)
- Abstract summary: Direction properties of online strokes are used to analyze them in terms of homogeneous regions or sub-strokes.
These properties along with some heuristics are used to extract sub-units from Hindi online handwritten characters.
A method is developed to extract point stroke, clockwise curve stroke, counter-clockwise curve stroke and loop stroke segments as sub-units.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Direction properties of online strokes are used to analyze them in terms of
homogeneous regions or sub-strokes with points satisfying common geometric
properties. Such sub-strokes are called sub-units. These properties are used to
extract sub-units from Hindi ideal online characters. These properties along
with some heuristics are used to extract sub-units from Hindi online
handwritten characters. A method is developed to extract point stroke,
clockwise curve stroke, counter-clockwise curve stroke and loop stroke segments
as sub-units from Hindi online handwritten characters. These extracted
sub-units are close in structure to the sub-units of the corresponding Hindi
online ideal characters. The importance of local representation of online
handwritten characters in terms of sub-units is assessed by training a
classifier with sub-unit level local and character level global features
extracted from characters for character recognition. The classifier achieves a
recognition accuracy of 93.5% on the testing set. This accuracy is higher than
that of classifiers trained only with global features
extracted from characters in the same training set and evaluated on the same
testing set. The sub-unit extraction algorithm and the sub-unit based character
classifier are tested on a Hindi online handwritten character dataset. This
dataset consists of samples from 96 different characters. There are 12832 and
2821 samples in the training and testing sets, respectively.
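The abstract describes splitting a stroke into homogeneous regions by direction properties. As a rough illustration only, the sketch below cuts a stroke wherever its turning direction flips, labelling each piece as a point, clockwise, counter-clockwise, or straight segment. This is not the authors' algorithm: the function names, the `point_extent` threshold, and the cut rule are assumptions made for this example, loop strokes are not handled, and coordinates are taken with the y-axis pointing up (with a y-down screen convention the two curve labels swap).

```python
def turning_signs(points):
    """Sign of the turn at each interior point: +1 CCW, -1 CW, 0 straight."""
    signs = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        # z-component of the cross product of consecutive direction vectors
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        signs.append((cross > 0) - (cross < 0))
    return signs


def label_for(sign):
    """Map a turning sign to a human-readable sub-stroke label."""
    return {1: "counter-clockwise", -1: "clockwise", 0: "straight"}[sign]


def split_by_turning(points, point_extent=1.0):
    """Split a stroke into maximal runs sharing one turning direction.

    `point_extent` is a hypothetical threshold: a stroke whose bounding
    box fits inside it is treated as a single point stroke.
    Returns a list of (label, sub_points) pairs.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if max(xs) - min(xs) <= point_extent and max(ys) - min(ys) <= point_extent:
        return [("point", list(points))]

    segments = []
    start = 0      # index of the first point of the current segment
    current = 0    # turning sign of the current segment (0 = not yet known)
    for i, sign in enumerate(turning_signs(points)):
        if sign == 0:
            continue  # locally straight: does not change the current run
        if current == 0:
            current = sign
        elif sign != current:
            # signs[i] is the turn at points[i + 1]; cut there and share
            # that point between the two neighbouring segments
            segments.append((label_for(current), list(points[start:i + 2])))
            start = i + 1
            current = sign
    segments.append((label_for(current), list(points[start:])))
    return segments


# An S-like stroke: it first bends counter-clockwise, then clockwise.
s_stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (3, 3), (4, 3)]
s_segments = split_by_turning(s_stroke)
dot_segments = split_by_turning([(0.0, 0.0), (0.2, 0.1)])
```

For the S-like stroke above, the sketch yields a counter-clockwise segment followed by a clockwise one, sharing the point where the turning direction changes. A practical system would also need smoothing of noisy pen samples before the sign test.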
Related papers
- Bukva: Russian Sign Language Alphabet [75.42794328290088]
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts [65.10991154918737]
This study focuses on the Chu bamboo slip (CBS) script used during the Spring and Autumn and Warring States period (771-256 BCE) in Ancient China.
Our tokenizer first adopts character detection to locate character boundaries, and then conducts character recognition at both the character and sub-character levels.
To support the academic community, we have also assembled the first large-scale dataset of CBSs with over 100K annotated character image scans.
arXiv Detail & Related papers (2024-09-02T07:42:55Z)
- A Classifier Using Global Character Level and Local Sub-unit Level Features for Hindi Online Handwritten Character Recognition [0.0]
A classifier is developed that defines a joint distribution of global character features, number of sub-units and local sub-unit features to model Hindi online handwritten characters.
The developed classifier has the highest accuracy of 93.5% on the testing set compared to that of the classifiers trained on different features extracted from the same training set.
arXiv Detail & Related papers (2023-10-26T04:20:39Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Charformer: Fast Character Transformers via Gradient-based Subword Tokenization [50.16128796194463]
We propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
We introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters.
We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level.
arXiv Detail & Related papers (2021-06-23T22:24:14Z)
- A complete character recognition and transliteration technique for Devanagari script [12.208787849155048]
We present a novel technique for automatic transliteration of Devanagari script using character recognition.
One of the first tasks performed to isolate the constituent characters is segmentation.
Devanagari characters are mapped to corresponding Roman letters in a way that the resulting Roman letters have a pronunciation similar to that of the source characters.
arXiv Detail & Related papers (2020-09-28T16:43:18Z)
- Arabic Handwritten Character Recognition based on Convolution Neural Networks and Support Vector Machine [0.0]
We present an algorithm for recognizing Arabic letters and characters based on deep convolution neural networks (DCNN) and a support vector machine (SVM).
This paper addresses the problem of recognizing the Arabic handwritten characters by determining the similarity between the input templates and the pre-stored templates.
The experimental results of this work indicate the ability of the proposed algorithm to recognize, identify, and verify the input handwritten Arabic characters.
arXiv Detail & Related papers (2020-09-28T16:18:52Z)
- Text-independent writer identification using convolutional neural network [8.526559246026162]
We propose an end-to-end deep-learning method for text-independent writer identification.
Our method achieved over 91.81% accuracy in classifying writers.
arXiv Detail & Related papers (2020-09-10T14:18:03Z)
- 2kenize: Tying Subword Sequences for Chinese Script Conversion [54.33749520569979]
We propose a model that can disambiguate between mappings and convert between the two scripts.
Our proposed method outperforms previous Chinese Character conversion approaches by 6 points in accuracy.
arXiv Detail & Related papers (2020-05-07T10:53:05Z)
- Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining [0.0]
An enhanced method of detecting the desired critical points from vertical and horizontal direction-length of handwriting stroke features of online Arabic script recognition is proposed.
A minimum feature set is extracted from these tokens for classification of characters using a multilayer perceptron with a back-propagation learning algorithm and modified sigmoid function-based activation function.
The proposed method achieves an average accuracy of 98.6%, comparable to state-of-the-art character recognition techniques.
arXiv Detail & Related papers (2020-05-02T23:17:08Z)
- TextScanner: Reading Characters in Order for Robust Scene Text Recognition [60.04267660533966]
TextScanner is an alternative approach for scene text recognition.
It generates pixel-wise, multi-channel segmentation maps for character class, position and order.
It also adopts RNN for context modeling and performs paralleled prediction for character position and class.
arXiv Detail & Related papers (2019-12-28T07:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.