Sparse Concept Coded Tetrolet Transform for Unconstrained Odia Character
Recognition
- URL: http://arxiv.org/abs/2004.01551v1
- Date: Fri, 3 Apr 2020 13:20:12 GMT
- Title: Sparse Concept Coded Tetrolet Transform for Unconstrained Odia Character
Recognition
- Authors: Kalyan S Dash, N B Puhan, G Panda
- Abstract summary: We propose a new image representation approach for unconstrained alphanumeric characters using sparse concept coded Tetrolets.
The proposed OCR system is shown to perform better than other sparse-based techniques such as PCA, SparsePCA and SparseLDA, and better than existing transforms such as Slantlet.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature representation in the form of spatio-spectral decomposition is one of
the robust techniques adopted in automatic handwritten character recognition
systems. In this regard, we propose a new image representation approach for
unconstrained handwritten alphanumeric characters using sparse concept coded
Tetrolets. Tetrolets, which do not use fixed dyadic square blocks for
spectral decomposition like conventional wavelets, preserve the localized
variations in handwriting by adopting tetrominoes that capture the shape
geometry. The sparse concept coding of low entropy Tetrolet representation is
found to extract the important hidden information (concept) for superior
pattern discrimination. Large scale experimentation using ten databases in six
different scripts (Bangla, Devanagari, Odia, English, Arabic and Telugu) has
been performed. The proposed feature representation along with standard
classifiers such as random forest, support vector machine (SVM), nearest
neighbor and modified quadratic discriminant function (MQDF) is found to
achieve state-of-the-art recognition performance in all the databases, viz.
99.40% (MNIST); 98.72% and 93.24% (IITBBS); 99.38% and 99.22% (ISI Kolkata).
The proposed OCR system is shown to perform better than other sparse-based
techniques such as PCA, SparsePCA and SparseLDA, as well as better than
existing transforms (Wavelet, Slantlet and Stockwell).
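
The tetromino-based decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Haar-type filter applied to the four pixels of each tetromino, uses only three illustrative tilings of a 4x4 block (the full Tetrolet transform searches many more), and omits the sparse concept coding stage entirely. The names `HAAR`, `TILINGS`, `transform_block` and `best_tiling` are hypothetical.

```python
# Orthonormal Haar-type matrix applied to the 4 pixels of one tetromino.
HAAR = [
    [0.5, 0.5, 0.5, 0.5],    # low-pass (scaled average)
    [0.5, 0.5, -0.5, -0.5],  # high-pass
    [0.5, -0.5, 0.5, -0.5],  # high-pass
    [0.5, -0.5, -0.5, 0.5],  # high-pass
]

# Three illustrative tilings of a 4x4 block (assumption: a small subset only).
# Each tiling is a list of 4 tetrominoes; each tetromino lists 4 (row, col) cells.
TILINGS = {
    "squares": [[(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)]
                for r in (0, 2) for c in (0, 2)],
    "rows": [[(r, c) for c in range(4)] for r in range(4)],
    "columns": [[(r, c) for r in range(4)] for c in range(4)],
}


def transform_block(block, tiling):
    """Return (low-pass, high-pass) coefficients of a 4x4 block for one tiling."""
    low, high = [], []
    for tetromino in tiling:
        pixels = [block[r][c] for r, c in tetromino]
        coeffs = [sum(w * p for w, p in zip(row, pixels)) for row in HAAR]
        low.append(coeffs[0])
        high.extend(coeffs[1:])
    return low, high


def best_tiling(block):
    """Pick the tiling whose high-pass coefficients have the smallest l1 norm."""
    def cost(name):
        _, high = transform_block(block, TILINGS[name])
        return sum(abs(h) for h in high)
    return min(TILINGS, key=cost)


if __name__ == "__main__":
    # A block with horizontal stripes is represented most sparsely by row bars.
    striped = [[1, 1, 1, 1], [5, 5, 5, 5], [2, 2, 2, 2], [7, 7, 7, 7]]
    print(best_tiling(striped))
```

Choosing the tiling with the smallest high-pass l1 norm is what lets the representation adapt to local stroke geometry instead of being tied to fixed square blocks; in this sketch the "squares" tiling corresponds to the fixed-block case.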
Related papers
- Towards Robust Real-Time Scene Text Detection: From Semantic to Instance
Representation Learning [19.856492291263102]
We propose representation learning for real-time scene text detection.
For semantic representation learning, we propose global-dense semantic contrast (GDSC) and top-down modeling (TDM).
With the proposed GDSC and TDM, the encoder network learns stronger representation without introducing any parameters and computations during inference.
The proposed method achieves 87.2% F-measure with 48.2 FPS on Total-Text and 89.6% F-measure with 36.9 FPS on MSRA-TD500.
arXiv Detail & Related papers (2023-08-14T15:14:37Z)
- Keyword Spotting Simplified: A Segmentation-Free Approach using
Character Counting and CTC re-scoring [8.6134769826665]
Recent advances in segmentation-free keyword spotting treat this problem w.r.t. an object detection paradigm.
We propose a novel segmentation-free system that efficiently scans a document image to find rectangular areas that include the query information.
arXiv Detail & Related papers (2023-08-07T12:11:04Z)
- A Transformer Architecture for Online Gesture Recognition of
Mathematical Expressions [0.0]
Transformer architecture is shown to provide an end-to-end model for building expression trees from online handwritten gestures corresponding to glyph strokes.
The attention mechanism was successfully used to encode, learn and enforce the underlying syntax of expressions.
For the first time, the encoder is fed with unseen online-temporal data tokens potentially forming an infinitely large vocabulary.
arXiv Detail & Related papers (2022-11-04T17:55:55Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Cascaded Asymmetric Local Pattern: A Novel Descriptor for Unconstrained
Facial Image Recognition and Retrieval [20.77994516381]
In this paper, a novel hand-crafted cascaded asymmetric local pattern (CALP) is proposed for facial image retrieval and recognition.
The proposed encoding scheme has optimum feature length and shows significant improvement in accuracy under environmental and physiological changes in a facial image.
arXiv Detail & Related papers (2022-01-03T08:23:38Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Multi-Modal Zero-Shot Sign Language Recognition [51.07720650677784]
We propose a multi-modal Zero-Shot Sign Language Recognition model.
A Transformer-based model along with a C3D model is used for hand detection and deep features extraction.
A semantic space is used to map the visual features to the lingual embedding of the class labels.
arXiv Detail & Related papers (2021-09-02T09:10:39Z)
- A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z)
- Unsupervised low-rank representations for speech emotion recognition [78.38221758430244]
We examine the use of linear and non-linear dimensionality reduction algorithms for extracting low-rank feature representations for speech emotion recognition.
We report speech emotion recognition (SER) results for learned representations on two databases using different classification methods.
arXiv Detail & Related papers (2021-04-14T18:30:58Z)
- Blind Face Restoration via Deep Multi-scale Component Dictionaries [75.02640809505277]
We propose a deep face dictionary network (termed DFDNet) to guide the restoration process of degraded observations.
DFDNet generates deep dictionaries for perceptually significant face components from high-quality images.
Component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features.
arXiv Detail & Related papers (2020-08-02T07:02:07Z)
- A Skip-connected Multi-column Network for Isolated Handwritten Bangla
Character and Digit recognition [12.551285203114723]
We have proposed a non-explicit feature extraction method using a multi-scale multi-column skip convolutional neural network.
Our method is evaluated on four publicly available datasets of isolated handwritten Bangla characters and digits.
arXiv Detail & Related papers (2020-04-27T13:18:58Z)