A Hybrid Deep Learning Model for Arabic Text Recognition
- URL: http://arxiv.org/abs/2009.01987v1
- Date: Fri, 4 Sep 2020 02:49:17 GMT
- Title: A Hybrid Deep Learning Model for Arabic Text Recognition
- Authors: Mohammad Fasha, Bassam Hammo, Nadim Obeid, Jabir Widian
- Abstract summary: This paper presents a model that can recognize Arabic text that was printed using multiple font types.
The proposed model employs a hybrid DL network that can recognize Arabic printed text without the need for character segmentation.
The model achieved good results in recognizing characters and words, and it also achieved promising results in recognizing characters when tested on unseen data.
- Score: 2.064612766965483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arabic text recognition is a challenging task because of the cursive nature
of the Arabic writing system, its joint writing scheme, the large number of
ligatures, and many other challenges. Deep Learning (DL) models have achieved
significant progress in numerous domains, including computer vision and sequence
modelling. This paper presents a model that can recognize Arabic text that was
printed using multiple font types, including fonts that mimic Arabic handwritten
scripts. The proposed model employs a hybrid DL network that can recognize
Arabic printed text without the need for character segmentation. The model was
tested on a custom dataset comprising over two million word samples that were
generated using 18 different Arabic font types. The objective of the testing
process was to assess the model's capability in recognizing a diverse set of
Arabic fonts representing varied cursive styles. The model achieved good
results in recognizing characters and words, and it also achieved promising
results in recognizing characters when it was tested on unseen data. The
prepared model, the custom datasets, and the toolkit for generating similar
datasets are made publicly available. These tools can be used to prepare models
for recognizing other font types as well as to further extend and enhance the
performance of the proposed model.
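The abstract names neither the components of the hybrid network nor the exact metrics behind "recognizing characters and words". As a minimal sketch only, the Python below assumes a CTC-style output (one label per image column plus a blank symbol), which is one common way to recognize text without character segmentation. It scores predictions with a character error rate based on Levenshtein distance and an exact-match word accuracy. The function names, the blank index, and the toy alphabet are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical convention)


def ctc_greedy_decode(frame_labels: Sequence[int], alphabet: Sequence[str]) -> str:
    """Collapse repeated labels, then drop blanks (greedy CTC decoding)."""
    chars: List[str] = []
    previous = BLANK
    for label in frame_labels:
        if label != previous and label != BLANK:
            chars.append(alphabet[label])
        previous = label
    return "".join(chars)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a single rolling row."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        diagonal, row[0] = row[0], i
        for j, cb in enumerate(b, start=1):
            diagonal, row[j] = row[j], min(
                row[j] + 1,             # deletion
                row[j - 1] + 1,         # insertion
                diagonal + (ca != cb),  # substitution (cost 0 on a match)
            )
    return row[len(b)]


def character_error_rate(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Total edit distance divided by total reference length."""
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


def word_accuracy(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Fraction of words predicted exactly."""
    correct = sum(r == p for r, p in zip(references, predictions))
    return correct / len(references)


# Toy alphabet: index 0 is reserved for the blank label.
alphabet = ["", "ك", "ت", "ا", "ب"]
# Hypothetical per-column argmax labels for one word image.
decoded = ctc_greedy_decode([1, 1, 0, 2, 0, 3, 3, 4], alphabet)
```

In this toy run the eight column labels collapse to the four-letter word كتاب. Because the decoder merges repeats and drops blanks, the model never needs explicit character boundaries, which is the property the abstract highlights.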
Related papers
- GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning [0.0]
We introduce InstAr-500k, a new Arabic instruction dataset created by generating and collecting content.
We assess this dataset by fine-tuning an open-source Gemma-7B model on several downstream tasks to improve its functionality.
Based on multiple evaluations, our fine-tuned model achieves excellent performance on several Arabic NLP benchmarks.
arXiv Detail & Related papers (2024-07-02T10:43:49Z)
- Training a Bilingual Language Model by Mapping Tokens onto a Shared Character Space [2.9914612342004503]
We train a bilingual Arabic-Hebrew language model using a transliterated version of Arabic texts in Hebrew.
We assess the performance of a language model that employs a unified script for both languages, on machine translation.
arXiv Detail & Related papers (2024-02-25T11:26:39Z)
- TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering [118.30923824681642]
TextDiffuser-2 aims to unleash the power of language models for text rendering.
We utilize the language model within the diffusion model to encode the position and texts at the line level.
We conduct extensive experiments and incorporate user studies involving human participants as well as GPT-4V.
arXiv Detail & Related papers (2023-11-28T04:02:40Z)
- AceGPT, Localizing Large Language Models in Arabic [73.39989503874634]
The paper proposes a comprehensive solution that includes pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic.
The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
arXiv Detail & Related papers (2023-09-21T13:20:13Z)
- Beyond Arabic: Software for Perso-Arabic Script Manipulation [67.31374614549237]
We provide a set of finite-state transducer (FST) components and corresponding utilities for manipulating the writing systems of languages that use the Perso-Arabic script.
The library also provides simple FST-based romanization and transliteration.
arXiv Detail & Related papers (2023-01-26T20:37:03Z)
- Design of Arabic Sign Language Recognition Model [0.0]
The model is tested on ArASL 2018, consisting of 54,000 images for 32 alphabet signs gathered from 40 signers.
The future work will be a model that converts Arabic sign language into Arabic text.
arXiv Detail & Related papers (2023-01-06T19:19:25Z)
- Huruf: An Application for Arabic Handwritten Character Recognition Using Deep Learning [0.0]
We propose a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits.
The proposed pipeline consists of a total of 18 layers containing four layers each for convolution, pooling, batch normalization, dropout, and finally one Global average layer.
The proposed model respectively achieved an accuracy of 96.93% and 99.35% which is comparable to the state-of-the-art and makes it a suitable solution for real-life end-level applications.
arXiv Detail & Related papers (2022-12-16T17:39:32Z)
- On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization [89.94078728495423]
We show that recent advances in each modality, CLIP image representations and scaling of language models, do not consistently improve multimodal self-rationalization of tasks with multimodal inputs.
Our findings call for a backbone modelling approach that can be built on to advance text generation from images and text beyond image captioning.
arXiv Detail & Related papers (2022-05-24T00:52:40Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding [0.0]
We develop an Arabic language representation model, which we name AraELECTRA.
Our model is pretrained using the replaced token detection objective on large Arabic text corpora.
We show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.
arXiv Detail & Related papers (2020-12-31T09:35:39Z)
- Adaptive Text Recognition through Visual Matching [86.40870804449737]
We introduce a new model that exploits the repetitive nature of characters in languages.
By doing this, we turn text recognition into a shape matching problem.
We show that it can handle challenges that traditional architectures are not able to solve without expensive retraining.
arXiv Detail & Related papers (2020-09-14T17:48:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.