Related papers: An Efficient Language-Independent Multi-Font OCR for Arabic Script

URL: http://arxiv.org/abs/2009.09115v1
Date: Fri, 18 Sep 2020 22:57:03 GMT
Title: An Efficient Language-Independent Multi-Font OCR for Arabic Script
Authors: Hussein Osman, Karim Zaghw, Mostafa Hazem, Seifeldin Elsehely
Abstract summary: This paper proposes a complete Arabic OCR system that takes a scanned image of Arabic Naskh script as input and generates a corresponding digital document. It also proposes an improved font-independent character segmentation algorithm that outperforms the state-of-the-art segmentation algorithms.
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Optical Character Recognition (OCR) is the process of extracting digitized text from images of scanned documents. While OCR systems have already matured for many languages, they still have shortcomings in cursive languages with overlapping letters, such as Arabic. This paper proposes a complete Arabic OCR system that takes a scanned image of Arabic Naskh script as input and generates a corresponding digital document. Our Arabic OCR system consists of the following modules: Pre-processing, Word-level Feature Extraction, Character Segmentation, Character Recognition, and Post-processing. This paper also proposes an improved font-independent character segmentation algorithm that outperforms the state-of-the-art segmentation algorithms. Lastly, the paper proposes a neural network model for the character recognition task. The system was evaluated on several open Arabic corpora, with an average character segmentation accuracy of 98.06%, character recognition accuracy of 99.89%, and overall system accuracy of 97.94%, achieving outstanding results compared to state-of-the-art Arabic OCR systems.

Related papers

Zero-Shot Chinese Character Recognition with Hierarchical Multi-Granularity Image-Text Aligning [52.92837273570818]
Chinese characters exhibit unique structures and compositional rules, allowing for the use of fine-grained semantic information in representation. We propose a Hierarchical Multi-Granularity Image-Text Aligning (Hi-GITA) framework based on a contrastive paradigm. Our proposed Hi-GITA outperforms existing zero-shot CCR methods.
arXiv Detail & Related papers (2025-05-30T17:39:14Z)
TFIC: End-to-End Text-Focused Image Compression for Coding for Machines [50.86328069558113]
We present an image compression system designed to retain text-specific features for subsequent Optical Character Recognition (OCR). Our encoding process requires half the time needed by the OCR module, making it especially suitable for devices with limited computational capacity.
arXiv Detail & Related papers (2025-03-25T09:36:13Z)
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding [24.9462694200992]
KITAB-Bench is a comprehensive Arabic OCR benchmark that fills the gaps in current evaluation systems. Modern vision-language models (such as GPT-4, Gemini, and Qwen) outperform traditional OCR approaches by an average of 60% in Character Error Rate (CER). This work establishes a rigorous evaluation framework that can drive improvements in Arabic document analysis methods.
arXiv Detail & Related papers (2025-02-20T18:41:23Z)
Arabic Handwritten Document OCR Solution with Binarization and Adaptive Scale Fusion Detection [1.1655046053160683]
We present a complete OCR pipeline that starts with line segmentation and Adaptive Scale Fusion techniques to ensure accurate detection of text lines. Our system, trained on the Arabic Multi-Fonts dataset, achieves a Character Recognition Rate (CRR) of 99.20% and a Word Recognition Rate (WRR) of 93.75% on single-word samples containing 7 to 10 characters.
arXiv Detail & Related papers (2024-12-02T15:21:09Z)
Qalam : A Multimodal LLM for Arabic Optical Character and Handwriting Recognition [18.280762424107408]
This study introduces Qalam, a novel foundation model designed for Arabic OCR and HWR. Our model significantly outperforms existing methods, achieving a Word Error Rate (WER) of just 0.80% in HWR tasks and 1.18% in OCR tasks.
arXiv Detail & Related papers (2024-07-18T14:31:09Z)
Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition [77.93678598476149]
We establish a novel Symmetrical Linguistic Feature Distillation framework (named CLIP-OCR). By cascading the CLIP image encoder with the reversed CLIP text encoder, a symmetrical structure is built with an image-to-text feature flow. Extensive experiments demonstrate the effectiveness of CLIP-OCR with 93.8% average accuracy on six popular STR benchmarks.
arXiv Detail & Related papers (2023-10-08T04:00:20Z)
Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [61.34060587461462]
We propose a two-stage framework for Chinese Text Recognition (CTR). We pre-train a CLIP-like model through aligning printed character images and Ideographic Description Sequences (IDS). This pre-training stage simulates humans recognizing Chinese characters and obtains the canonical representation of each character. The learned representations are employed to supervise the CTR model, such that traditional single-character recognition can be improved to text-line recognition.
arXiv Detail & Related papers (2023-09-03T05:33:16Z)
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models [122.27878464009181]
We conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks. OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available.
arXiv Detail & Related papers (2023-05-13T11:28:37Z)
User-Centric Evaluation of OCR Systems for Kwak'wala [92.73847703011353]
We show that utilizing OCR reduces the time spent in the manual transcription of culturally valuable documents by over 50%. Our results demonstrate the potential benefits that OCR tools can have on downstream language documentation and revitalization efforts.
arXiv Detail & Related papers (2023-02-26T21:41:15Z)
Siamese based Neural Network for Offline Writer Identification on word level data [7.747239584541488]
We propose a novel scheme to identify the author of a document based on the input word image. Our method is text independent and does not impose any constraint on the size of the input image under examination.
arXiv Detail & Related papers (2022-11-17T10:01:46Z)
Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters of the Kurdish alphabet using deep learning techniques. A comprehensive dataset of handwritten Kurdish characters was created, containing more than 40 thousand images. The model reached a 96% accuracy rate in testing and a 97% accuracy rate in training.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents. Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages. We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
Arabic Handwritten Character Recognition based on Convolution Neural Networks and Support Vector Machine [0.0]
We present an algorithm for recognizing Arabic letters and characters based on deep convolution neural networks (DCNN) and a support vector machine (SVM). This paper addresses the problem of recognizing Arabic handwritten characters by determining the similarity between the input templates and the pre-stored templates. The experimental results of this work indicate the ability of the proposed algorithm to recognize, identify, and verify the input handwritten Arabic characters.
arXiv Detail & Related papers (2020-09-28T16:18:52Z)
Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining [0.0]
An enhanced method is proposed for detecting the desired critical points from the vertical and horizontal direction-length of handwriting stroke features in online Arabic script recognition. A minimum feature set is extracted from these tokens for classification of characters using a multilayer perceptron with a back-propagation learning algorithm and a modified sigmoid-based activation function. The proposed method achieves an average accuracy of 98.6%, comparable to state-of-the-art character recognition techniques.
arXiv Detail & Related papers (2020-05-02T23:17:08Z)
TextScanner: Reading Characters in Order for Robust Scene Text Recognition [60.04267660533966]
TextScanner is an alternative approach for scene text recognition. It generates pixel-wise, multi-channel segmentation maps for character class, position and order. It also adopts RNN for context modeling and performs paralleled prediction for character position and class.
arXiv Detail & Related papers (2019-12-28T07:52:00Z)
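Several of the entries above report Character Error Rate (CER) or Word Error Rate (WER). As a rough guide to reading those numbers, here is a minimal sketch that assumes the common edit-distance definition (insertions, deletions, and substitutions divided by the reference length). The function names are illustrative only, and the individual papers may normalize or tokenize differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character-level edits divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition both rates can exceed 1.0 when the hypothesis is much longer than the reference, which is one reason they are not simply the complement of an accuracy figure.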

This list is automatically generated from the titles and abstracts of the papers on this site.