Related papers: A computational system to handle the orthographic layer of tajwid in contemporary Quranic Orthography

URL: http://arxiv.org/abs/2505.11379v1
Date: Fri, 16 May 2025 15:41:51 GMT
Title: A computational system to handle the orthographic layer of tajwid in contemporary Quranic Orthography
Authors: Alicia González Martínez
Abstract summary: We explore the systematicity of the rules of tajwid, as they are encountered in the Cairo Quran. We develop a Python module that can remove or add the orthographic layer of tajwid from a Quranic text in CQO.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Contemporary Quranic Orthography (CQO) relies on a precise system of phonetic notation that can be traced back to the early stages of Islam, when the Quran was mainly oral in nature and the first written renderings of it served as memory aids for this oral tradition. The early systems of diacritical marks created on top of the Quranic Consonantal Text (QCT) motivated the creation and further development of a fine-grained system of phonetic notation that represented tajwid, the rules of recitation. We explored the systematicity of the rules of tajwid, as they are encountered in the Cairo Quran, using a fully and accurately encoded digital edition of the Quranic text. For this purpose, we developed a Python module that can remove or add the orthographic layer of tajwid from a Quranic text in CQO. The interesting characteristic of these two sets of rules is that they address the complete Quranic text of the Cairo Quran, so they can be used as precise witnesses to study its phonetic and prosodic processes. From a computational point of view, the text of the Cairo Quran can be used as a linchpin to align and compare Quranic manuscripts, due to its richness and completeness. This will let us create a very powerful framework to work with the Arabic script, not just within an isolated text, but automatically exploring a specific textual phenomenon in other connected manuscripts. Having all the texts mapped among each other can serve as a powerful tool to study the nature of the notation systems of diacritics added to the consonantal skeleton.
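As a rough illustration of what removing an orthographic layer can mean at the character level, the following minimal Python sketch deletes a small, assumed set of Quranic annotation code points from a string. It is not the module described in the abstract: the function name `strip_marks` and the selection of code points in `ASSUMED_TAJWID_MARKS` are hypothetical, and the real rules are context dependent, so a plain character filter like this covers only the simplest case.

```python
# Illustrative sketch only; NOT the module described in the abstract.
# The set below is an assumed, incomplete selection of Unicode code points
# that can carry recitation-related information in Quranic text.
ASSUMED_TAJWID_MARKS = frozenset({
    "\u0653",  # ARABIC MADDAH ABOVE
    "\u06E2",  # ARABIC SMALL HIGH MEEM ISOLATED FORM
    "\u06ED",  # ARABIC SMALL LOW MEEM
})


def strip_marks(text: str, marks: frozenset = ASSUMED_TAJWID_MARKS) -> str:
    """Return `text` with every character found in `marks` removed."""
    return "".join(ch for ch in text if ch not in marks)


if __name__ == "__main__":
    # MEEM + NOON + SMALL HIGH MEEM, a space, then BEH.
    sample = "\u0645\u0646\u06E2 \u0628"
    stripped = strip_marks(sample)
    print(len(sample), len(stripped))
```

The reverse direction, adding the layer back, cannot be done with a filter of this kind, since it requires rules that inspect the surrounding letters and diacritics.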

Related papers

QuranMorph: Morphologically Annotated Quranic Corpus [0.0]
QuranMorph is a morphologically annotated corpus for the Quran. The lemmatization process utilized lemmas from Qabas, an Arabic lexicographic database. The part-of-speech tagging was performed using the fine-grained SAMA/Qabas tagset.
arXiv Detail & Related papers (2025-06-22T19:34:09Z)
Developing a Mixed-Methods Pipeline for Community-Oriented Digitization of Kwak'wala Legacy Texts [21.21531481916695]
Kwak'wala is an Indigenous language spoken in British Columbia, Canada. Over 11 volumes of the earliest texts created during the collaboration between Franz Boas and George Hunt have been scanned but remain unreadable by machines. We propose using a mix of off-the-shelf OCR methods, language identification, and masking to effectively isolate Kwak'wala text.
arXiv Detail & Related papers (2025-06-02T15:20:09Z)
Zero-Shot Chinese Character Recognition with Hierarchical Multi-Granularity Image-Text Aligning [52.92837273570818]
Chinese characters exhibit unique structures and compositional rules, allowing for the use of fine-grained semantic information in representation. We propose a Hierarchical Multi-Granularity Image-Text Aligning (Hi-GITA) framework based on a contrastive paradigm. Our proposed Hi-GITA outperforms existing zero-shot CCR methods.
arXiv Detail & Related papers (2025-05-30T17:39:14Z)
Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions are among the oldest existing forms of writing in the world. A large number of Oracle Bone Inscriptions (OBI) remain undeciphered, making their decipherment one of the global challenges in paleography today. This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition [47.86479271322264]
We propose HierCode, a novel and lightweight codebook that exploits the innate hierarchical nature of Chinese characters. HierCode employs a multi-hot encoding strategy, leveraging hierarchical binary tree encoding and prototype learning to create distinctive, informative representations for each character. This approach not only facilitates zero-shot recognition of OOV characters by utilizing shared radicals and structures but also excels in line-level recognition tasks by computing similarity with visual features.
arXiv Detail & Related papers (2024-03-20T17:20:48Z)
StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis [63.019962126807116]
The expressive quality of synthesized speech for audiobooks is limited by generalized model architecture and unbalanced style distribution. We propose a self-supervised style enhancing method with VQ-VAE-based pre-training for expressive audiobook speech synthesis.
arXiv Detail & Related papers (2023-12-19T14:13:26Z)
Quranic Conversations: Developing a Semantic Search tool for the Quran using Arabic NLP Techniques [0.7673339435080445]
The Holy Book of Quran is believed to be the literal word of God (Allah) as revealed to the Prophet Muhammad (PBUH) over a period of approximately 23 years. It is challenging for Muslims to get all relevant ayahs (verses) pertaining to a matter or inquiry of interest. We developed a Quran semantic search tool which finds the verses pertaining to the user inquiry or prompt.
arXiv Detail & Related papers (2023-11-09T03:14:54Z)
SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs). We then propose Sequence X (Check) GPT, a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
arXiv Detail & Related papers (2023-10-13T07:18:53Z)
VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement. Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
Quran Recitation Recognition using End-to-End Deep Learning [0.0]
The Quran is the holy scripture of Islam, and its recitation is an important aspect of the religion. Recognizing the recitation of the Holy Quran automatically is a challenging task due to its unique rules. We propose a novel end-to-end deep learning model for recognizing the recitation of the Holy Quran.
arXiv Detail & Related papers (2023-05-10T18:40:01Z)
Beyond Arabic: Software for Perso-Arabic Script Manipulation [67.31374614549237]
We provide a set of finite-state transducer (FST) components and corresponding utilities for manipulating the writing systems of languages that use the Perso-Arabic script. The library also provides simple FST-based romanization and transliteration.
arXiv Detail & Related papers (2023-01-26T20:37:03Z)
Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material [1.933681537640272]
We propose a system for classification of rabbinic literature based on its style. We show how this method can be applied to uncover lost material from a specific midrash genre.
arXiv Detail & Related papers (2022-11-17T17:45:59Z)
Smartajweed Automatic Recognition of Arabic Quranic Recitation Rules [0.0]
Tajweed is a set of rules for reciting the Quran with the correct pronunciation of the letters and all their qualities. These characteristics include melodic rules, such as where to stop and for how long, when to merge two letters in pronunciation, when to stretch some letters, and when to put more strength on some letters than on others.
arXiv Detail & Related papers (2020-12-26T11:24:03Z)
Quran Intelligent Ontology Construction Approach Using Association Rules Mining [0.0]
This research project is concerned with the use of association rules to extract the Quran ontology. Our system is based on the combination of statistics and methods to extract semantic and conceptual relations from Quran verses. The Quran concepts will offer a new and powerful representation of Quran knowledge, and the association rules will help to represent the relations between all classes of connected concepts in the Quran.
arXiv Detail & Related papers (2020-08-07T15:48:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.