Creation of a Numerical Scoring System to Objectively Measure and Compare the Level of Rhetoric in Arabic Texts: A Feasibility Study, and A Working Prototype
- URL: http://arxiv.org/abs/2507.21106v1
- Date: Tue, 08 Jul 2025 13:01:46 GMT
- Title: Creation of a Numerical Scoring System to Objectively Measure and Compare the Level of Rhetoric in Arabic Texts: A Feasibility Study, and A Working Prototype
- Authors: Mandar Marathe,
- Abstract summary: The aim of this study was to devise a way to measure the density of the literary devices which constitute Arabic rhetoric in a given text.<n>A list of 84 of the commonest literary devices and their definitions was compiled.<n>A method of calculating the density of literary devices based on the morpheme count of the text was utilised.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Arabic Rhetoric is the field of Arabic linguistics which governs the art and science of conveying a message with greater beauty, impact and persuasiveness. The field is as ancient as the Arabic language itself and is found extensively in classical and contemporary Arabic poetry, free verse and prose. In practical terms, it is the intelligent use of word order, figurative speech and linguistic embellishments to enhance message delivery. Despite the volumes that have been written about it and the high status accorded to it, there is no way to objectively know whether a speaker or writer has used Arabic rhetoric in a given text, to what extent, and why. There is no objective way to compare the use of Arabic rhetoric across genres, authors or epochs. It is impossible to know which of pre-Islamic poetry, Andalucian Arabic poetry, or modern literary genres are richer in Arabic rhetoric. The aim of the current study was to devise a way to measure the density of the literary devices which constitute Arabic rhetoric in a given text, as a proxy marker for Arabic rhetoric itself. A comprehensive list of 84 of the commonest literary devices and their definitions was compiled. A system of identifying literary devices in texts was constructed. A method of calculating the density of literary devices based on the morpheme count of the text was utilised. Four electronic tools and an analogue tool were created to support the calculation of an Arabic text's rhetorical literary device density, including a website and online calculator. Additionally, a technique of reporting the distribution of literary devices used across the three sub-domains of Arabic rhetoric was created. The output of this project is a working tool which can accurately report the density of Arabic rhetoric in any Arabic text or speech.
Related papers
- Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs [32.247169514152425]
We introduce emphFann or Flop, the first benchmark designed to assess the comprehension of Arabic poetry by large language models.<n>The benchmark comprises a curated corpus of poems with explanations that assess semantic understanding, metaphor interpretation, prosodic awareness, and cultural context.
arXiv Detail & Related papers (2025-05-23T17:59:29Z) - Poem Meter Classification of Recited Arabic Poetry: Integrating High-Resource Systems for a Low-Resource Task [6.562541994801405]
Arabic poetry has received major attention from linguistics over the decades.<n>Identifying meters for a verse is a lengthy and complicated process.<n>We propose a state-of-the-art framework to identify the poem meters of recited Arabic poetry.
arXiv Detail & Related papers (2025-04-16T15:25:45Z) - Arabizi vs LLMs: Can the Genie Understand the Language of Aladdin? [0.4751886527142778]
Arabizi is a hybrid form of Arabic that incorporates Latin characters and numbers.<n>It poses significant challenges for machine translation due to its lack of formal structure.<n>This research project investigates the model's performance in translating Arabizi into both Modern Standard Arabic and English.
arXiv Detail & Related papers (2025-02-28T11:37:52Z) - Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion [55.27025066199226]
This paper addresses the need for democratizing large language models (LLM) in the Arab world.<n>One practical objective for an Arabic LLM is to utilize an Arabic-specific vocabulary for the tokenizer that could speed up decoding.<n>Inspired by the vocabulary learning during Second Language (Arabic) Acquisition for humans, the released AraLLaMA employs progressive vocabulary expansion.
arXiv Detail & Related papers (2024-12-16T19:29:06Z) - Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with
Wider Topic Analysis [49.1574468325115]
The in-depth study manually analyses 133 ASA papers published in the English language between 2002 and 2020.
The main findings show the different approaches used for ASA: machine learning, lexicon-based and hybrid approaches.
There is a need to develop ASA tools that can be used in industry, as well as in academia, for Arabic text SA.
arXiv Detail & Related papers (2024-03-04T10:37:48Z) - AceGPT, Localizing Large Language Models in Arabic [73.39989503874634]
The paper proposes a comprehensive solution that includes pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic.
The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
arXiv Detail & Related papers (2023-09-21T13:20:13Z) - Graphemic Normalization of the Perso-Arabic Script [47.429213930688086]
This paper documents the challenges that Perso-Arabic presents beyond the best-documented languages.
We focus on the situation in natural language processing (NLP), which is affected by multiple, often neglected, issues.
We evaluate the effects of script normalization on eight languages from diverse language families in the Perso-Arabic script diaspora on machine translation and statistical language modeling tasks.
arXiv Detail & Related papers (2022-10-21T21:59:44Z) - Automatic Dialect Density Estimation for African American English [74.44807604000967]
We explore automatic prediction of dialect density of the African American English (AAE) dialect.
dialect density is defined as the percentage of words in an utterance that contain characteristics of the non-standard dialect.
We show a significant correlation between our predicted and ground truth dialect density measures for AAE speech in this database.
arXiv Detail & Related papers (2022-04-03T01:34:48Z) - Comprehensive Benchmark Datasets for Amharic Scene Text Detection and
Recognition [56.048783994698425]
Ethiopic/Amharic script is one of the oldest African writing systems, which serves at least 23 languages in East Africa.
The Amharic writing system, Abugida, has 282 syllables, 15 punctuation marks, and 20 numerals.
We presented the first comprehensive public datasets named HUST-ART, HUST-AST, ABE, and Tana for Amharic script detection and recognition in the natural scene.
arXiv Detail & Related papers (2022-03-23T03:19:35Z) - Sentiment Analysis in Poems in Misurata Sub-dialect -- A Sentiment
Detection in an Arabic Sub-dialect [0.0]
This study focuses on detecting sentiment in poems written in Misurata Arabic sub-dialect spoken in Libya.
The tools used to detect sentiment from the dataset are Sklearn as well as Mazajak sentiment tool 1.
arXiv Detail & Related papers (2021-09-15T10:42:39Z) - Automatic Arabic Dialect Identification Systems for Written Texts: A
Survey [0.0]
Arabic dialect identification is a specific task of natural language processing, aiming to automatically predict the Arabic dialect of a given text.
In this paper, we present a comprehensive survey of Arabic dialect identification research in written texts.
We review the traditional machine learning methods, deep learning architectures, and complex learning approaches to Arabic dialect identification.
arXiv Detail & Related papers (2020-09-26T15:33:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.