Qayyem: A Real-time Platform for Scoring Proficiency of Arabic Essays
- URL: http://arxiv.org/abs/2603.01009v1
- Date: Sun, 01 Mar 2026 09:26:47 GMT
- Title: Qayyem: A Real-time Platform for Scoring Proficiency of Arabic Essays
- Authors: Hoor Elbahnasawi, Marwan Sayed, Sohaila Eltanbouly, Fatima Brahamia, Tamer Elsayed
- Abstract summary: We present Qayyem, a Web-based platform designed to support Arabic AES. Qayyem provides an integrated workflow for assignment creation, batch essay upload, scoring configuration, and per-trait essay evaluation. The platform deploys a number of state-of-the-art Arabic essay scoring models with different effectiveness and efficiency figures.
- Score: 5.404427910866254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, Automated Essay Scoring (AES) systems have gained increasing attention as scalable and consistent solutions for assessing the proficiency of student writing. Despite recent progress, support for Arabic AES remains limited due to linguistic complexity and the scarcity of large, publicly available annotated datasets. In this work, we present Qayyem, a Web-based platform designed to support Arabic AES by providing an integrated workflow for assignment creation, batch essay upload, scoring configuration, and per-trait essay evaluation. Qayyem abstracts the technical complexity of interacting with scoring server APIs, allowing instructors to access advanced scoring services through a user-friendly interface. The platform deploys a number of state-of-the-art Arabic essay scoring models with different effectiveness and efficiency figures.
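The workflow the abstract describes — assignment creation, batch essay upload, scoring configuration, and per-trait evaluation — could be sketched as a thin client wrapping a scoring server API. The class, payload fields, and default model name below are illustrative assumptions, not Qayyem's actual interface; only the seven trait names (taken from the LAILA dataset described under Related papers) come from the source.

```python
# Hypothetical sketch of a client wrapping an essay-scoring server API.
# Class name, payload fields, and model names are assumptions for illustration.
from dataclasses import dataclass, field

# Trait dimensions as defined in the LAILA dataset (see Related papers below).
TRAITS = ["relevance", "organization", "vocabulary", "style",
          "development", "mechanics", "grammar"]

@dataclass
class Assignment:
    prompt: str
    model: str = "fast"  # placeholder for an effectiveness/efficiency choice
    traits: list = field(default_factory=lambda: list(TRAITS))
    essays: list = field(default_factory=list)

    def add_batch(self, essays):
        """Batch-upload essays for this assignment."""
        self.essays.extend(essays)

    def build_request(self):
        """Package the scoring configuration into one API payload."""
        return {
            "prompt": self.prompt,
            "model": self.model,
            "traits": self.traits,
            "essays": self.essays,
        }

a = Assignment(prompt="اكتب مقالاً عن أهمية القراءة.")
a.add_batch(["essay one ...", "essay two ..."])
payload = a.build_request()
print(len(payload["essays"]), payload["model"])  # → 2 fast
```

The point of such a wrapper, as the abstract notes, is that instructors never touch the payload directly; the platform assembles it from form inputs and forwards it to the selected scoring model.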
Related papers
- LAILA: A Large Trait-Based Dataset for Arabic Automated Essay Scoring [7.121813878009244]
LAILA is the largest publicly available Arabic AES dataset to date, comprising 7,859 essays annotated with holistic and trait-specific scores on seven dimensions: relevance, organization, vocabulary, style, development, mechanics, and grammar. We detail the dataset design, collection, and annotations, and provide benchmark results using state-of-the-art Arabic and English models in prompt-specific and cross-prompt settings.
arXiv Detail & Related papers (2025-12-30T13:49:52Z)
- Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection [10.198081881605226]
Automated Essay Scoring (AES) plays a crucial role in assessing language learners' writing quality, reducing grading workload, and providing real-time feedback. This paper leverages Large Language Models (LLMs) and Transformer models to generate synthetic Arabic essays for AES. We create a dataset of 3,040 annotated essays with errors injected using our two methods.
arXiv Detail & Related papers (2025-03-22T11:54:10Z)
- How well can LLMs Grade Essays in Arabic? [3.101490720236325]
This research assesses the effectiveness of large language models (LLMs) in the task of Arabic automated essay scoring (AES) using the AR-AES dataset. It explores various evaluation methodologies, including zero-shot, few-shot in-context learning, and fine-tuning. A mixed-language prompting strategy, integrating English prompts with Arabic content, was implemented to improve model comprehension and performance.
arXiv Detail & Related papers (2025-01-27T21:30:02Z)
- Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion [70.23624194206171]
This paper addresses the need for democratizing large language models (LLMs) in the Arab world. One practical objective for an Arabic LLM is to utilize an Arabic-specific vocabulary for the tokenizer that could speed up decoding. Inspired by vocabulary learning during Second Language (Arabic) Acquisition in humans, the released AraLLaMA employs progressive vocabulary expansion.
arXiv Detail & Related papers (2024-12-16T19:29:06Z)
- Gazelle: An Instruction Dataset for Arabic Writing Assistance [12.798604366250261]
We present Gazelle, a comprehensive dataset for Arabic writing assistance.
We also offer an evaluation framework designed to enhance Arabic writing assistance tools.
Our findings underscore the need for continuous model training and dataset enrichment.
arXiv Detail & Related papers (2024-10-23T17:51:58Z)
- Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task [0.7237827208209208]
The Ancient Egyptian (AE) writing system was characterised by the widespread use of graphemic classifiers (determinatives).
We implement a series of sequence-labelling neural models, which achieve promising performance despite the modest amount of training data.
We discuss tokenisation and operationalisation issues arising from tackling AE texts and contrast our approach with frequency-based baselines.
arXiv Detail & Related papers (2024-06-29T15:40:25Z)
- CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs). We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
- AceGPT, Localizing Large Language Models in Arabic [73.39989503874634]
The paper proposes a comprehensive solution that includes pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic.
The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
arXiv Detail & Related papers (2023-09-21T13:20:13Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Prompt Agnostic Essay Scorer: A Domain Generalization Approach to Cross-prompt Automated Essay Scoring [61.21967763569547]
Cross-prompt automated essay scoring (AES) requires the system to use non-target-prompt essays to award scores to a target-prompt essay.
This paper introduces Prompt Agnostic Essay Scorer (PAES) for cross-prompt AES.
Our method requires no access to labelled or unlabelled target-prompt data during training and is a single-stage approach.
arXiv Detail & Related papers (2020-08-04T10:17:38Z)
- Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the question do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.