Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
- URL: http://arxiv.org/abs/2508.06971v2
- Date: Wed, 03 Sep 2025 19:00:02 GMT
- Title: Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
- Authors: Mohamed Basem, Islam Oshallah, Ali Hamdi, Khaled Shaban, Hozaifa Kassab
- Abstract summary: Quranic Question Answering presents unique challenges due to the linguistic complexity of Classical Arabic and the semantic richness of religious texts. We propose a novel two-stage framework that addresses both passage retrieval and answer extraction. Our approach achieves state-of-the-art results on the Quran QA 2023 Shared Task, with a MAP@10 of 0.3128 and MRR@10 of 0.5763 for retrieval, and a pAP@10 of 0.669 for extraction.
- Score: 0.4349640169711269
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quranic Question Answering presents unique challenges due to the linguistic complexity of Classical Arabic and the semantic richness of religious texts. In this paper, we propose a novel two-stage framework that addresses both passage retrieval and answer extraction. For passage retrieval, we ensemble fine-tuned Arabic language models to achieve superior ranking performance. For answer extraction, we employ instruction-tuned large language models with few-shot prompting to overcome the limitations of fine-tuning on small datasets. Our approach achieves state-of-the-art results on the Quran QA 2023 Shared Task, with a MAP@10 of 0.3128 and MRR@10 of 0.5763 for retrieval, and a pAP@10 of 0.669 for extraction, substantially outperforming previous methods. These results demonstrate that combining model ensembling and instruction-tuned language models effectively addresses the challenges of low-resource question answering in specialized domains.
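To make the reported retrieval metrics concrete, here is a minimal Python sketch of MAP@10 and MRR@10 over ranked passage lists. The passage IDs and relevance sets are illustrative only; the shared task's official scorer (and its pAP@10 extraction metric) applies further conventions not modeled here.

```python
# Toy implementations of the ranking metrics reported above.
# MAP@10: mean over queries of average precision truncated at rank 10.
# MRR@10: mean over queries of the reciprocal rank of the first
# relevant passage (0 if none appears in the top 10).

def average_precision_at_k(ranked, relevant, k=10):
    """Average precision over the top-k ranked passage IDs."""
    hits, score = 0, 0.0
    for i, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            hits += 1
            score += hits / i  # precision at each relevant rank
    return score / min(len(relevant), k) if relevant else 0.0

def reciprocal_rank_at_k(ranked, relevant, k=10):
    """1/rank of the first relevant passage within the top k, else 0."""
    for i, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            return 1.0 / i
    return 0.0

def map_at_10(runs):
    """runs: list of (ranked_ids, relevant_id_set) pairs, one per query."""
    return sum(average_precision_at_k(r, rel) for r, rel in runs) / len(runs)

def mrr_at_10(runs):
    return sum(reciprocal_rank_at_k(r, rel) for r, rel in runs) / len(runs)
```

For a single query ranked `["p3", "p1", "p7"]` with relevant set `{"p1", "p7"}`, the hits at ranks 2 and 3 give an AP@10 of (1/2 + 2/3) / 2 ≈ 0.583 and an RR@10 of 0.5.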
Related papers
- Decomposition-Enhanced Training for Post-Hoc Attributions In Language Models [64.49342399229529]
We argue that post-hoc attribution can be reframed as a reasoning problem, where answers are decomposed into constituent units, each tied to specific context. We introduce DecompTune, a post-training method that teaches models to produce answer decompositions as intermediate reasoning steps. Across extensive experiments and ablations, DecompTune substantially improves attribution quality, outperforming prior methods and matching or exceeding state-of-the-art frontier models.
arXiv Detail & Related papers (2025-10-29T17:58:59Z)
- Automatic Pronunciation Error Detection and Correction of the Holy Quran's Learners Using Deep Learning [0.0]
We build a 98% automated pipeline to produce high-quality Quranic datasets. We use our custom Quran Phonetic Script to encode Tajweed rules. We release all code, data, and models as open-source.
arXiv Detail & Related papers (2025-08-27T15:28:46Z)
- Few-Shot Prompting for Extractive Quranic QA with Instruction-Tuned LLMs [1.0124625066746595]
It addresses challenges related to complex language, unique terminology, and deep meaning in the text. The second uses few-shot prompting with instruction-tuned large language models such as Gemini and DeepSeek. A specialized Arabic prompt framework is developed for span extraction.
arXiv Detail & Related papers (2025-08-08T08:02:59Z)
- HeQ: a Large and Diverse Hebrew Reading Comprehension Benchmark [54.73504952691398]
We set out to deliver a Hebrew Machine Reading dataset as extractive Question Answering. The morphologically rich nature of Hebrew poses a challenge to this endeavor. We devise a novel set of guidelines, a controlled crowdsourcing protocol, and revised evaluation metrics.
arXiv Detail & Related papers (2025-08-03T15:53:01Z)
- Cross-Language Approach for Quranic QA [1.0124625066746595]
The Quranic QA system holds significant importance as it facilitates a deeper understanding of the Quran, a Holy text for over a billion people worldwide. These systems face unique challenges, including the linguistic disparity between questions written in Modern Standard Arabic and answers found in Quranic verses written in Classical Arabic. We adopt a cross-language approach by expanding and enriching the dataset through machine translation to convert Arabic questions into English, paraphrasing questions to create linguistic diversity, and retrieving answers from an English translation of the Quran to align with multilingual training requirements.
arXiv Detail & Related papers (2025-01-29T07:13:27Z)
- Optimized Quran Passage Retrieval Using an Expanded QA Dataset and Fine-Tuned Language Models [0.0]
The Qur'an QA 2023 shared task dataset had a limited number of questions with weak model retrieval. The original dataset, which contains 251 questions, was reviewed and expanded to 629 questions with question diversification and reformulation. Experiments fine-tuned transformer models, including AraBERT, RoBERTa, CAMeLBERT, AraELECTRA, and BERT.
arXiv Detail & Related papers (2024-12-16T04:03:58Z)
- HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew [12.320161893898735]
HeSum is a benchmark specifically designed for abstractive text summarization in Modern Hebrew. HeSum consists of 10,000 article-summary pairs sourced from Hebrew news websites written by professionals. Linguistic analysis confirms HeSum's high abstractness and unique morphological challenges.
arXiv Detail & Related papers (2024-06-06T09:36:14Z)
- From Multiple-Choice to Extractive QA: A Case Study for English and Arabic [51.13706104333848]
We explore the feasibility of repurposing an existing multilingual dataset for a new NLP task. We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic. We aim to help others adapt our approach for the remaining 120 BELEBELE language variants, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- SEMQA: Semi-Extractive Multi-Source Question Answering [94.04430035121136]
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
arXiv Detail & Related papers (2023-11-08T18:46:32Z)
- TCE at Qur'an QA 2022: Arabic Language Question Answering Over Holy Qur'an Using a Post-Processed Ensemble of BERT-based Models [0.0]
Arabic is the language of the Holy Qur'an; the sacred text for 1.8 billion people across the world.
We propose an ensemble learning model based on Arabic variants of BERT models.
Our system achieves a Partial Reciprocal Rank (pRR) score of 56.6% on the official test set.
arXiv Detail & Related papers (2022-06-03T13:00:48Z)
- SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models [57.29358388475983]
Recent research showed promising results on combining pretrained language models with canonical utterances.
We propose a novel few-shot semantic parsing method -- SeqZero.
In particular, SeqZero brings out the merits from both models via ensemble equipped with our proposed constrained rescaling.
arXiv Detail & Related papers (2022-05-15T21:13:15Z)
- Joint Passage Ranking for Diverse Multi-Answer Retrieval [56.43443577137929]
We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a question.
This task requires joint modeling of retrieved passages, as models should not repeatedly retrieve passages containing the same answer at the cost of missing a different valid answer.
In this paper, we introduce JPR, a joint passage retrieval model focusing on reranking. To model the joint probability of the retrieved passages, JPR makes use of an autoregressive reranker that selects a sequence of passages, equipped with novel training and decoding algorithms.
arXiv Detail & Related papers (2021-04-17T04:48:36Z)
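The diversity-aware selection that JPR learns with its autoregressive reranker can be illustrated, very loosely, by a greedy heuristic: pick passages one at a time and discount any candidate whose answer is already covered by an earlier pick. The scores, the per-passage answer labels, and the multiplicative penalty below are all illustrative assumptions, not JPR's actual model.

```python
def greedy_diverse_rerank(candidates, k=3, penalty=0.5):
    """Toy sequential reranker for multi-answer retrieval.

    candidates: list of (passage_id, score, answer) tuples.
    Selects k passages one at a time, multiplying the score of any
    candidate whose answer is already covered by `penalty`, so the
    ranking favors passages with not-yet-seen answers.
    """
    selected, covered = [], set()
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        best = max(
            pool,
            key=lambda c: c[1] * (penalty if c[2] in covered else 1.0),
        )
        pool.remove(best)
        selected.append(best[0])
        covered.add(best[2])
    return selected
```

With candidates `[("p1", 0.9, "A"), ("p2", 0.8, "A"), ("p3", 0.5, "B")]` and `k=2`, the penalty makes the selector skip the redundant `p2` in favor of `p3`, returning `["p1", "p3"]`.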
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.