Quranic Conversations: Developing a Semantic Search tool for the Quran
using Arabic NLP Techniques
- URL: http://arxiv.org/abs/2311.05120v1
- Date: Thu, 9 Nov 2023 03:14:54 GMT
- Title: Quranic Conversations: Developing a Semantic Search tool for the Quran
using Arabic NLP Techniques
- Authors: Yasser Shohoud, Maged Shoman, Sarah Abdelazim
- Abstract summary: The Holy Book of Quran is believed to be the literal word of God (Allah) as revealed to the Prophet Muhammad (PBUH) over a period of approximately 23 years.
It is challenging for Muslims to get all relevant ayahs (verses) pertaining to a matter or inquiry of interest.
We developed a Quran semantic search tool which finds the verses pertaining to the user inquiry or prompt.
- Score: 0.7673339435080445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Holy Book of Quran is believed to be the literal word of God (Allah) as
revealed to the Prophet Muhammad (PBUH) over a period of approximately 23
years. It is the book where God provides guidance on how to live a righteous
and just life, emphasizing principles like honesty, compassion, charity and
justice, as well as providing rules for personal conduct, family matters,
business ethics and much more. However, due to constraints related to the
language and the Quran organization, it is challenging for Muslims to get all
relevant ayahs (verses) pertaining to a matter or inquiry of interest. Hence,
we developed a Quran semantic search tool which finds the verses pertaining to
the user inquiry or prompt. To achieve this, we trained several models on a
large dataset of over 30 tafsirs, where typically each tafsir corresponds to
one verse in the Quran. Using cosine similarity, we obtained the tafsir tensor
most similar to the prompt tensor of interest, and then used it to look up the
corresponding ayah in the Quran. Using the SNxLM model, we achieved a cosine
similarity score as high as 0.97, which corresponds to the abdu tafsir for a
verse relating to financial matters.
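The retrieval step described in the abstract (embed the prompt, compare it to every tafsir embedding by cosine similarity, then return the ayah linked to the best match) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words `embed` function stands in for the trained models, and the corpus entries, ayah labels, and function names are hypothetical placeholders.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (ayah label, commentary text).
# Placeholder English strings, not real tafsir data.
TAFSIR_ENTRIES = [
    ("ayah-A", "write down debts and contracts with witnesses"),
    ("ayah-B", "be patient and pray during hardship"),
    ("ayah-C", "give charity to the poor and the needy"),
]


def embed(text):
    """Stand-in embedding (assumption): a sparse bag-of-words count vector.

    The paper uses trained language models here instead.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def search(prompt, entries):
    """Return (score, ayah label) for the entry most similar to the prompt."""
    query = embed(prompt)
    scored = [(cosine_similarity(query, embed(text)), ref) for ref, text in entries]
    return max(scored)


score, ayah = search("how should debts and contracts be recorded", TAFSIR_ENTRIES)
print(ayah, round(score, 3))
```

Swapping `embed` for a dense sentence encoder would leave `search` unchanged, since the ranking only depends on the similarity function.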
Related papers
- A Benchmark Dataset with Larger Context for Non-Factoid Question Answering over Islamic Text [0.16385815610837165]
We introduce a comprehensive dataset meticulously crafted for Question-Answering purposes within the domain of Quranic Tafsir and Ahadith.
This dataset comprises a robust collection of over 73,000 question-answer pairs, standing as the largest reported dataset in this specialized domain.
While this paper highlights the dataset's contributions, our subsequent human evaluation uncovered critical insights regarding the limitations of existing automatic evaluation techniques.
arXiv Detail & Related papers (2024-09-15T19:50:00Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Mispronunciation Detection of Basic Quranic Recitation Rules using Deep Learning [0.0]
In Islam, readers must apply a set of pronunciation rules called Tajweed rules to recite the Quran.
There are not enough Tajweed teachers today to provide daily recitation practice for every Muslim.
We propose a solution that combines Mel-Frequency Cepstral Coefficient (MFCC) features with Long Short-Term Memory (LSTM) neural networks, which operate on the time series.
arXiv Detail & Related papers (2023-05-10T19:31:25Z)
- Quran Recitation Recognition using End-to-End Deep Learning [0.0]
The Quran is the holy scripture of Islam, and its recitation is an important aspect of the religion.
Recognizing the recitation of the Holy Quran automatically is a challenging task due to its unique rules.
We propose a novel end-to-end deep learning model for recognizing the recitation of the Holy Quran.
arXiv Detail & Related papers (2023-05-10T18:40:01Z)
- Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training [65.10741459705739]
We propose a novel contrastive pre-training approach for mathematical question representations, namely QuesCo.
We first design two-level question augmentations, including content-level and structure-level, which generate literally diverse question pairs with similar purposes.
Then, to fully exploit hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware rank strategy.
arXiv Detail & Related papers (2023-01-18T14:23:29Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further discover the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension [136.82507046638784]
We introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students.
FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories.
arXiv Detail & Related papers (2022-03-26T00:20:05Z)
- TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance [71.76018597965378]
We build a new large-scale Question Answering dataset containing both Tabular And Textual data, named TAT-QA.
We propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text.
arXiv Detail & Related papers (2021-05-17T06:12:06Z)
- Smartajweed Automatic Recognition of Arabic Quranic Recitation Rules [0.0]
Tajweed is a set of rules for reciting the Quran with the correct pronunciation of each letter and all of its qualities.
These rules include melodic ones, such as where to stop and for how long, when to merge two letters in pronunciation, when to stretch some letters, and when to put more strength on some letters than on others.
arXiv Detail & Related papers (2020-12-26T11:24:03Z)
- Quran Intelligent Ontology Construction Approach Using Association Rules Mining [0.0]
This research project is concerned with the use of association rules to extract the Quran ontology.
Our system is based on the combination of statistics and methods to extract semantic and conceptual relations from Quran verses.
The Quran concepts will offer a new and powerful representation of Quran knowledge, and the association rules will help to represent the relations between all classes of connected concepts in the Quran.
arXiv Detail & Related papers (2020-08-07T15:48:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.