A RAG-based Question Answering System Proposal for Understanding Islam:
MufassirQAS LLM
- URL: http://arxiv.org/abs/2401.15378v4
- Date: Thu, 1 Feb 2024 20:28:11 GMT
- Title: A RAG-based Question Answering System Proposal for Understanding Islam:
MufassirQAS LLM
- Authors: Ahmet Yusuf Alan, Enis Karaarslan, Ömer Aydin
- Abstract summary: This study uses a vector database-based Retrieval Augmented Generation (RAG) approach to enhance the accuracy and transparency of LLMs.
We created a database consisting of several open-access books that include Turkish context.
MufassirQAS and ChatGPT are also tested with sensitive questions.
- Score: 0.34530027457862006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Challenges exist in learning and understanding religions, such as the
complexity and depth of religious doctrines and teachings. Chatbots as
question-answering systems can help in solving these challenges. LLM chatbots
use NLP techniques to establish connections between topics and accurately
respond to complex questions. These capabilities make LLM chatbots well suited
to serve as question-answering systems on religious topics. However, LLMs also
tend to generate false information, known as hallucination. Also, the chatbots'
responses can include content that insults personal religious beliefs, stokes
interfaith conflict, or touches on controversial or sensitive topics. A chatbot
must avoid such cases without promoting hate speech or offending people or
their beliefs. This study uses a vector database-based Retrieval Augmented
Generation (RAG) approach to enhance the accuracy and transparency of LLMs. Our
question-answering system is called "MufassirQAS". We created a database
consisting of several open-access books that include Turkish context. These
books contain Turkish translations and interpretations of Islam. This database
is utilized to answer religion-related questions and ensure our answers are
trustworthy. The relevant part of the dataset, which the LLM also uses, is
presented along with the answer. We have put careful effort into creating
system prompts that give instructions to prevent harmful, offensive, or
disrespectful responses to respect people's values and provide reliable
results. The system answers questions and shares additional information, such as the page
number from the respective book and the articles referenced for obtaining the
information. MufassirQAS and ChatGPT are also tested with sensitive questions.
Our system achieved better performance. The study and enhancements are still in
progress; results and future work are presented.
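The retrieve-then-cite pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a bag-of-words similarity stands in for a real embedding model and vector database, and the passage texts, book names, page numbers, and system-prompt wording are invented.

```python
import re
from collections import Counter
from math import sqrt

# Toy in-memory corpus standing in for the paper's vector database of
# open-access tafseer books; texts and page numbers are invented.
PASSAGES = [
    {"text": "charity purifies wealth and supports the poor", "book": "Tafsir A", "page": 12},
    {"text": "fasting in ramadan teaches patience and gratitude", "book": "Tafsir B", "page": 45},
    {"text": "prayer is performed five times each day", "book": "Tafsir A", "page": 7},
]

def embed(text):
    # Bag-of-words token counts as a stand-in for a real embedding model.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, k=1):
    # Rank passages by similarity to the question, as a vector DB would.
    q = embed(question)
    ranked = sorted(PASSAGES, key=lambda p: cosine(q, embed(p["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    # Book and page travel with each retrieved passage so the answer can
    # cite them, mirroring the paper's transparency goal. The instructions
    # below paraphrase the paper's aims; the exact wording is ours.
    sources = "\n".join(
        f'- "{p["text"]}" ({p["book"]}, p. {p["page"]})' for p in passages
    )
    return (
        "Answer only from the sources below and cite the book and page.\n"
        "Be respectful of religious beliefs; refuse harmful or offensive requests.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )

hits = retrieve("why do muslims fast during ramadan?")
prompt = build_prompt("why do muslims fast during ramadan?", hits)
```

In a full system the assembled prompt would be sent to the LLM, and the cited passage with its page number would be shown alongside the model's answer.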
Related papers
- "How Do I ...?": Procedural Questions Predominate Student-LLM Chatbot Conversations [39.146761527401424]
This paper focuses on such student questions from two datasets of distinct learning contexts: formative self-study and summative assessed coursework. We analysed 6,113 messages from both learning contexts using 11 different Large Language Models (LLMs) and three human raters. Results show that 'procedural' questions predominated in both learning contexts, but more so when students prepare for summative assessment.
arXiv Detail & Related papers (2026-02-20T17:27:41Z) - Answering Students' Questions on Course Forums Using Multiple Chain-of-Thought Reasoning and Finetuning RAG-Enabled LLM [0.0]
We propose a question answering system based on a large language model with a retrieval augmented generation (RAG) method. This work focuses on designing a question answering system with an open-source Large Language Model (LLM) and fine-tuning it on the relevant course dataset.
arXiv Detail & Related papers (2025-11-13T00:26:37Z) - ELOQ: Resources for Enhancing LLM Detection of Out-of-Scope Questions [52.33835101586687]
We study out-of-scope questions, where the retrieved document appears semantically similar to the question but lacks the necessary information to answer it. We propose a guided hallucination-based approach, ELOQ, to automatically generate a diverse set of out-of-scope questions from post-cutoff documents.
arXiv Detail & Related papers (2024-10-18T16:11:29Z) - Are LLMs Aware that Some Questions are not Open-ended? [58.93124686141781]
We study whether Large Language Models are aware that some questions have limited answers and need to respond more deterministically.
The lack of question awareness in LLMs leads to two phenomena: (1) too casual to answer non-open-ended questions or (2) too boring to answer open-ended questions.
arXiv Detail & Related papers (2024-10-01T06:07:00Z) - What Evidence Do Language Models Find Convincing? [94.90663008214918]
We build a dataset that pairs controversial queries with a series of real-world evidence documents that contain different facts.
We use this dataset to perform sensitivity and counterfactual analyses to explore which text features most affect LLM predictions.
Overall, we find that current models rely heavily on the relevance of a website to the query, while largely ignoring stylistic features that humans find important.
arXiv Detail & Related papers (2024-02-19T02:15:34Z) - Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves [57.974103113675795]
We present a method named 'Rephrase and Respond' (RaR) which allows Large Language Models to rephrase and expand questions posed by humans.
RaR serves as a simple yet effective prompting method for improving performance.
We show that RaR is complementary to the popular Chain-of-Thought (CoT) methods, both theoretically and empirically.
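The RaR idea can be wired up roughly as below. This is a sketch: `call_llm` is a placeholder for a real chat-model API, the helper names are ours, and the instruction wording approximates (not necessarily reproduces) the paper's prompt.

```python
def call_llm(prompt):
    # Placeholder for a real chat-model API call; it simply echoes the
    # prompt so the control flow can be exercised offline.
    return f"<model response to: {prompt}>"

def rephrase_and_respond(question):
    # One-step variant: append a rephrase-and-respond instruction so the
    # model first restates the question, then answers it in one call.
    prompt = (
        f'"{question}"\n'
        "Rephrase and expand the question, and respond."
    )
    return call_llm(prompt)

def two_step_rar(question):
    # Two-step variant: one call produces the rephrased question,
    # a second call answers it.
    rephrased = call_llm(f"Rephrase and expand this question: {question}")
    return call_llm(f"{rephrased}\nAnswer the rephrased question.")
```

Because RaR only rewrites the prompt, it composes naturally with other prompting methods such as Chain-of-Thought.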
arXiv Detail & Related papers (2023-11-07T18:43:34Z) - Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism [0.0]
Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities.
These models are not flawless and often produce responses that contain errors or misinformation.
We propose a refusal mechanism that instructs LLMs to refuse to answer challenging questions in order to avoid errors.
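A minimal sketch of such a refusal mechanism, assuming a fixed topic whitelist in place of the paper's knowledge-scope check; all names and the refusal wording here are illustrative.

```python
# Illustrative allowed knowledge scope; a real system would instead test
# whether its knowledge base actually covers the question.
KNOWN_TOPICS = {"prayer", "fasting", "charity", "pilgrimage"}

REFUSAL = "I do not have reliable knowledge to answer that question."

def answer_or_refuse(question, answer_fn):
    # Refuse whenever the question falls outside the knowledge scope,
    # trading answer coverage for reliability.
    words = set(question.lower().replace("?", "").split())
    if words & KNOWN_TOPICS:
        return answer_fn(question)
    return REFUSAL
```

The design choice is deliberate: an explicit refusal is preferable to a fluent but unsupported answer, which is the failure mode the mechanism targets.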
arXiv Detail & Related papers (2023-11-02T07:20:49Z) - DELPHI: Data for Evaluating LLMs' Performance in Handling Controversial
Issues [3.497021928281132]
Controversy is a reflection of our zeitgeist, and an important aspect to any discourse.
The rise of large language models (LLMs) as conversational systems has increased public reliance on these systems for answers to their various questions.
We propose a novel construction of a controversial questions dataset, expanding upon the publicly released Quora Question Pairs dataset.
arXiv Detail & Related papers (2023-10-27T13:23:02Z) - QASiNa: Religious Domain Question Answering using Sirah Nabawiyah [0.0]
In Islam, the sources of information, and who may give interpretations (tafseer) of those sources, are strictly regulated.
The approach used by an LLM, generating answers based on its own interpretation, is similar to the concept of tafseer.
We propose the Question Answering Sirah Nabawiyah (QASiNa) dataset, a novel dataset compiled from Sirah Nabawiyah literature in the Indonesian language.
arXiv Detail & Related papers (2023-10-12T07:52:19Z) - Won't Get Fooled Again: Answering Questions with False Premises [79.8761549830075]
Pre-trained language models (PLMs) have shown unprecedented potential in various fields.
However, PLMs tend to be easily deceived by tricky questions such as "How many eyes does the sun have?"
We find that the PLMs already possess the knowledge required to rebut such questions.
arXiv Detail & Related papers (2023-07-05T16:09:21Z) - Question Answering as Programming for Solving Time-Sensitive Questions [84.07553016489769]
Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world.
Recently, Large Language Models (LLMs) have shown remarkable intelligence in question answering.
However, LLMs still struggle with time-sensitive questions, which can be attributed to their inability to perform rigorous reasoning based on surface-level text semantics.
We propose a novel approach in which we reframe the Question Answering task as Programming (QAaP).
arXiv Detail & Related papers (2023-05-23T16:35:16Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z) - Discourse Comprehension: A Question Answering Framework to Represent
Sentence Connections [35.005593397252746]
A key challenge in building and evaluating models for discourse comprehension is the lack of annotated data.
This paper presents a novel paradigm that enables scalable data collection targeting the comprehension of news documents.
The resulting corpus, DCQA, consists of 22,430 question-answer pairs across 607 English documents.
arXiv Detail & Related papers (2021-11-01T04:50:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.