MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language
- URL: http://arxiv.org/abs/2409.12257v1
- Date: Wed, 18 Sep 2024 18:40:02 GMT
- Title: MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language
- Authors: Muhammad Asif Ali, Nawal Daftardar, Mutayyaba Waheed, Jianbin Qin, Di Wang
- Abstract summary: We propose Multi-hop Question Answering under Knowledge Editing for Arabic Language (MQA-KEAL).
MQA-KEAL stores knowledge edits as structured knowledge units in the external memory.
We also contribute MQUAKE-AR (an Arabic translation of the English benchmark MQUAKE) as well as a new benchmark, MQA-AEVAL, for rigorous performance evaluation of MQA under KE for the Arabic language.
- Score: 7.488965571323756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated significant capabilities across numerous application domains. A key challenge is keeping these models updated with the latest available information, the lack of which limits their true potential in end applications. Although there have been numerous attempts at Knowledge Editing (KE) for LLMs, i.e., editing an LLM's prior knowledge and in turn testing it via Multi-hop Question Answering (MQA), these studies have so far focused primarily on the English language. To bridge this gap, in this paper we propose Multi-hop Question Answering under Knowledge Editing for Arabic Language (MQA-KEAL). MQA-KEAL stores knowledge edits as structured knowledge units in an external memory. To solve a multi-hop question, it first uses task decomposition to break the question into smaller sub-problems. Then, for each sub-problem, it iteratively queries the external memory and/or the target LLM in order to generate the final response. In addition, we contribute MQUAKE-AR (an Arabic translation of the English benchmark MQUAKE), as well as a new benchmark, MQA-AEVAL, for rigorous performance evaluation of MQA under KE for the Arabic language. Experimental evaluation reveals that MQA-KEAL outperforms the baseline models by a significant margin.
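The loop described in the abstract (store edits in an external memory, decompose the question, then resolve each sub-problem against the memory and/or the model) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `KnowledgeUnit`, `EditMemory`, `answer_multi_hop`, and `toy_model` are hypothetical, the decomposition step is assumed to have already produced a chain of relations, the target LLM is replaced by a lookup table, and the entities are placeholders rather than Arabic benchmark data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class KnowledgeUnit:
    """One edited fact, stored as a (subject, relation, object) triple (assumed layout)."""
    subject: str
    relation: str
    obj: str


class EditMemory:
    """Hypothetical external memory holding knowledge edits."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], str] = {}

    def add(self, unit: KnowledgeUnit) -> None:
        self._facts[(unit.subject, unit.relation)] = unit.obj

    def lookup(self, subject: str, relation: str) -> Optional[str]:
        return self._facts.get((subject, relation))


def answer_multi_hop(
    start_entity: str,
    relations: List[str],
    memory: EditMemory,
    ask_model: Callable[[str, str], str],
) -> str:
    """Resolve a chain of sub-questions, preferring edited facts over the model."""
    entity = start_entity
    for relation in relations:
        edited = memory.lookup(entity, relation)
        # Fall back to the model only when no edit covers this sub-question.
        entity = edited if edited is not None else ask_model(entity, relation)
    return entity


# Stand-in for the target LLM: a table of possibly outdated facts.
STALE_FACTS = {
    ("Country A", "head_of_state"): "Person X",
    ("Person X", "birthplace"): "City P",
    ("Person Y", "birthplace"): "City Q",
}


def toy_model(subject: str, relation: str) -> str:
    return STALE_FACTS[(subject, relation)]


memory = EditMemory()
memory.add(KnowledgeUnit("Country A", "head_of_state", "Person Y"))

# "Where was the head of state of Country A born?", decomposed into two hops.
hops = ["head_of_state", "birthplace"]
unedited_answer = answer_multi_hop("Country A", hops, EditMemory(), toy_model)
edited_answer = answer_multi_hop("Country A", hops, memory, toy_model)
print(unedited_answer, edited_answer)
```

With the single edit in place, the first hop is answered from the memory and the second from the model, so the final answer changes even though the model itself is untouched.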
Related papers
- Multi-LLM QA with Embodied Exploration [55.581423861790945]
We investigate the use of Multi-Embodied LLM Explorers (MELE) for question-answering in an unknown environment.
Multiple LLM-based agents independently explore and then answer queries about a household environment.
We analyze different aggregation methods to generate a single, final answer for each query.
arXiv Detail & Related papers (2024-06-16T12:46:40Z)
- Multi-hop Question Answering under Temporal Knowledge Editing [9.356343796845662]
Multi-hop question answering (MQA) under knowledge editing (KE) has garnered significant attention in the era of large language models.
Existing models for MQA under KE exhibit poor performance when dealing with questions containing explicit temporal contexts.
We propose TEMPoral knowLEdge augmented Multi-hop Question Answering (TEMPLE-MQA) to address this limitation.
arXiv Detail & Related papers (2024-03-30T23:22:51Z)
- Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering [47.199078631274745]
Large Language Models (LLMs) have shown proficiency in question-answering tasks but often struggle to integrate real-time knowledge.
We propose the Retrieval-Augmented model Editing (RAE) framework for multi-hop question answering.
arXiv Detail & Related papers (2024-03-28T17:47:19Z)
- Can multiple-choice questions really be useful in detecting the abilities of LLMs? [15.756543037102256]
Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs).
The misalignment between the task and the evaluation method demands a thoughtful analysis of MCQ's efficacy.
We evaluate nine LLMs on four question-answering (QA) datasets in two languages: Chinese and English.
arXiv Detail & Related papers (2024-03-26T14:43:48Z)
- PokeMQA: Programmable knowledge editing for Multi-hop Question Answering [46.80110170981976]
Multi-hop question answering (MQA) is a challenging task for evaluating a machine's comprehension and reasoning abilities.
We propose a framework, Programmable knowledge editing for Multi-hop Question Answering (PokeMQA).
Specifically, we prompt LLMs to decompose knowledge-augmented multi-hop questions, while interacting with a detached trainable scope detector to modulate the LLMs' behavior depending on an external conflict signal.
arXiv Detail & Related papers (2023-12-23T08:32:13Z)
- FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation [92.43001160060376]
We study the factuality of large language models (LLMs) in the context of answering questions that test current world knowledge.
We introduce FreshQA, a novel dynamic QA benchmark encompassing a diverse range of question and answer types.
We benchmark a diverse array of both closed and open-source LLMs under a two-mode evaluation procedure that allows us to measure both correctness and hallucination.
Motivated by these results, we present FreshPrompt, a simple few-shot prompting method that substantially boosts the performance of an LLM on FreshQA.
arXiv Detail & Related papers (2023-10-05T00:04:12Z)
- MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions [75.21713251369225]
We present a benchmark, MQuAKE, comprising multi-hop questions that assess whether edited models correctly answer questions.
We propose a memory-based approach, MeLLo, which stores all edited facts externally while prompting the language model iteratively to generate answers consistent with the edited facts.
arXiv Detail & Related papers (2023-05-24T06:48:41Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top positions, where they are selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find the true answer without the assistance of the golden passage.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)