PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
- URL: http://arxiv.org/abs/2312.15194v2
- Date: Thu, 15 Feb 2024 03:10:29 GMT
- Title: PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
- Authors: Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang,
Xin Wang
- Abstract summary: Multi-hop question answering (MQA) is one of the challenging tasks used to evaluate a machine's comprehension and reasoning abilities.
We propose a framework, Programmable knowledge editing for Multi-hop Question Answering (PokeMQA).
Specifically, we prompt LLMs to decompose knowledge-augmented multi-hop questions while interacting with a detached, trainable scope detector that modulates LLM behavior based on an external conflict signal.
- Score: 46.80110170981976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-hop question answering (MQA) is one of the challenging tasks
used to evaluate a machine's comprehension and reasoning abilities, and large
language models (LLMs) have widely achieved human-comparable performance on it.
Because knowledge facts in the real world change over time, knowledge editing
has been explored as a way to update a model with up-to-date facts while
avoiding expensive re-training or fine-tuning. Starting from the edited fact,
the updated model must provide cascading changes along the MQA reasoning chain.
Prior art simply adopts a mix-up prompt to instruct LLMs to conduct multiple
reasoning tasks sequentially, including question decomposition, answer
generation, and conflict checking via comparison with edited facts. However,
coupling these functionally diverse reasoning tasks inhibits LLMs' advantages
in comprehending and answering questions while burdening them with a task they
are unskilled at: conflict checking. We therefore propose a framework,
Programmable knowledge editing for Multi-hop Question Answering (PokeMQA), to
decouple these jobs. Specifically, we prompt LLMs to decompose
knowledge-augmented multi-hop questions while interacting with a detached,
trainable scope detector that modulates LLM behavior based on an external
conflict signal. Experiments on three LLM backbones and two benchmark datasets
validate PokeMQA's superiority in knowledge editing for MQA: it outperforms all
competitors by a large margin in almost all settings and consistently produces
a reliable reasoning process.
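The decoupling the abstract describes can be pictured as a loop: an LLM decomposes the multi-hop question, and a separate scope detector checks each sub-question against the stored edits, routing it either to the edited fact or to the model's own knowledge. The sketch below is illustrative only, not the paper's implementation: the `in_scope` keyword heuristic stands in for the trained detector, and a dictionary lookup stands in for the LLM answerer.

```python
# Hedged sketch of a PokeMQA-style decoupled loop. All names and heuristics
# here are hypothetical stand-ins: a keyword match plays the role of the
# trained scope detector, and a dict lookup plays the role of the LLM.
from dataclasses import dataclass

@dataclass
class Edit:
    subject: str      # entity the edit is about
    relation: str     # relation being rewritten
    new_object: str   # up-to-date fact to return when the edit applies

def in_scope(edit: Edit, sub_question: str) -> bool:
    """Toy scope detector: flags a conflict when subject and relation match."""
    q = sub_question.lower()
    return edit.subject.lower() in q and edit.relation.lower() in q

def answer_hop(sub_question: str, edits: list[Edit], base_model) -> str:
    """Route one decomposed hop: edited memory first, base model otherwise."""
    for edit in edits:
        if in_scope(edit, sub_question):   # external conflict signal fires
            return edit.new_object
    return base_model(sub_question)        # no conflict: unedited knowledge

def multi_hop(hops: list[str], edits: list[Edit], base_model) -> str:
    """Chain the hops, substituting each answer into the next sub-question."""
    answer = ""
    for hop in hops:
        answer = answer_hop(hop.format(prev=answer), edits, base_model)
    return answer

# Toy run: one edit to "prime minister of United Kingdom" must cascade
# through the chain, changing the final answer without retraining.
base = {"Who wrote Macbeth?": "Shakespeare",
        "What is the citizenship of Shakespeare?": "United Kingdom",
        "Who is the prime minister of United Kingdom?": "Boris Johnson"}
edits = [Edit("United Kingdom", "prime minister", "Rishi Sunak")]
hops = ["Who wrote Macbeth?",
        "What is the citizenship of {prev}?",
        "Who is the prime minister of {prev}?"]
print(multi_hop(hops, edits, base.get))
```

The point of the separation is visible in the sketch: the answerer never sees the edits, and the detector never answers questions, so each component handles only the task it is suited for.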
Related papers
- Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning [68.57166425493283]
Refusal-Aware Instruction Tuning (RAIT) enables Large Language Models (LLMs) to refuse to answer unknown questions.
RAIT modifies training samples based on the correctness of the initial LLM's response.
This crude approach can cause LLMs to excessively refuse answering questions they could have correctly answered.
arXiv Detail & Related papers (2024-10-09T14:12:51Z) - MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language [7.488965571323756]
We propose Multi-hop Question Answering under Knowledge Editing for Arabic Language (MQA-KEAL).
MQA-KEAL stores knowledge edits as structured knowledge units in the external memory.
We also contribute MQUAKE-AR (an Arabic translation of the English benchmark MQUAKE) as well as a new benchmark, MQA-AEVAL, for rigorous performance evaluation of MQA under KE for the Arabic language.
arXiv Detail & Related papers (2024-09-18T18:40:02Z) - LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments [35.3938477255058]
This paper introduces Graph Memory-based Editing for Large Language Models (GMeLLo).
GMeLLo merges the explicit knowledge representation of Knowledge Graphs with the linguistic flexibility of Large Language Models.
Our results show that GMeLLo significantly surpasses current state-of-the-art knowledge editing methods in the multi-hop question answering benchmark, MQuAKE.
arXiv Detail & Related papers (2024-08-28T16:15:45Z) - Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing [38.590823330865845]
Large language models (LLMs) face challenges with internal knowledge inaccuracies and outdated information.
Knowledge editing has emerged as a pivotal approach to mitigate these issues.
We propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE).
arXiv Detail & Related papers (2024-08-22T14:53:33Z) - Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering [47.199078631274745]
Large Language Models (LLMs) have shown proficiency in question-answering tasks but often struggle to integrate real-time knowledge.
We propose the Retrieval-Augmented model Editing (RAE) framework for multi-hop question answering.
arXiv Detail & Related papers (2024-03-28T17:47:19Z) - On the Robustness of Editing Large Language Models [57.477943944826904]
Large language models (LLMs) have played a pivotal role in building communicative AI, yet they encounter the challenge of efficient updates.
This work seeks to understand the strengths and limitations of editing methods, facilitating practical applications of communicative AI.
arXiv Detail & Related papers (2024-02-08T17:06:45Z) - FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation [92.43001160060376]
We study the factuality of large language models (LLMs) in the context of answering questions that test current world knowledge.
We introduce FreshQA, a novel dynamic QA benchmark encompassing a diverse range of question and answer types.
We benchmark a diverse array of both closed and open-source LLMs under a two-mode evaluation procedure that allows us to measure both correctness and hallucination.
Motivated by these results, we present FreshPrompt, a simple few-shot prompting method that substantially boosts the performance of an LLM on FreshQA.
arXiv Detail & Related papers (2023-10-05T00:04:12Z) - Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks [121.74957524305283]
This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between Information Retrieval (IR) and Large Language Models (LLMs).
Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks.
arXiv Detail & Related papers (2023-04-28T10:15:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.