PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
- URL: http://arxiv.org/abs/2312.15194v2
- Date: Thu, 15 Feb 2024 03:10:29 GMT
- Title: PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
- Authors: Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang,
Xin Wang
- Abstract summary: Multi-hop question answering (MQA) is a challenging task for evaluating a machine's comprehension and reasoning abilities.
We propose a framework, Programmable knowledge editing for Multi-hop Question Answering (PokeMQA).
Specifically, we prompt LLMs to decompose knowledge-augmented multi-hop questions while interacting with a detached trainable scope detector that modulates LLM behavior depending on an external conflict signal.
- Score: 46.80110170981976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-hop question answering (MQA) is a challenging task for
evaluating a machine's comprehension and reasoning abilities, one on which
large language models (LLMs) have widely achieved human-comparable
performance. Because knowledge facts change in the real world, knowledge
editing has been explored to update a model with up-to-date facts while
avoiding expensive re-training or fine-tuning. Starting from the edited fact,
the updated model needs to propagate cascading changes along the reasoning
chain of MQA. Prior art simply adopts a mix-up prompt to instruct LLMs to
conduct multiple reasoning tasks sequentially, including question
decomposition, answer generation, and conflict checking by comparison with
the edited facts. However, coupling these functionally diverse reasoning
tasks inhibits LLMs' strengths in comprehending and answering questions while
burdening them with conflict checking, a task at which they are unskilled. We
thus propose a framework, Programmable knowledge editing for Multi-hop
Question Answering (PokeMQA), to decouple the jobs. Specifically, we prompt
LLMs to decompose knowledge-augmented multi-hop questions while interacting
with a detached trainable scope detector that modulates LLM behavior
depending on an external conflict signal. Experiments on three LLM backbones
and two benchmark datasets validate the superiority of our method in
knowledge editing for MQA: it outperforms all competitors by a large margin
in almost all settings and consistently produces a reliable reasoning
process.
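To make the decoupling concrete, the control flow the abstract describes can be read as the minimal sketch below. Every name in it (`EditedFact`, `decompose`, `answer`, `in_scope`) is a hypothetical stand-in, not PokeMQA's actual interface; the sketch only illustrates that the conflict signal comes from a detached, trainable scope detector rather than from prompting the LLM itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class EditedFact:
    """Hypothetical record for one edited knowledge triple."""
    subject: str
    relation: str
    new_object: str

def answer_with_edits(
    question: str,
    edits: list[EditedFact],
    decompose: Callable[[str, str], Optional[str]],  # LLM: next atomic sub-question, or None when done
    answer: Callable[[str], str],                    # LLM: answer to one atomic sub-question
    in_scope: Callable[[str, EditedFact], float],    # detached scope detector: P(edit covers sub-question)
    threshold: float = 0.5,
) -> str:
    history = ""
    final_answer = ""
    while (sub_q := decompose(question, history)) is not None:
        # The conflict signal is external: a trainable classifier scores each
        # edit against the sub-question instead of asking the LLM to compare.
        score, best = max(
            ((in_scope(sub_q, e), e) for e in edits),
            key=lambda pair: pair[0],
            default=(0.0, None),
        )
        if best is not None and score >= threshold:
            final_answer = best.new_object   # in scope: answer with the edited fact
        else:
            final_answer = answer(sub_q)     # out of scope: use the LLM's own knowledge
        history += f"\nQ: {sub_q}\nA: {final_answer}"
    return final_answer
```

In the framework itself the detector is trained separately and the decomposition prompt is knowledge-augmented; the sketch captures only the division of labor between the LLM and the detector.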
Related papers
- GenSco: Can Question Decomposition based Passage Alignment improve Question Answering? [1.5776201492893507]
"GenSco" is a novel approach of selecting passages based on the predicted decomposition of the multi-hop questions.
We evaluate on three broadly established multi-hop question answering datasets.
arXiv Detail & Related papers (2024-07-14T15:25:08Z)
- FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering [26.398873686905063]
Large Language Models (LLMs) with chain-of-thought (CoT) prompting have demonstrated impressive abilities on simple natural language inference tasks.
We propose a prompting method, Finite State Machine (FSM), to enhance the reasoning capabilities of LLMs for complex tasks.
arXiv Detail & Related papers (2024-07-03T10:01:01Z)
- Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models [47.199078631274745]
Large Language Models (LLMs) have shown proficiency in question-answering tasks but often struggle to integrate real-time knowledge updates.
We propose the Retrieval-Augmented model Editing (RAE) framework tailored for multi-hop question answering.
arXiv Detail & Related papers (2024-03-28T17:47:19Z)
- Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs [52.42505579545893]
Large language models (LLMs) demonstrate strong reasoning abilities when prompted to generate chain-of-thought explanations alongside answers.
We propose a novel discriminative and generative CoT evaluation paradigm to assess LLMs' knowledge of reasoning and the accuracy of the generated CoT.
arXiv Detail & Related papers (2024-02-17T05:22:56Z)
- FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation [92.43001160060376]
We study the factuality of large language models (LLMs) in the context of answering questions that test current world knowledge.
We introduce FreshQA, a novel dynamic QA benchmark encompassing a diverse range of question and answer types.
We benchmark a diverse array of both closed and open-source LLMs under a two-mode evaluation procedure that allows us to measure both correctness and hallucination.
Motivated by these results, we present FreshPrompt, a simple few-shot prompting method that substantially boosts the performance of an LLM on FreshQA (see the sketch after this list).
arXiv Detail & Related papers (2023-10-05T00:04:12Z)
- Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks [121.74957524305283]
This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between Information Retrieval (IR) and Large Language Models (LLMs).
Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks.
arXiv Detail & Related papers (2023-04-28T10:15:25Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes LLM-Augmenter, a system that augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
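As a companion to the FreshLLMs entry above, here is a minimal sketch of a search-augmented few-shot prompt in the spirit of FreshPrompt. The field names, snippet layout, and recency heuristic are assumptions for illustration, not the paper's exact format.

```python
from datetime import date

def build_search_augmented_prompt(
    question: str,
    evidences: list[dict],        # assumed shape: {"source": str, "date": str, "snippet": str}
    demonstrations: list[str],    # few-shot examples of evidence-grounded answering
    max_evidences: int = 5,
) -> str:
    """Assemble a prompt from few-shot demos plus retrieved web evidence."""
    # Keep only the most recent snippets (ISO-format date strings assumed),
    # ordered oldest-to-newest so the freshest evidence sits closest to the
    # final answer cue.
    kept = sorted(evidences, key=lambda e: e["date"])[-max_evidences:]
    lines = list(demonstrations)
    lines.append(f"query: {question}")
    lines.extend(
        f"source: {ev['source']} | date: {ev['date']} | evidence: {ev['snippet']}"
        for ev in kept
    )
    lines.append(f"Today's date is {date.today().isoformat()}.")
    lines.append("answer:")
    return "\n".join(lines)
```

Placing the freshest snippet last is a simple way to bias a recency-sensitive question toward up-to-date evidence; the exact ordering used in the paper is not reproduced here.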