Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware
Pre-training for KBQA
- URL: http://arxiv.org/abs/2308.14436v1
- Date: Mon, 28 Aug 2023 09:22:02 GMT
- Title: Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware
Pre-training for KBQA
- Authors: Guanting Dong, Rumei Li, Sirui Wang, Yupeng Zhang, Yunsen Xian and
Weiran Xu
- Abstract summary: We propose a Structured Knowledge-aware Pre-training method (SKP) to bridge the gap between texts and structured KBs.
In the pre-training stage, we introduce two novel structured knowledge-aware tasks, guiding the model to effectively learn the implicit relationships within complex subgraphs and to represent them better.
In the downstream KBQA task, we further design an efficient linearization strategy and an interval attention mechanism, which help the model better encode complex subgraphs.
- Score: 28.642711264323786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Base Question Answering (KBQA) aims to answer natural language
questions with factual information such as entities and relations in KBs.
However, traditional Pre-trained Language Models (PLMs) are pre-trained
directly on large-scale natural language corpora, which makes it challenging
for them to understand and represent the complex subgraphs in structured KBs.
To bridge the gap between texts and structured KBs, we propose a Structured
Knowledge-aware Pre-training method (SKP). In the pre-training stage, we
introduce two novel structured knowledge-aware tasks that guide the model to
effectively learn the implicit relationships within complex subgraphs and to
represent them better. In the downstream KBQA task, we further design an
efficient linearization strategy and an interval attention mechanism, which
respectively help the model better encode complex subgraphs and shield it
from the interference of irrelevant subgraphs during reasoning. Detailed
experiments and analyses on WebQSP verify the effectiveness of SKP,
especially the significant improvement in subgraph retrieval (+4.08% H@10).
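To make the linearization strategy and the interval attention mechanism concrete, here is a minimal, illustrative sketch in Python/PyTorch. The abstract does not specify the exact linearization template or mask layout, so the whitespace tokenization, the block-diagonal mask, and the example triples below are all assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the authors' released code). It shows one way
# a KB subgraph could be linearized into a token sequence, and how an
# "interval" attention mask could restrict each token to attend only within
# its own triple, approximating the shielding of irrelevant subgraphs
# described in the abstract.
import torch

def linearize_triples(triples):
    """Flatten (head, relation, tail) triples into a single token list,
    recording the [start, end) token interval occupied by each triple."""
    tokens, intervals = [], []
    for head, rel, tail in triples:
        start = len(tokens)
        tokens.extend(head.split() + rel.split() + tail.split())
        intervals.append((start, len(tokens)))
    return tokens, intervals

def interval_attention_mask(num_tokens, intervals):
    """Boolean mask (True = attention allowed): tokens attend only to tokens
    inside the same triple's interval, giving a block-diagonal structure."""
    mask = torch.zeros(num_tokens, num_tokens, dtype=torch.bool)
    for start, end in intervals:
        mask[start:end, start:end] = True
    return mask

# Hypothetical example subgraph (these entities/relations are made up here).
triples = [
    ("Barack Obama", "place of birth", "Honolulu"),
    ("Honolulu", "located in", "Hawaii"),
]
tokens, intervals = linearize_triples(triples)
mask = interval_attention_mask(len(tokens), intervals)
print(tokens)      # linearized subgraph text that would be fed to the PLM
print(mask.int())  # block-diagonal mask, one block per triple
```

Under these assumptions, the block-diagonal mask is the simplest reading of "shielding": each triple forms its own attention interval, so tokens of one triple cannot be distracted by tokens of an unrelated triple during encoding.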
Related papers
- A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering [17.281005999581865]
Large-scale knowledge bases (KBs) like Freebase and Wikidata house millions of structured facts.
Knowledge Base Question Answering (KBQA) provides a user-friendly way to access these valuable KBs via asking natural language questions.
This paper develops KBLLaMA, which follows a learn-then-reason framework to inject new KB knowledge into a large language model for flexible end-to-end KBQA.
arXiv Detail & Related papers (2024-06-20T22:22:41Z)
- A Knowledge-Injected Curriculum Pretraining Framework for Question Answering [70.13026036388794]
We propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for Knowledge-based question answering tasks.
The KI module first injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps.
The KA module learns knowledge from the generated corpus with an LM equipped with an adapter, while keeping the LM's original natural language understanding ability.
The CR module follows human reasoning patterns to construct three corpora with increasing reasoning difficulty, and further trains the LM from easy to hard in a curriculum manner.
arXiv Detail & Related papers (2024-03-11T03:42:03Z)
- ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models [19.85526116658481]
We introduce ChatKBQA, a novel and simple generate-then-retrieve KBQA framework.
Experimental results show that ChatKBQA achieves new state-of-the-art performance on standard KBQA datasets.
This work can also be regarded as a new paradigm for combining LLMs with knowledge graphs for interpretable and knowledge-required question answering.
arXiv Detail & Related papers (2023-10-13T09:45:14Z)
- Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z)
- Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base [51.55027623439027]
We develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner.
To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
arXiv Detail & Related papers (2022-04-17T12:33:34Z)
- A Systematic Investigation of KB-Text Embedding Alignment at Scale [17.636921566637298]
Knowledge bases (KBs) and text often contain complementary knowledge.
How to jointly embed and reason with both knowledge sources to fully leverage the complementary information is still largely an open problem.
We conduct a large-scale, systematic investigation of aligning KB and text embeddings for joint reasoning.
arXiv Detail & Related papers (2021-06-03T04:14:11Z)
- Relational world knowledge representation in contextual language models: A review [19.176173014629185]
We take a natural language processing perspective on the limitations of knowledge bases (KBs).
We propose a novel taxonomy for relational knowledge representation in contextual language models (LMs).
arXiv Detail & Related papers (2021-04-12T21:50:55Z)
- Reasoning Over Virtual Knowledge Bases With Open Predicate Relations [85.19305347984515]
We present the Open Predicate Query Language (OPQL).
OPQL is a method for constructing a virtual Knowledge Base (VKB) trained entirely from text.
We demonstrate that OPQL outperforms prior VKB methods on two different KB reasoning tasks.
arXiv Detail & Related papers (2021-02-14T01:29:54Z)
- Question Answering over Knowledge Bases by Leveraging Semantic Parsing and Neuro-Symbolic Reasoning [73.00049753292316]
We propose a semantic parsing and reasoning-based Neuro-Symbolic Question Answering(NSQA) system.
NSQA achieves state-of-the-art performance on QALD-9 and LC-QuAD 1.0.
arXiv Detail & Related papers (2020-12-03T05:17:55Z)
- A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges [71.4531144086568]
Question Answering (QA) over Knowledge Base (KB) aims to automatically answer natural language questions.
Researchers have shifted their attention from simple questions to complex questions, which require more KB triples and constraint inference.
arXiv Detail & Related papers (2020-07-26T07:13:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.