Few-shot Multi-hop Question Answering over Knowledge Base
- URL: http://arxiv.org/abs/2112.11909v1
- Date: Tue, 14 Dec 2021 00:56:54 GMT
- Title: Few-shot Multi-hop Question Answering over Knowledge Base
- Authors: Fan Meihao
- Abstract summary: This paper proposes an efficient pipeline method equipped with a pre-trained language model and a strategy to construct artificial training samples.
We evaluate our model on the CCKS 2019 Complex Question Answering via Knowledge Base task and achieve an F1-score of 62.55% on the test dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous work on Chinese Knowledge Base Question Answering has been
restricted by the lack of complex Chinese semantic parsing datasets and by the
exponential growth of the search space with the length of relation paths. This
paper proposes an efficient pipeline method equipped with a pre-trained
language model and a strategy for constructing artificial training samples,
which needs only a small amount of data yet performs well on the open-domain
complex Chinese Question Answering task. In addition, by adopting a beam search
algorithm in which a language model scores candidate query tuples, we slow the
growth of relation paths when generating multi-hop query paths. Finally, we
evaluate our model on the CCKS 2019 Complex Question Answering via Knowledge
Base task and achieve an F1-score of 62.55% on the test dataset. Moreover, when
trained with only 10% of the data, our model still achieves an F1-score of
58.54%. These results demonstrate our model's ability to handle the KBQA task
and its advantage in few-shot learning.
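The beam search over relation paths is the central algorithmic idea in the abstract above. The following is a minimal sketch of that idea, assuming a generic KB lookup (get_relations) and a generic scoring function (lm_score) standing in for the paper's pre-trained language model; it is illustrative, not the authors' implementation.

```python
# Sketch: candidate query tuples (relation paths) are expanded hop by hop,
# and a language-model score keeps only the top-k paths so the search space
# does not grow exponentially. `get_relations` and `lm_score` are placeholders.
from typing import Callable, List, Tuple

def beam_search_paths(
    question: str,
    topic_entity: str,
    get_relations: Callable[[str, Tuple[str, ...]], List[str]],
    lm_score: Callable[[str, Tuple[str, ...]], float],
    max_hops: int = 2,
    beam_size: int = 5,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return the top-scoring relation paths of length <= max_hops."""
    beam: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    finished: List[Tuple[Tuple[str, ...], float]] = []
    for _ in range(max_hops):
        candidates = []
        for path, _ in beam:
            for rel in get_relations(topic_entity, path):
                new_path = path + (rel,)
                candidates.append((new_path, lm_score(question, new_path)))
        if not candidates:
            break
        # Keep only the beam_size best partial paths; this is what slows
        # the growth of the search space as the path length increases.
        beam = sorted(candidates, key=lambda x: x[1], reverse=True)[:beam_size]
        finished.extend(beam)
    return sorted(finished, key=lambda x: x[1], reverse=True)

# Toy usage with hand-written stand-ins for the KB and the scorer.
if __name__ == "__main__":
    kb = {(): ["director", "actor"], ("director",): ["birthplace", "spouse"]}
    def get_relations(entity, path):
        return kb.get(path, [])
    def lm_score(question, path):
        # A real system would score the question/path pair with a PLM;
        # here we simply prefer relations whose names appear in the question.
        return float(sum(rel in question for rel in path))
    print(beam_search_paths("where was the director born", "MovieX",
                            get_relations, lm_score))
```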
Related papers
- Few-Shot Data Synthesis for Open Domain Multi-Hop Question Answering [40.86455734818704]
Few-shot learning for open-domain multi-hop question answering typically relies on the in-context learning capability of large language models.
We propose a data synthesis framework for multi-hop question answering that requires less than 10 human annotated question answer pairs.
arXiv Detail & Related papers (2023-05-23T04:57:31Z)
- PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale [53.92008514395125]
PAXQA (Projecting annotations for cross-lingual (x) QA) decomposes cross-lingual QA into two stages.
We propose a novel use of lexically-constrained machine translation, in which constrained entities are extracted from the parallel bitexts.
We show that models fine-tuned on these datasets outperform prior synthetic data generation models over several extractive QA datasets.
arXiv Detail & Related papers (2023-04-24T15:46:26Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Calculating Question Similarity is Enough: A New Method for KBQA Tasks [8.056701645706404]
This paper proposes a Corpus Generation - Retrieve Method (CGRM) with a pre-trained language model (PLM) and a knowledge graph (KG).
Firstly, based on the mT5 model, we designed two new pre-training tasks: knowledge masked language modeling and question generation based on the paragraph.
Secondly, after preprocessing the knowledge graph triples with a series of rules, the kT5 model generates natural language QA pairs based on the processed triples.
arXiv Detail & Related papers (2021-11-15T10:31:46Z)
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training [21.07506671340319]
We propose a novel question-answering dataset based on the Common Crawl project in this paper.
We extract around 130 million multilingual question-answer pairs, including about 60 million English data points.
With this previously unseen number of natural QA pairs, we pre-train popular language models to show the potential of large-scale in-domain pre-training for the task of question-answering.
arXiv Detail & Related papers (2021-10-14T21:23:01Z)
- RETRONLU: Retrieval Augmented Task-Oriented Semantic Parsing [11.157958012672202]
We apply retrieval-based modeling ideas to the problem of multi-domain task-oriented semantic parsing.
Our approach, RetroNLU, extends a sequence-to-sequence model architecture with a retrieval component.
We analyze the quality of the nearest-neighbor retrieval component and the model's sensitivity to it, and break down performance for semantic parses of different utterance complexity.
arXiv Detail & Related papers (2021-09-21T19:30:30Z)
- An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models [54.74525882974022]
We show that few-shot examples can strongly boost the probing performance for both 1-hop and 2-hop relations.
In particular, we find that a simple yet effective approach of finetuning the bias vectors in the model outperforms existing prompt-engineering methods (see the sketch after this list).
arXiv Detail & Related papers (2021-09-06T23:29:36Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
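As a rough illustration of the bias-vector finetuning mentioned in the knowledge-probing entry above, the sketch below follows a generic BitFit-style recipe (not that paper's code): every weight in a pretrained model is frozen and only parameters named "bias" are updated. The stand-in model and hyperparameters are assumptions for demonstration.

```python
# Minimal sketch of bias-only finetuning: freeze all weights, train only biases.
import torch
from torch import nn

def freeze_all_but_biases(model: nn.Module) -> list:
    """Freeze every parameter except bias terms; return the trainable ones."""
    trainable = []
    for name, param in model.named_parameters():
        if name.endswith("bias"):
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable

if __name__ == "__main__":
    # Stand-in model; in practice this would be a pretrained Transformer.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
    params = freeze_all_but_biases(model)
    optimizer = torch.optim.AdamW(params, lr=1e-3)
    x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    print(f"trainable bias parameters: {sum(p.numel() for p in params)}")
```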
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.