Knowledge Graph Question Answering via SPARQL Silhouette Generation
- URL: http://arxiv.org/abs/2109.09475v1
- Date: Mon, 6 Sep 2021 14:55:37 GMT
- Title: Knowledge Graph Question Answering via SPARQL Silhouette Generation
- Authors: Sukannya Purkayastha, Saswati Dana, Dinesh Garg, Dinesh Khandelwal, G P Shrivatsa Bhargav
- Abstract summary: Knowledge Graph Question Answering (KGQA) has become a prominent area in natural language processing.
We propose a modular two-stage neural architecture to solve the KGQA task.
We show that our method achieves reasonable performance, improving on the state of the art by a margin of 3.72% F1 on the LC-QuAD-1 dataset.
- Score: 18.391235417154498
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Graph Question Answering (KGQA) has become a prominent area in natural language processing due to the emergence of large-scale Knowledge Graphs (KGs). Recently, Neural Machine Translation based approaches that translate natural language queries into structured query languages, thereby solving the KGQA task, have been gaining momentum. However, most of these methods struggle with out-of-vocabulary words, where test entities and relations are not seen during training. In this work, we propose a modular two-stage neural architecture to solve the KGQA task.
The first stage generates a sketch of the target SPARQL, called the SPARQL silhouette, for the input question. It comprises (1) a noise simulator to handle out-of-vocabulary words and to reduce vocabulary size, and (2) a seq2seq model for text-to-SPARQL-silhouette generation. The second stage is a Neural Graph Search Module: the SPARQL silhouette generated in the first stage is distilled in the second stage by substituting precise relations into the predicted structure. We simulate ideal and realistic scenarios by designing a noise simulator. Experimental results show that the quality of the generated SPARQL silhouette in the first stage is outstanding in the ideal scenario, but in realistic scenarios (i.e., with a noisy linker) the quality of the resulting SPARQL silhouette drops drastically. However, our Neural Graph Search Module recovers it considerably. We show that our method achieves reasonable performance, improving on the state of the art by a margin of 3.72% F1 on the LC-QuAD-1 dataset. We believe our proposed approach is novel and will lead to dynamic KGQA solutions that are suited for practical applications.
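To make the two stages concrete, here is a minimal sketch of the pipeline under stated assumptions: the placeholder scheme, the hard-coded "seq2seq" output, and the first-candidate relation scorer are hypothetical stand-ins for illustration, not the authors' implementation.

```python
# Illustrative sketch of the two-stage pipeline. The placeholder scheme,
# the hard-coded "seq2seq" output, and the trivial relation scorer are
# hypothetical stand-ins, not the paper's implementation.

# Toy linker output: surface mentions mapped to placeholder tokens.
PLACEHOLDERS = {"Barack Obama": "<e1>", "spouse": "<r1>"}

def mask_question(question: str) -> str:
    """Stage 1a (noise simulator, ideal case): replace linked entity and
    relation mentions with placeholders, shrinking the output vocabulary
    so unseen KB elements never have to be generated directly."""
    for surface, token in PLACEHOLDERS.items():
        question = question.replace(surface, token)
    return question

def seq2seq_silhouette(masked_question: str) -> str:
    """Stage 1b: a trained seq2seq model maps the masked question to a
    SPARQL silhouette; a single hard-coded template stands in here."""
    return "SELECT ?x WHERE { <e1> <r1> ?x . }"

def graph_search(silhouette: str, candidates: list[str]) -> str:
    """Stage 2 (Neural Graph Search Module): substitute a precise KG
    relation into the predicted structure. A real module scores the
    candidates neurally; we simply take the first one."""
    return silhouette.replace("<r1>", f"dbo:{candidates[0]}")

question = "Who is the spouse of Barack Obama?"
silhouette = seq2seq_silhouette(mask_question(question))
print(graph_search(silhouette, ["spouse", "child"]))
# -> SELECT ?x WHERE { <e1> dbo:spouse ?x . }
```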
Related papers
- UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models [4.627548680442906]
OwnThink is the most extensive Chinese open-domain knowledge graph introduced in recent times.
We introduce UniOQA, a unified framework that integrates two parallel approaches to question answering.
UniOQA notably advances SpCQL Logical Accuracy to 21.2% and Execution Accuracy to 54.9%, achieving new state-of-the-art results on this benchmark.
arXiv Detail & Related papers (2024-06-04T08:36:39Z)
- A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation [2.9134135167113433]
We propose to integrate a copy mechanism into neural SPARQL query generation as a way to handle knowledge base elements that are unknown at training time.
We illustrate our proposal by adding a copy layer and a dynamic knowledge base vocabulary to two seq2seq architectures (CNNs and Transformers).
This layer lets the models copy KB elements directly from the questions instead of generating them.
We evaluate our approach on state-of-the-art datasets, including datasets referencing unknown KB elements, and measure the accuracy of the copy-augmented architectures.
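As a rough sketch of the copy idea, the standard pointer-generator mixing step looks like the following; this follows the generic recipe, and the paper's exact copy layer and dynamic vocabulary handling may differ.

```python
# Pointer-generator style copy step (generic recipe, an assumption here):
# mix a generation distribution over the fixed SPARQL vocabulary with a
# copy distribution over source-question tokens (e.g. KB element mentions).
import torch

vocab_size, src_len = 100, 8
p_vocab = torch.softmax(torch.randn(vocab_size), dim=-1)  # decoder's generation dist
attn = torch.softmax(torch.randn(src_len), dim=-1)        # attention over source tokens
p_gen = torch.sigmoid(torch.randn(()))                    # generate-vs-copy gate

# Each source position maps to an id in an extended (dynamic) vocabulary,
# so KB elements unseen at training time still get ids beyond the fixed vocab.
src_ids = torch.randint(0, vocab_size + src_len, (src_len,))

p_final = torch.zeros(vocab_size + src_len)
p_final[:vocab_size] = p_gen * p_vocab              # generation path
p_final.index_add_(0, src_ids, (1 - p_gen) * attn)  # copy path
print(float(p_final.sum()))  # ~1.0: a valid distribution over the extended vocab
```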
arXiv Detail & Related papers (2022-11-18T14:56:35Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one by one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL [18.019353543946913]
This study investigates the task of knowledge-based question generation (KBQG).
Conventional KBQG work generates questions from fact triples in the knowledge graph, which cannot express complex operations like aggregation and comparison in SPARQL.
We propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into natural-language descriptions.
arXiv Detail & Related papers (2022-08-26T06:53:46Z)
- E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation [95.49128988683191]
Sequence-to-sequence (seq2seq) learning is a popular paradigm for large-scale pretraining of language models.
We propose an encoding-enhanced seq2seq pretraining strategy, namely E2S2.
E2S2 improves seq2seq models by integrating more efficient self-supervised information into the encoders.
arXiv Detail & Related papers (2022-05-30T08:25:36Z)
- DUAL: Textless Spoken Question Answering with Speech Discrete Unit Adaptive Learning [66.71308154398176]
Spoken Question Answering (SQA) has gained research attention and made remarkable progress in recent years.
Existing SQA methods rely on Automatic Speech Recognition (ASR) transcripts, which are time- and cost-prohibitive to collect.
This work proposes an ASR transcript-free SQA framework named Discrete Unit Adaptive Learning (DUAL), which leverages unlabeled data for pre-training and is fine-tuned by the SQA downstream task.
arXiv Detail & Related papers (2022-03-09T17:46:22Z)
- Question Answering Infused Pre-training of General-Purpose Contextualized Representations [70.62967781515127]
We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations.
We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model.
We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection.
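A minimal sketch of that bi-encoder-from-cross-encoder distillation setup follows; the MSE objective, toy feature vectors, and tiny linear encoders are illustrative assumptions, not necessarily the paper's choices.

```python
# Sketch of distilling a cross-encoder's scores into a bi-encoder
# (architecture, loss, and data here are illustrative stand-ins).
import torch
import torch.nn as nn

dim = 32

class BiEncoder(nn.Module):
    """Encodes questions and passages independently; score = dot product."""
    def __init__(self):
        super().__init__()
        self.q_enc = nn.Linear(dim, dim)
        self.p_enc = nn.Linear(dim, dim)

    def forward(self, q, p):
        return (self.q_enc(q) * self.p_enc(p)).sum(-1)

student = BiEncoder()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

q, p = torch.randn(16, dim), torch.randn(16, dim)  # toy feature batches
teacher_scores = torch.randn(16)  # stand-in for cross-encoder predictions

# One training step: match the student's independent-encoding scores
# to the more accurate joint-encoding teacher's scores.
opt.zero_grad()
loss = nn.functional.mse_loss(student(q, p), teacher_scores)
loss.backward()
opt.step()
```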
arXiv Detail & Related papers (2021-06-15T14:45:15Z)
- Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations [49.55361944105796]
We present a novel approach to any-to-one (A2O) voice conversion (VC) in a sequence-to-sequence framework.
A2O VC aims to convert any speaker, including those unseen during training, to a fixed target speaker.
arXiv Detail & Related papers (2020-10-23T08:34:52Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
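In outline, the retrieve-then-generate idea can be sketched as below; the word-overlap retriever and template "generator" are deliberately trivial stand-ins for RAG's dense retriever and seq2seq generator.

```python
# Toy outline of retrieval-augmented generation: look up non-parametric
# memory (documents), then condition generation on it.

DOCS = [
    "SPARQL is a query language for RDF knowledge graphs.",
    "Paris is the capital of France.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the query; a real RAG
    retriever uses dense vector similarity instead."""
    words = set(query.lower().replace("?", "").split())
    return sorted(DOCS, key=lambda d: len(words & set(d.lower().split())),
                  reverse=True)[:k]

def generate(query: str) -> str:
    """A real RAG model marginalizes over retrieved passages with a seq2seq
    generator; here we simply surface the top passage as evidence."""
    context = " ".join(retrieve(query))
    return f"Q: {query}\nEvidence: {context}"

print(generate("What is SPARQL?"))
```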
arXiv Detail & Related papers (2020-05-22T21:34:34Z)