Modern Baselines for SPARQL Semantic Parsing
- URL: http://arxiv.org/abs/2204.12793v3
- Date: Thu, 14 Sep 2023 08:50:25 GMT
- Title: Modern Baselines for SPARQL Semantic Parsing
- Authors: Debayan Banerjee, Pranav Ajit Nair, Jivat Neet Kaur, Ricardo Usbeck,
Chris Biemann
- Abstract summary: We focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs.
We show that T5 requires special input tokenisation, but produces state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets.
The methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.
- Score: 28.088516108293653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we focus on the task of generating SPARQL queries from natural
language questions, which can then be executed on Knowledge Graphs (KGs). We
assume that gold entities and relations have been provided, and that the
remaining task is to arrange them in the right order, along with SPARQL
vocabulary and input tokens, to produce the correct SPARQL query. Pre-trained Language Models
(PLMs) have not been explored in depth on this task so far, so we experiment
with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings,
looking for new baselines in the PLM era for this task, on DBpedia and Wikidata
KGs. We show that T5 requires special input tokenisation, but produces
state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets and
outperforms task-specific models from previous works. Moreover, the methods enable semantic
parsing for questions where a part of the input needs to be copied to the
output query, thus enabling a new paradigm in KG semantic parsing.
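The special input tokenisation mentioned above can be illustrated with a small sketch. T5's SentencePiece vocabulary lacks characters that SPARQL relies on, such as curly braces and angle brackets, so a common workaround is to map those symbols to plain-text placeholder tokens before training and invert the mapping after decoding. The placeholder names below are illustrative assumptions, not the paper's actual token set:

```python
# Minimal sketch of SPARQL <-> T5-friendly tokenisation. The placeholder
# names are our own choice (an assumption); the key point is that symbols
# absent from T5's SentencePiece vocabulary, such as '{', '}', '<', '>',
# are rewritten as plain-text tokens before training and restored after
# decoding.

SYMBOL_MAP = {
    "{": " brack_open ",
    "}": " brack_close ",
    "<": " attr_open ",
    ">": " attr_close ",
}

def encode_sparql(query: str) -> str:
    """Replace SPARQL symbols the T5 tokenizer cannot represent."""
    for symbol, token in SYMBOL_MAP.items():
        query = query.replace(symbol, token)
    return " ".join(query.split())  # normalise whitespace

def decode_sparql(text: str) -> str:
    """Invert the placeholder mapping token-by-token on model output."""
    inverse = {token.strip(): symbol for symbol, token in SYMBOL_MAP.items()}
    restored = " ".join(inverse.get(tok, tok) for tok in text.split())
    # glue angle brackets back onto URIs: "< http://... >" -> "<http://...>"
    return restored.replace("< ", "<").replace(" >", ">")
```

With this round trip, a query such as `SELECT DISTINCT ?uri WHERE { ?uri <http://dbpedia.org/ontology/capital> ?city }` survives encoding and decoding unchanged, while the encoded form contains only tokens the model can emit.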
Related papers
- Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs.
Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries.
We propose an effective Query Instruction Parsing plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z)
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- Enhancing SPARQL Generation by Triplet-order-sensitive Pre-training [13.57710774520144]
We propose an additional pre-training stage with a new objective, Triplet Order Correction (TOC), alongside the commonly used Masked Language Modeling (MLM) objective.
Our method achieves state-of-the-art performances on three widely-used benchmarks.
arXiv Detail & Related papers (2024-10-08T06:48:46Z)
- The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing [20.734859343886843]
We analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing.
We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset.
arXiv Detail & Related papers (2023-05-24T12:55:04Z)
- GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering [20.734859343886843]
We present an end-to-end Knowledge Graph Question Answering system named GETT-QA.
GETT-QA uses T5, a popular text-to-text pre-trained language model.
We find that T5 is able to learn the truncated KG embeddings without any change of loss function, improving KGQA performance.
arXiv Detail & Related papers (2023-03-23T14:06:26Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to their execution results.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program.
In the parsing stage, Codex is able to identify the part of the task input that cannot be answered by the original programming language.
In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL [18.019353543946913]
This study investigates the task of knowledge-based question generation (KBQG).
Conventional KBQG work generates questions from fact triples in the knowledge graph, which cannot express complex operations like aggregation and comparison in SPARQL.
We propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into NL description.
arXiv Detail & Related papers (2022-08-26T06:53:46Z)
- Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking [76.00737707718795]
We propose a novel candidate retrieval paradigm based on entity profiling.
We use the profile to query the indexed search engine to retrieve candidate entities.
Our approach complements the traditional approach of using a Wikipedia anchor-text dictionary.
arXiv Detail & Related papers (2022-02-27T17:38:53Z)
- SPBERT: Pre-training BERT on SPARQL Queries for End-to-end Question Answering over Knowledge Graphs [1.1775939485654976]
SPBERT is a Transformer-based language model pre-trained on massive SPARQL query logs.
We investigate how SPBERT and encoder-decoder architecture can be adapted for Knowledge-based QA corpora.
arXiv Detail & Related papers (2021-06-18T08:39:26Z)
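As a rough illustration of the SPBERT-style pre-training described above, the sketch below builds one masked-language-modelling example from a SPARQL query log entry. The 15% masking rate follows BERT's common convention, and all names here are illustrative assumptions rather than SPBERT's actual implementation:

```python
import random

def mask_sparql_query(query: str, mask_token: str = "[MASK]",
                      mask_prob: float = 0.15, seed: int = 0):
    """Build one MLM training pair from a SPARQL query log entry:
    a masked input sequence plus the original tokens as labels
    (None marks positions the loss should ignore).

    Illustrative sketch: the 15% rate is BERT's convention, not a
    detail confirmed for SPBERT."""
    rng = random.Random(seed)
    tokens = query.split()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)   # hide this token from the model
            labels.append(tok)          # ...and ask it to predict it back
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

Applied over a large log of queries, this yields the (masked input, label) pairs on which an encoder like BERT can be pre-trained before being paired with a decoder for question answering.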
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.