Modern Baselines for SPARQL Semantic Parsing
- URL: http://arxiv.org/abs/2204.12793v3
- Date: Thu, 14 Sep 2023 08:50:25 GMT
- Title: Modern Baselines for SPARQL Semantic Parsing
- Authors: Debayan Banerjee, Pranav Ajit Nair, Jivat Neet Kaur, Ricardo Usbeck,
Chris Biemann
- Abstract summary: We focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs.
We show that T5 requires special input tokenisation, but produces state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets.
The methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.
- Score: 28.088516108293653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we focus on the task of generating SPARQL queries from natural
language questions, which can then be executed on Knowledge Graphs (KGs). We
assume that gold entities and relations have been provided, and that the
remaining task is to arrange them in the right order, along with SPARQL
vocabulary and input tokens, to produce the correct SPARQL query. Pre-trained Language Models
(PLMs) have not been explored in depth on this task so far, so we experiment
with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings,
looking for new baselines in the PLM era for this task, on DBpedia and Wikidata
KGs. We show that T5 requires special input tokenisation, but produces
state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets and
outperforms task-specific models from previous works. Moreover, the methods enable semantic
parsing for questions where a part of the input needs to be copied to the
output query, thus enabling a new paradigm in KG semantic parsing.
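The special input tokenisation mentioned above can be illustrated with a small sketch. T5's SentencePiece vocabulary lacks characters that SPARQL relies on, such as curly braces and angle brackets, so a common workaround is to map those symbols to plain-text placeholder tokens before training and invert the mapping after decoding. The placeholder names below are illustrative assumptions, not the paper's actual token set:

```python
# Minimal sketch of SPARQL <-> T5-friendly tokenisation. The placeholder
# names are our own choice (an assumption); the key point is that symbols
# absent from T5's SentencePiece vocabulary, such as '{', '}', '<', '>',
# are rewritten as plain-text tokens before training and restored after
# decoding.

SYMBOL_MAP = {
    "{": " brack_open ",
    "}": " brack_close ",
    "<": " attr_open ",
    ">": " attr_close ",
}

def encode_sparql(query: str) -> str:
    """Replace SPARQL symbols the T5 tokenizer cannot represent."""
    for symbol, token in SYMBOL_MAP.items():
        query = query.replace(symbol, token)
    return " ".join(query.split())  # normalise whitespace

def decode_sparql(text: str) -> str:
    """Invert the placeholder mapping token-by-token on model output."""
    inverse = {token.strip(): symbol for symbol, token in SYMBOL_MAP.items()}
    restored = " ".join(inverse.get(tok, tok) for tok in text.split())
    # glue angle brackets back onto URIs: "< http://... >" -> "<http://...>"
    return restored.replace("< ", "<").replace(" >", ">")
```

With this round trip, a query such as `SELECT DISTINCT ?uri WHERE { ?uri <http://dbpedia.org/ontology/capital> ?city }` survives encoding and decoding unchanged, while the encoded form contains only tokens the model can emit.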
Related papers
- Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs.
Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries.
We propose an effective Query Instruction Parsing plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z)
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- Enhancing SPARQL Generation by Triplet-order-sensitive Pre-training [13.57710774520144]
We propose an additional pre-training stage with a new objective, Triplet Order Correction (TOC), alongside the commonly used Masked Language Modeling (MLM) objective.
Our method achieves state-of-the-art performances on three widely-used benchmarks.
arXiv Detail & Related papers (2024-10-08T06:48:46Z)
- The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing [20.734859343886843]
We analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing.
We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset.
arXiv Detail & Related papers (2023-05-24T12:55:04Z)
- GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering [20.734859343886843]
We present an end-to-end Knowledge Graph Question Answering system named GETT-QA.
GETT-QA uses T5, a popular text-to-text pre-trained language model.
We find that T5 is able to learn the truncated KG embeddings without any change of loss function, improving KGQA performance.
arXiv Detail & Related papers (2023-03-23T14:06:26Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to their execution results.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program.
In the parsing stage, Codex is able to identify the part of the task input that cannot be answered by the original programming language.
In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL [18.019353543946913]
This study investigates the task of knowledge-based question generation (KBQG).
Conventional KBQG work generates questions from fact triples in the knowledge graph, which cannot express complex operations like aggregation and comparison in SPARQL.
We propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into NL description.
arXiv Detail & Related papers (2022-08-26T06:53:46Z)
- Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking [76.00737707718795]
We propose a novel candidate retrieval paradigm based on entity profiling.
We use the profile to query the indexed search engine to retrieve candidate entities.
Our approach complements the traditional approach of using a Wikipedia anchor-text dictionary.
arXiv Detail & Related papers (2022-02-27T17:38:53Z)
- SPBERT: Pre-training BERT on SPARQL Queries for End-to-end Question Answering over Knowledge Graphs [1.1775939485654976]
SPBERT is a Transformer-based language model pre-trained on massive SPARQL query logs.
We investigate how SPBERT and encoder-decoder architecture can be adapted for Knowledge-based QA corpora.
arXiv Detail & Related papers (2021-06-18T08:39:26Z)
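As a rough illustration of the SPBERT-style pre-training described above, the sketch below builds one masked-language-modelling example from a SPARQL query log entry. The 15% masking rate follows BERT's common convention, and all names here are illustrative assumptions rather than SPBERT's actual implementation:

```python
import random

def mask_sparql_query(query: str, mask_token: str = "[MASK]",
                      mask_prob: float = 0.15, seed: int = 0):
    """Build one MLM training pair from a SPARQL query log entry:
    a masked input sequence plus the original tokens as labels
    (None marks positions the loss should ignore).

    Illustrative sketch: the 15% rate is BERT's convention, not a
    detail confirmed for SPBERT."""
    rng = random.Random(seed)
    tokens = query.split()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)   # hide this token from the model
            labels.append(tok)          # ...and ask it to predict it back
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

Applied over a large log of queries, this yields the (masked input, label) pairs on which an encoder like BERT can be pre-trained before being paired with a decoder for question answering.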
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.