SPBERT: Pre-training BERT on SPARQL Queries for End-to-end Question
Answering over Knowledge Graphs
- URL: http://arxiv.org/abs/2106.09997v1
- Date: Fri, 18 Jun 2021 08:39:26 GMT
- Authors: Hieu Tran, Long Phan, and Truong-Son Nguyen
- Abstract summary: SPBERT is a Transformer-based language model pre-trained on massive SPARQL query logs.
We investigate how SPBERT and an encoder-decoder architecture can be adapted for Knowledge-based QA corpora.
- Score: 1.1775939485654976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We make an unprecedented attempt to build an end-to-end Question
Answering (QA) system over Knowledge Graphs (KGs) that can construct SPARQL queries
from natural language questions and generate verbalized answers to those
queries. To this end, we introduce SPBERT, a Transformer-based language model
pre-trained on massive SPARQL query logs. By incorporating a masked language
modelling objective and a word structural objective, SPBERT can learn
general-purpose representations in both natural language and the SPARQL query
language and make the most of the sequential order of words, which is crucial
for a structured language like SPARQL. In this paper, we investigate how SPBERT
and an encoder-decoder architecture can be adapted for Knowledge-based QA corpora.
We conduct exhaustive experiments on two auxiliary tasks, including SPARQL
Query Construction and Answer Verbalization Generation. Results show that
SPBERT obtains promising performance and achieves state-of-the-art results on
several of these tasks.
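The two pre-training objectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the 15% masking rate, the three-token shuffle span, the helper names `mask_tokens` and `shuffle_span`, and the example query are all assumptions chosen for illustration, following common practice in BERT-style pre-training.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, rate=0.15, rng=None):
    """Masked language modelling input: hide a random subset of tokens.

    Returns the corrupted sequence and a {position: original token} map
    that a model would be trained to predict. The rate is an assumption.
    """
    rng = rng or random.Random()
    n = max(1, round(len(tokens) * rate))
    positions = set(rng.sample(range(len(tokens)), n))
    corrupted = [MASK if i in positions else t for i, t in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}
    return corrupted, labels

def shuffle_span(tokens, span=3, rng=None):
    """Word structural input: shuffle one contiguous span of tokens.

    Returns the corrupted sequence, the span start, and the original span
    that a model would be trained to restore. The span length is an assumption.
    """
    rng = rng or random.Random()
    if len(tokens) < span:
        return list(tokens), 0, list(tokens)
    start = rng.randrange(len(tokens) - span + 1)
    original = tokens[start:start + span]
    shuffled = original[:]
    rng.shuffle(shuffled)
    return tokens[:start] + shuffled + tokens[start + span:], start, original

# Hypothetical SPARQL query, split on whitespace for simplicity.
query = "SELECT ?x WHERE { ?x dbo:author dbr:Jane_Austen }"
tokens = query.split()

rng = random.Random(0)
masked, labels = mask_tokens(tokens, rng=rng)
shuffled, start, original = shuffle_span(tokens, rng=rng)

print(" ".join(masked))
print(" ".join(shuffled))
```

Token order carries meaning in a triple pattern such as `?x dbo:author dbr:Jane_Austen`, which is why an order-restoring objective is a plausible fit for SPARQL.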
Related papers
- Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs.
Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries.
We propose an effective Query Instruction Parsing Plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z)
- Assessing SPARQL capabilities of Large Language Models [0.0]
We focus on measuring out-of-the-box capabilities of Large Language Models to work with SPARQL.
We implement benchmarking tasks in the LLM-KG-Bench framework for automated execution and evaluation.
Our findings indicate that working with SPARQL SELECT queries is still challenging for LLMs.
arXiv Detail & Related papers (2024-09-09T08:29:39Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- Leveraging LLMs in Scholarly Knowledge Graph Question Answering [7.951847862547378]
KGQA answers natural language questions by leveraging a large language model (LLM).
Our system achieves an F1 score of 99.0% on SciQA - one of the Scholarly Knowledge Graph Question Answering challenge benchmarks.
arXiv Detail & Related papers (2023-11-16T12:13:49Z)
- In-Context Learning for Knowledge Base Question Answering for Unmanned Systems based on Large Language Models [43.642717344626355]
We focus on the CCKS2023 Competition of Question Answering with Knowledge Graph Inference for Unmanned Systems.
Inspired by the recent success of large language models (LLMs) like ChatGPT and GPT-3 in many QA tasks, we propose a ChatGPT-based Cypher Query Language (CQL) generation framework.
With our ChatGPT-based CQL generation framework, we achieved second place in the CCKS 2023 Question Answering with Knowledge Graph Inference for Unmanned Systems competition.
arXiv Detail & Related papers (2023-11-06T08:52:11Z)
- An In-Context Schema Understanding Method for Knowledge Base Question Answering [70.87993081445127]
Large Language Models (LLMs) have shown strong capabilities in language understanding and can be used to solve this task.
Existing methods bypass this challenge by initially employing LLMs to generate drafts of logic forms without schema-specific details.
We propose a simple In-Context Schema Understanding (ICSU) method that enables LLMs to directly understand schemas by leveraging in-context learning.
arXiv Detail & Related papers (2023-10-22T04:19:17Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to execution results thereof.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- Reducing the impact of out of vocabulary words in the translation of natural language questions into SPARQL queries [5.97507595130844]
Automatic translation of questions posed in natural language into SPARQL has the potential of overcoming this problem.
Existing systems based on neural-machine translation are very effective but easily fail in recognizing words that are Out Of The Vocabulary (OOV) of the training set.
arXiv Detail & Related papers (2021-11-04T16:53:59Z)
- Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition [0.5639451539396457]
A booming amount of information is continuously added to the Internet as structured and unstructured data, feeding knowledge bases such as DBpedia and Wikidata.
The aim of Question Answering systems is to allow lay users to access such data using natural language without needing to write formal queries.
We show that sequence-to-sequence models are a viable and promising option to transform long utterances into complex SPARQL queries.
arXiv Detail & Related papers (2020-10-21T11:12:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.