Reducing the impact of out of vocabulary words in the translation of
natural language questions into SPARQL queries
- URL: http://arxiv.org/abs/2111.03000v1
- Date: Thu, 4 Nov 2021 16:53:59 GMT
- Title: Reducing the impact of out of vocabulary words in the translation of
natural language questions into SPARQL queries
- Authors: Manuel A. Borroto Santana, Francesco Ricca, Bernardo Cuteri
- Abstract summary: Automatic translation of natural language questions into SPARQL can help users who are unfamiliar with the query language access public knowledge bases.
Existing systems based on neural machine translation are very effective but easily fail to recognize words that are out of the vocabulary (OOV) of the training set.
- Score: 5.97507595130844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accessing the large volumes of information available in public knowledge
bases might be complicated for those users unfamiliar with the SPARQL query
language. Automatic translation of questions posed in natural language in
SPARQL has the potential of overcoming this problem. Existing systems based on
neural-machine translation are very effective but easily fail in recognizing
words that are Out Of the Vocabulary (OOV) of the training set. This is a
serious issue while querying large ontologies. In this paper, we combine Named
Entity Linking, Named Entity Recognition, and Neural Machine Translation to
perform automatic translation of natural language questions into SPARQL
queries. We demonstrate empirically that our approach is more effective and
resilient to OOV words than existing approaches by running the experiments on
Monument, QALD-9, and LC-QuAD v1, which are well-known datasets for Question
Answering over DBpedia.
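As a rough illustration of why combining entity recognition and linking with a translation model can help with OOV words, the sketch below masks entity mentions with placeholders before translation and restores the linked URIs afterwards. It is not the authors' implementation: the gazetteer `ENTITY_LINKS`, the functions `mask_entities`, `toy_translate`, `fill_placeholders`, and `question_to_sparql`, and the single hard-coded template are all hypothetical stand-ins for real NER, NEL, and NMT components.

```python
import re

# Hypothetical gazetteer standing in for real NER + entity-linking components.
ENTITY_LINKS = {
    "Colosseum": "dbr:Colosseum",
    "Eiffel Tower": "dbr:Eiffel_Tower",
}


def mask_entities(question):
    """Replace known entity mentions with numbered placeholders."""
    bindings = {}
    masked = question
    # Try longer mentions first so multi-word names are matched whole.
    for mention in sorted(ENTITY_LINKS, key=len, reverse=True):
        if mention in masked:
            placeholder = f"<ENT{len(bindings)}>"
            masked = masked.replace(mention, placeholder)
            bindings[placeholder] = ENTITY_LINKS[mention]
    return masked, bindings


def toy_translate(masked_question):
    """Stand-in for a trained NMT model: one hard-coded question template."""
    if re.match(r"where is (the )?<ENT0>", masked_question, re.IGNORECASE):
        return "SELECT ?x WHERE { <ENT0> dbo:location ?x }"
    raise ValueError("template not covered by this toy translator")


def fill_placeholders(query_template, bindings):
    """Substitute the linked URIs back into the generated query."""
    for placeholder, uri in bindings.items():
        query_template = query_template.replace(placeholder, uri)
    return query_template


def question_to_sparql(question):
    masked, bindings = mask_entities(question)
    return fill_placeholders(toy_translate(masked), bindings)


if __name__ == "__main__":
    print(question_to_sparql("Where is the Colosseum located?"))
```

Because the translator only ever sees `<ENT0>`, an entity name that never appeared in its training data does not need to be in its vocabulary; only the linking step has to know it.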
Related papers
- MST5 -- Multilingual Question Answering over Knowledge Graphs [1.6470999044938401]
Knowledge Graph Question Answering (KGQA) simplifies querying vast amounts of knowledge stored in a graph-based model using natural language.
Existing multilingual KGQA systems face challenges in achieving performance comparable to English systems.
We propose a simplified approach to enhance multilingual KGQA systems by incorporating linguistic context and entity information directly into the processing pipeline of a language model.
arXiv Detail & Related papers (2024-07-08T15:37:51Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- An In-Context Schema Understanding Method for Knowledge Base Question Answering [70.87993081445127]
Large Language Models (LLMs) have shown strong capabilities in language understanding and can be used to solve this task.
Existing methods bypass this challenge by initially employing LLMs to generate drafts of logic forms without schema-specific details.
We propose a simple In-Context Understanding (ICSU) method that enables LLMs to directly understand schemas by leveraging in-context learning.
arXiv Detail & Related papers (2023-10-22T04:19:17Z)
- The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing [20.734859343886843]
We analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing.
We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset.
arXiv Detail & Related papers (2023-05-24T12:55:04Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
- Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension [61.079852289005025]
Cross-lingual question answering over knowledge base (xKBQA) aims to answer questions in languages different from that of the provided knowledge base.
One of the major challenges facing xKBQA is the high cost of data annotation.
We propose a novel approach for xKBQA in a reading comprehension paradigm.
arXiv Detail & Related papers (2023-02-26T05:52:52Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to the execution results of those parses.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL [18.019353543946913]
This study investigates the task of knowledge-based question generation (KBQG).
Conventional KBQG approaches generate questions from fact triples in the knowledge graph, which cannot express complex SPARQL operations such as aggregation and comparison.
We propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into NL description.
arXiv Detail & Related papers (2022-08-26T06:53:46Z)
- QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers [68.9964449363406]
We extend one of the most popular KGQA benchmarks, QALD-9, by introducing high-quality translations of its questions into 8 languages.
Five of the languages (Armenian, Ukrainian, Lithuanian, Bashkir and Belarusian) have, to the best of our knowledge, never been considered by the KGQA research community before.
arXiv Detail & Related papers (2022-01-31T22:19:55Z)
- SPARQLing Database Queries from Intermediate Question Decompositions [7.475027071883912]
To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations grounded in databases.
Our pipeline consists of two parts: a semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
arXiv Detail & Related papers (2021-09-13T17:57:12Z)
- SPBERT: Pre-training BERT on SPARQL Queries for End-to-end Question Answering over Knowledge Graphs [1.1775939485654976]
SPBERT is a Transformer-based language model pre-trained on massive SPARQL query logs.
We investigate how SPBERT and an encoder-decoder architecture can be adapted for knowledge-based QA corpora.
arXiv Detail & Related papers (2021-06-18T08:39:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.