SPARQLing Database Queries from Intermediate Question Decompositions
- URL: http://arxiv.org/abs/2109.06162v1
- Date: Mon, 13 Sep 2021 17:57:12 GMT
- Title: SPARQLing Database Queries from Intermediate Question Decompositions
- Authors: Irina Saparina, Anton Osokin
- Abstract summary: To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations grounded in databases.
Our pipeline consists of two parts: a neural semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
- Score: 7.475027071883912
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To translate natural language questions into executable database queries,
most approaches rely on a fully annotated training set. Annotating a large
dataset with queries is difficult as it requires query-language expertise. We
reduce this burden using intermediate question representations grounded in
databases. These representations are simpler to collect and were
originally crowdsourced within the Break dataset (Wolfson et al., 2020). Our
pipeline consists of two parts: a neural semantic parser that converts natural
language questions into the intermediate representations and a non-trainable
transpiler to the SPARQL query language (a standard language for accessing
knowledge graphs and the semantic web). We chose SPARQL because its queries are
structurally closer to our intermediate representations (compared to SQL). We
observe that the execution accuracy of queries constructed by our model on the
challenging Spider dataset is comparable with the state-of-the-art text-to-SQL
methods trained with annotated SQL queries. Our code and data are publicly
available (see https://github.com/yandex-research/sparqling-queries).
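
To make the second stage of the pipeline concrete, here is a minimal sketch of a rule-based (non-trainable) transpiler from decomposition steps to SPARQL. It is not the authors' implementation: the `Step` class, the three operators, the `:Singer`, `:age`, and `:name` names, and the `http://example.org/` prefix are all hypothetical and chosen only for illustration. The example decomposes "What are the names of singers older than 30?" into four steps, each of which may refer to earlier steps.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One step of a toy, already-grounded question decomposition (hypothetical format)."""
    op: str                     # "select", "project", or "filter"
    arg: str                    # class, property, or comparison such as "> 30"
    refs: Tuple[int, ...] = ()  # indices of the earlier steps this step uses


def transpile(steps: List[Step]) -> str:
    """Turn decomposition steps into one SPARQL query using fixed rules."""
    patterns: List[str] = []
    out_var: List[str] = []  # variable that holds the result of each step
    for i, step in enumerate(steps):
        if step.op == "select":
            # All entities of a class.
            var = f"?v{i}"
            patterns.append(f"{var} a {step.arg} .")
        elif step.op == "project":
            # A property of the entities produced by an earlier step.
            var = f"?v{i}"
            patterns.append(f"{out_var[step.refs[0]]} {step.arg} {var} .")
        elif step.op == "filter":
            # Keep the entities of one step whose value from another step passes the test.
            entity, value = step.refs
            var = out_var[entity]
            patterns.append(f"FILTER({out_var[value]} {step.arg})")
        else:
            raise ValueError(f"unsupported operator: {step.op}")
        out_var.append(var)
    body = "\n  ".join(patterns)
    # The hypothetical prefix makes the short names above resolvable.
    return (
        "PREFIX : <http://example.org/>\n"
        f"SELECT {out_var[-1]} WHERE {{\n  {body}\n}}"
    )


steps = [
    Step("select", ":Singer"),        # 1. singers
    Step("project", ":age", (0,)),    # 2. ages of #1
    Step("filter", "> 30", (0, 1)),   # 3. #1 where #2 is higher than 30
    Step("project", ":name", (2,)),   # 4. names of #3
]
query = transpile(steps)
print(query)
```

Each step contributes one triple pattern or `FILTER` to a single graph pattern, which illustrates the abstract's point that SPARQL queries are structurally close to step-by-step decompositions.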
Related papers
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries.
To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
- Semantic Decomposition of Question and SQL for Text-to-SQL Parsing [2.684900573255764]
We propose a new modular Query Plan Language (QPL) that systematically decomposes SQL queries into simple and regular sub-queries.
Experimental results demonstrate that text-to-QPL parsing is more effective than text-to-SQL parsing for semantically equivalent queries.
arXiv Detail & Related papers (2023-10-20T15:13:34Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to execution results thereof.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/EdinburghNLP/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) query based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Reducing the impact of out of vocabulary words in the translation of natural language questions into SPARQL queries [5.97507595130844]
Automatic translation of questions posed in natural language into SPARQL has the potential of overcoming this problem.
Existing systems based on neural-machine translation are very effective but easily fail in recognizing words that are Out Of The Vocabulary (OOV) of the training set.
arXiv Detail & Related papers (2021-11-04T16:53:59Z)
- Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
- ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries [10.273545005890496]
We introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL).
ColloQL achieves 84.9% (logical) and 90.7% (execution) accuracy on the WikiSQL dataset.
arXiv Detail & Related papers (2020-10-19T23:53:17Z)