Related papers: SPARQLing Database Queries from Intermediate Question Decompositions

SPARQLing Database Queries from Intermediate Question Decompositions

URL: http://arxiv.org/abs/2109.06162v1
Date: Mon, 13 Sep 2021 17:57:12 GMT
Title: SPARQLing Database Queries from Intermediate Question Decompositions
Authors: Irina Saparina, Anton Osokin
Abstract summary: To translate natural language questions into database queries, most approaches rely on a fully annotated training set. We reduce this burden using grounded in databases intermediate question representations. Our pipeline consists of two parts: a semantic that converts natural language questions into the intermediate representations and a non-trainable transpiler to the QLSPAR query language.
Score: 7.475027071883912
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: To translate natural language questions into executable database queries, most approaches rely on a fully annotated training set. Annotating a large dataset with queries is difficult as it requires query-language expertise. We reduce this burden using grounded in databases intermediate question representations. These representations are simpler to collect and were originally crowdsourced within the Break dataset (Wolfson et al., 2020). Our pipeline consists of two parts: a neural semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language (a standard language for accessing knowledge graphs and semantic web). We chose SPARQL because its queries are structurally closer to our intermediate representations (compared to SQL). We observe that the execution accuracy of queries constructed by our model on the challenging Spider dataset is comparable with the state-of-the-art text-to-SQL methods trained with annotated SQL queries. Our code and data are publicly available (see https://github.com/yandex-research/sparqling-queries).

Related papers

UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries. To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
Semantic Decomposition of Question and SQL for Text-to-SQL Parsing [2.684900573255764]
We propose a new modular Query Plan Language (QPL) that systematically decomposessql queries into simple and regular sub-queries. Experimental results demonstrate that QPL is more effective than text-to-QPL for semantically equivalent queries.
arXiv Detail & Related papers (2023-10-20T15:13:34Z)
Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with Sparql parses and system answers correspond to execution results thereof. We present two different semantic parsing approaches and highlight the challenges of the task. Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding. Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural entity linker (NSP)
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-corpora parsing is to convert a natural language (NL) question to its corresponding structured query language () based on the evidences provided by databases. Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR. Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesizesql queries. Our results show that the weakly supervised models perform competitively with those trained on NL- benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
Reducing the impact of out of vocabulary words in the translation of natural language questions into SPARQL queries [5.97507595130844]
Automatic translation of questions posed in natural language in SPARQL has the potential of overcoming this problem. Existing systems based on neural-machine translation are very effective but easily fail in recognizing words that are Out Of The Vocabulary (OOV) of the training set.
arXiv Detail & Related papers (2021-11-04T16:53:59Z)
Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of world's knowledge is stored in structured databases. query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries [10.273545005890496]
We introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL) ColloQL achieves 84.9% (execution) and 90.7% (execution) accuracy on the Wikilogical dataset.
arXiv Detail & Related papers (2020-10-19T23:53:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.