Semantic Decomposition of Question and SQL for Text-to-SQL Parsing
- URL: http://arxiv.org/abs/2310.13575v1
- Date: Fri, 20 Oct 2023 15:13:34 GMT
- Title: Semantic Decomposition of Question and SQL for Text-to-SQL Parsing
- Authors: Ben Eyal, Amir Bachar, Ophir Haroche, Moran Mahabi, Michael Elhadad
- Abstract summary: We propose a new modular Query Plan Language (QPL) that systematically decomposes SQL queries into simple and regular sub-queries.
Experimental results demonstrate that training text-to-QPL parsers is more effective than text-to-SQL parsing for semantically equivalent queries.
- Score: 2.684900573255764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-SQL semantic parsing faces challenges in generalizing to cross-domain
and complex queries. Recent research has employed a question decomposition
strategy to enhance the parsing of complex SQL queries. However, this strategy
encounters two major obstacles: (1) existing datasets lack question
decomposition; (2) due to the syntactic complexity of SQL, most complex queries
cannot be disentangled into sub-queries that can be readily recomposed. To
address these challenges, we propose a new modular Query Plan Language (QPL)
that systematically decomposes SQL queries into simple and regular sub-queries.
We develop a translator from SQL to QPL by leveraging analysis of SQL server
query optimization plans, and we augment the Spider dataset with QPL programs.
Experimental results demonstrate that the modular nature of QPL benefits
existing semantic-parsing architectures, and training text-to-QPL parsers is
more effective than text-to-SQL parsing for semantically equivalent queries.
The QPL approach offers two additional advantages: (1) QPL programs can be
paraphrased as simple questions, which allows us to create a dataset of
(complex question, decomposed questions). Training on this dataset, we obtain a
Question Decomposer for data retrieval that is sensitive to database schemas.
(2) QPL is more accessible to non-experts for complex queries, leading to more
interpretable output from the semantic parser.
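The idea of decomposing a nested query into simple, regular steps can be illustrated with a minimal sketch. This is not QPL syntax and not the authors' translator: the toy schema, the data, and the `step1`..`step5` names are all hypothetical, and each step is written as an ordinary SQLite view that reads only base tables or earlier steps.

```python
import sqlite3

# Hypothetical toy schema and data (not taken from the paper or from Spider).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE concert (id INTEGER PRIMARY KEY, singer_id INTEGER, year INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 'FR'), (2, 'Bo', 'US'), (3, 'Cy', 'US');
INSERT INTO concert VALUES (1, 1, 2014), (2, 2, 2014), (3, 2, 2015), (4, 3, 2013);
""")

# A single nested SQL query: names of singers with more than one concert.
nested = conn.execute("""
    SELECT name FROM singer
    WHERE id IN (SELECT singer_id FROM concert
                 GROUP BY singer_id HAVING COUNT(*) > 1)
""").fetchall()

# The same computation as a flat sequence of simple steps.
# Each step performs one operation (scan, aggregate, filter, or join).
steps = [
    ("step1", "SELECT singer_id FROM concert"),
    ("step2", "SELECT singer_id, COUNT(*) AS n FROM step1 GROUP BY singer_id"),
    ("step3", "SELECT singer_id FROM step2 WHERE n > 1"),
    ("step4", "SELECT id, name FROM singer"),
    ("step5", "SELECT step4.name FROM step4 "
              "JOIN step3 ON step4.id = step3.singer_id"),
]
for view_name, sql in steps:
    conn.execute(f"CREATE TEMP VIEW {view_name} AS {sql}")

decomposed = conn.execute("SELECT name FROM step5").fetchall()

# Both formulations return the same rows on this toy database.
print(sorted(nested) == sorted(decomposed))
```

Because every step is small and named, each one could in principle be described by a simple question, which is the property the abstract relies on for building (complex question, decomposed questions) pairs.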
Related papers
- Bridging the Gap: Transforming Natural Language Questions into SQL Queries via Abstract Query Pattern and Contextual Schema Markup [6.249316460506702]
We identify two important gaps: the structural mapping gap and the lexical mapping gap.
The approach sets a new state of the art on the Spider benchmark with an execution accuracy of 87.9%, and achieves leading results on the BIRD dataset with an execution accuracy of 64.67%.
arXiv Detail & Related papers (2025-02-20T16:11:27Z)
- E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL [1.187832944550453]
We introduce E-SQL, a novel pipeline specifically designed to address these challenges through direct schema linking and candidate predicate augmentation.
E-SQL enhances the natural language query by incorporating relevant database items (i.e., tables, columns, and values) and conditions directly into the question and SQL construction plan, bridging the gap between the query and the database structure.
Comprehensive evaluations illustrate that E-SQL achieves competitive performance, particularly excelling in complex queries with a 66.29% execution accuracy on the test set.
arXiv Detail & Related papers (2024-09-25T09:02:48Z)
- Schema-Aware Multi-Task Learning for Complex Text-to-SQL [4.913409359995421]
We present a schema-aware multi-task learning framework (named MTSQL) for complicated SQL queries.
Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings.
On the decoder side, we define 6-type relationships to describe the connections between tables and columns.
arXiv Detail & Related papers (2024-03-09T01:13:37Z)
- Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases [2.933060994339853]
We investigate the potential of an alternative query language with simpler syntax and modular specification of complex queries.
The proposed alternative query language is called Query Plan Language (QPL).
We present ways to address the challenge of complex queries in an iterative, user-controlled manner.
arXiv Detail & Related papers (2023-12-22T16:16:15Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deeply into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL), and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- SPARQLing Database Queries from Intermediate Question Decompositions [7.475027071883912]
To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations grounded in databases.
Our pipeline consists of two parts: a semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
arXiv Detail & Related papers (2021-09-13T17:57:12Z)
- Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.