Weakly Supervised Text-to-SQL Parsing through Question Decomposition
- URL: http://arxiv.org/abs/2112.06311v4
- Date: Fri, 2 Aug 2024 14:21:43 GMT
- Title: Weakly Supervised Text-to-SQL Parsing through Question Decomposition
- Authors: Tomer Wolfson, Daniel Deutch, Jonathan Berant
- Abstract summary: We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL data.
- Score: 53.22128541030441
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-SQL parsers are crucial in enabling non-experts to effortlessly query relational data. Training such parsers, by contrast, generally requires expertise in annotating natural language (NL) utterances with corresponding SQL queries. In this work, we propose a weak supervision approach for training text-to-SQL parsers. We take advantage of the recently proposed question meaning representation called QDMR, an intermediate between NL and formal query languages. Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries that are used to train text-to-SQL models. We test our approach by experimenting on five benchmark datasets. Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL data. Overall, we effectively train text-to-SQL parsers, while using zero SQL annotations.
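To make the weak-supervision recipe concrete, below is a minimal sketch (not the authors' implementation) of its two ingredients: composing a candidate SQL query from a QDMR step sequence, and using the gold answer, rather than a gold SQL query, to accept or discard the candidate. The schema, the example question, and the `synthesize_sql` helper are all hypothetical.

```python
import sqlite3

# Toy database for illustration; schema, values, and question are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flights (id INTEGER, origin TEXT);
INSERT INTO flights VALUES (1, 'Denver'), (2, 'Denver'), (3, 'Boston');
""")

# A QDMR decomposition of "How many flights depart from Denver?":
# each step is an operator plus arguments; '#n' refers to step n's result.
qdmr = [
    ("SELECT", "flights"),
    ("FILTER", "#1", "origin = 'Denver'"),
    ("AGGREGATE", "count", "#2"),
]

def synthesize_sql(steps):
    """Compose a SQL query from QDMR steps. A naive sketch: the paper's
    synthesis procedure covers many more operators and step graphs."""
    table, conditions, agg = None, [], None
    for step in steps:
        if step[0] == "SELECT":
            table = step[1]
        elif step[0] == "FILTER":
            conditions.append(step[2])
        elif step[0] == "AGGREGATE":
            agg = step[1]
    select = f"{agg}(*)" if agg else "*"
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    return f"SELECT {select} FROM {table}{where}"

candidate = synthesize_sql(qdmr)
# -> SELECT count(*) FROM flights WHERE origin = 'Denver'

# Weak supervision: the gold answer (not a gold SQL query) validates the
# candidate; queries that do not execute to the answer would be discarded.
gold_answer = 2
result = conn.execute(candidate).fetchone()[0]
assert result == gold_answer
print(candidate)
```

The validated query-question pairs can then serve as ordinary training data for any text-to-SQL model, which is how the approach sidesteps SQL annotation entirely.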
Related papers
- SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation [16.07396492960869]
We introduce a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks.
Our model predicts SQL queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers.
arXiv Detail & Related papers (2023-10-27T00:13:59Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL Evaluation.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- Error Detection for Text-to-SQL Semantic Parsing [18.068244400731366]
Modern text-to-SQL semantic parsers are often over-confident, casting doubt on their trustworthiness when deployed for real use.
We propose a parser-independent error detection model for text-to-SQL semantic parsing.
arXiv Detail & Related papers (2023-05-23T04:44:22Z)
- Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
- STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing [64.80483736666123]
We propose a novel pre-training framework STAR for context-dependent text-to-SQL parsing.
In addition, we construct a large-scale context-dependent text-to-SQL conversation corpus to pre-train STAR.
Extensive experiments show that STAR achieves new state-of-the-art performance on two downstream benchmarks.
arXiv Detail & Related papers (2022-10-21T11:30:07Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases.
Deep neural networks have significantly advanced this task via neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- CQR-SQL: Conversational Question Reformulation Enhanced Context-Dependent Text-to-SQL Parsers [35.36754559708944]
Context-dependent text-to-SQL is the task of translating multi-turn questions into database-related SQL queries.
In this paper, we propose CQR-SQL, which uses auxiliary Conversational Question Reformulation (CQR) learning to explicitly exploit and decouple contextual dependency for SQL parsing.
At the time of writing, our CQR-SQL achieves new state-of-the-art results on two context-dependent benchmarks, SParC and CoSQL.
arXiv Detail & Related papers (2022-05-16T13:52:42Z)
- S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers [66.78665327694625]
We propose S$^2$SQL, injecting syntax into the question-schema interaction graph encoder for text-to-SQL parsing.
We also employ a decoupling constraint to induce diverse edge embeddings, which further improves the network's performance.
Experiments on Spider and the robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used.
arXiv Detail & Related papers (2022-03-14T09:49:15Z)
- Natural SQL: Making SQL Easier to Infer from Natural Language Specifications [15.047104267689052]
We propose an SQL intermediate representation called Natural SQL (NatSQL).
On Spider, a challenging text-to-SQL benchmark, we demonstrate that NatSQL outperforms other IRs and significantly improves the performance of several previous SOTA models.
For existing models that do not support executable SQL generation, NatSQL easily enables them to generate executable queries and achieves new state-of-the-art execution accuracy; a toy sketch of the IR idea follows this entry.
arXiv Detail & Related papers (2021-09-11T01:53:55Z)
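The following toy sketch illustrates the general idea behind an intermediate representation like NatSQL: the model generates a simplified query without clauses that are hard to predict (here, GROUP BY/HAVING), and a deterministic post-processor reconstructs the executable SQL. The rewrite rule, the regex, and the example query are hypothetical and far cruder than NatSQL's actual grammar.

```python
import re

def expand_to_sql(ir_query: str) -> str:
    """Toy post-processor: when an aggregate condition appears in the WHERE
    clause of the simplified query, reconstruct it as GROUP BY + HAVING on
    the selected column. Real IRs use a grammar, not a regex; this only
    demonstrates the shape of the simplification."""
    match = re.match(
        r"SELECT (\w+) FROM (\w+) WHERE (count\(\*\).+)", ir_query, re.IGNORECASE
    )
    if not match:
        return ir_query  # nothing to reconstruct
    col, table, agg_cond = match.groups()
    return f"SELECT {col} FROM {table} GROUP BY {col} HAVING {agg_cond}"

# Simplified form a parser would generate (hypothetical example):
ir = "SELECT dept FROM employees WHERE count(*) > 4"
print(expand_to_sql(ir))
# -> SELECT dept FROM employees GROUP BY dept HAVING count(*) > 4
```

Because the reconstruction is deterministic, the parser only has to learn the simplified form, which is what makes such IRs easier to infer from natural language.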