UNITE: A Unified Benchmark for Text-to-SQL Evaluation
- URL: http://arxiv.org/abs/2305.16265v3
- Date: Fri, 14 Jul 2023 15:56:31 GMT
- Title: UNITE: A Unified Benchmark for Text-to-SQL Evaluation
- Authors: Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang
Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen
Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng and
Bing Xiang
- Abstract summary: We introduce UNITE, a UNIfied benchmark for Text-to-SQL Evaluation.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
- Score: 72.72040379293718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A practical text-to-SQL system should generalize well on a wide variety of
natural language questions, unseen database schemas, and novel SQL query
structures. To comprehensively evaluate text-to-SQL systems, we introduce a
UNIfied benchmark for Text-to-SQL Evaluation (UNITE). It is composed of
publicly available text-to-SQL datasets, containing natural language questions
from more than 12 domains, SQL queries from more than 3.9K patterns, and 29K
databases. Compared to the widely used Spider benchmark, we introduce
~120K additional examples and a threefold increase in SQL patterns, such
as comparative and boolean questions. We conduct a systematic study of six
state-of-the-art (SOTA) text-to-SQL parsers on our new benchmark and show that:
1) Codex performs surprisingly well on out-of-domain datasets; 2) specially
designed decoding methods (e.g. constrained beam search) can improve
performance for both in-domain and out-of-domain settings; 3) explicitly
modeling the relationship between questions and schemas further improves the
Seq2Seq models. More importantly, our benchmark presents key challenges towards
compositional generalization and robustness issues -- which these SOTA models
cannot address well. Our code and data processing script are available at
https://github.com/awslabs/unified-text2sql-benchmark
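
To make the notion of counting "SQL patterns" concrete, here is a minimal Python sketch. It is not the UNITE authors' procedure: the abstract does not say how patterns are defined, so this example simply assumes a pattern is a query with its literal values masked and its case and whitespace normalized. The names `sql_pattern` and `count_patterns` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a "pattern" is a query with literals masked. The paper's own
# definition may differ (for example, it may also mask table/column names).
_STRING = re.compile(r"'(?:[^']|'')*'|\"(?:[^\"]|\"\")*\"")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def sql_pattern(query: str) -> str:
    """Mask quoted strings and numbers, then normalize case and whitespace."""
    masked = _STRING.sub("<str>", query)
    masked = _NUMBER.sub("<num>", masked)
    return " ".join(masked.lower().split())


def count_patterns(queries):
    """Count how many queries fall under each coarse pattern."""
    return Counter(sql_pattern(q) for q in queries)


if __name__ == "__main__":
    examples = [
        "SELECT name FROM singer WHERE age > 30",
        "select name  from singer where age > 45",
        "SELECT * FROM t WHERE city = 'New York'",
    ]
    for pattern, n in count_patterns(examples).items():
        print(n, pattern)
```

Under this assumption, the first two example queries collapse into one pattern because they differ only in a numeric literal and in formatting, while the third forms a pattern of its own.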
Related papers
- SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation [16.07396492960869]
We introduce a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks.
Our model predicts SQL queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers.
arXiv Detail & Related papers (2023-10-27T00:13:59Z)
- Benchmarking and Improving Text-to-SQL Generation under Ambiguity [25.283118418288293]
We develop a novel benchmark called AmbiQT where each text is interpretable as two plausible SQLs due to lexical and/or structural ambiguity.
We propose LogicalBeam, a new decoding algorithm that navigates the SQL logic space using a blend of plan-based template generation and constrained infilling.
arXiv Detail & Related papers (2023-10-20T17:00:53Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs [89.68522473384522]
We present Bird, a big benchmark for large-scale database grounded in text-to-SQL tasks.
Our emphasis on database values highlights the new challenges of dirty database contents.
Even the most effective text-to-SQL model, i.e. ChatGPT, achieves only 40.08% in execution accuracy.
arXiv Detail & Related papers (2023-05-04T19:02:29Z)
- Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval [17.747079214502673]
Text-to-SQL is a task that converts a natural language question into a structured query language (SQL) query to retrieve information from a database.
In this paper, we propose an LLM-based framework for Text-to-SQL, which retrieves helpful demonstration examples to prompt LLMs.
We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
arXiv Detail & Related papers (2023-04-26T06:02:01Z)
- Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) query based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing [40.65143087243074]
This paper presents a simple yet effective data augmentation framework.
First, given a database, we automatically produce a large number of SQL queries based on an abstract syntax tree grammar (TranX).
Second, we propose a hierarchical SQL-to-question generation model to obtain high-quality natural language questions.
arXiv Detail & Related papers (2021-03-03T07:37:38Z)