Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and
Text-to-Function -- with Real Applications in Traffic Domain
- URL: http://arxiv.org/abs/2310.18752v2
- Date: Tue, 31 Oct 2023 12:51:09 GMT
- Title: Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and
Text-to-Function -- with Real Applications in Traffic Domain
- Authors: Guanghu Sui, Zhishuai Li, Ziyue Li, Sun Yang, Jingqing Ruan, Hangyu
Mao, Rui Zhao
- Abstract summary: Previous state-of-the-art (SOTA) method achieved remarkable execution accuracy on the Spider dataset.
We develop a more adaptable and more general prompting method, involving query rewriting and SQL boosting.
In terms of execution accuracy on the business dataset, the SOTA method scored 21.05, while our approach scored 65.79.
- Score: 14.194710636073808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The previous state-of-the-art (SOTA) method achieved a remarkable execution
accuracy on the Spider dataset, which is one of the largest and most diverse
datasets in the Text-to-SQL domain. However, during our reproduction of the
business dataset, we observed a significant drop in performance. We examined
the differences in dataset complexity, as well as the clarity of questions'
intentions, and assessed how those differences could impact the performance of
prompting methods. Subsequently, we develop a more adaptable and more general
prompting method built mainly on query rewriting and SQL boosting: query
rewriting transforms vague information into exact and precise information,
while SQL boosting refines the SQL itself by incorporating execution feedback
and the query results from the database content. In order to prevent information gaps, we
include the comments, value types, and value samples for columns as part of the
database description in the prompt. Our experiments with Large Language Models
(LLMs) illustrate the significant performance improvement on the business
dataset and prove the substantial potential of our method. In terms of
execution accuracy on the business dataset, the SOTA method scored 21.05, while
our approach scored 65.79. As a result, our approach achieved a notable
performance improvement even when using a less capable pre-trained language
model. Last but not least, we also explore the Text-to-Python and
Text-to-Function options, and we deeply analyze the pros and cons among them,
offering valuable insights to the community.
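As a loose illustration of the method sketched above (not the authors' code), the following Python snippet shows how the two main ingredients might fit together: a database description that lists each column's comment, declared type, and sampled values, and a query-rewriting plus SQL-boosting loop driven by execution feedback from the database. The `llm` callable, the prompt wording, and the helper names are hypothetical, and SQLite stands in for the business database.

```python
import sqlite3
from typing import Callable


def describe_table(conn: sqlite3.Connection, table: str,
                   comments: dict[str, str], n_samples: int = 3) -> str:
    """Build a column-level database description with the comment, declared
    type, and a few sampled values for each column, to close information
    gaps in the prompt (hypothetical helper, not the paper's exact format)."""
    lines = [f"Table {table}:"]
    for _, name, col_type, *_ in conn.execute(f"PRAGMA table_info({table})"):
        rows = conn.execute(
            f"SELECT DISTINCT {name} FROM {table} LIMIT {n_samples}"
        ).fetchall()
        samples = ", ".join(repr(r[0]) for r in rows)
        lines.append(f"  - {name} ({col_type}): {comments.get(name, '')} "
                     f"| sample values: {samples}")
    return "\n".join(lines)


def rewrite_and_boost(llm: Callable[[str], str], conn: sqlite3.Connection,
                      question: str, schema_desc: str,
                      max_rounds: int = 3) -> str:
    """Query rewriting followed by SQL boosting with execution feedback."""
    # Query rewriting: turn vague business wording into exact, precise terms.
    rewritten = llm(
        "Rewrite the question so that vague terms become exact column names "
        f"and values.\n{schema_desc}\nQuestion: {question}"
    )
    sql = llm(f"{schema_desc}\nWrite a SQL query answering: {rewritten}")
    # SQL boosting: feed execution errors or sample results back to the model.
    for _ in range(max_rounds):
        try:
            preview = conn.execute(sql).fetchmany(5)
            feedback = f"The query executed and returned: {preview}"
        except sqlite3.Error as exc:
            feedback = f"The query failed with error: {exc}"
        revised = llm(
            f"{schema_desc}\nQuestion: {rewritten}\nCurrent SQL: {sql}\n"
            f"{feedback}\nReturn an improved SQL query, or the same query "
            "if it is already correct."
        )
        if revised.strip() == sql.strip():
            break
        sql = revised
    return sql
```

In practice, `llm` would wrap a call to a chat model; the loop stops early once the model returns the SQL unchanged.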
Related papers
- Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement [1.392448435105643]
Text-to-SQL enables non-expert users to effortlessly retrieve desired information from databases using natural language queries.
Current state-of-the-art (SOTA) models like GPT4 and T5 have shown impressive performance on large-scale benchmarks like BIRD.
This paper proposes a novel approach that only needs SQL quality measurement to enhance Text-to-SQL performance.
arXiv Detail & Related papers (2024-10-02T17:21:51Z) - FLEX: Expert-level False-Less EXecution Metric for Reliable Text-to-SQL Benchmark [8.445403382578167]
This paper introduces FLEX (False-Less EXecution), a novel approach to evaluating text-to-SQL systems.
Our metric improves agreement with human experts by using comprehensive context and sophisticated criteria.
This work contributes to a more accurate and nuanced evaluation of text-to-SQL systems, potentially reshaping our understanding of state-of-the-art performance in this field.
arXiv Detail & Related papers (2024-09-24T01:40:50Z) - DAC: Decomposed Automation Correction for Text-to-SQL [51.48239006107272]
We introduce Decomposed Automation Correction (DAC), which corrects text-to-SQL by decomposing it into entity linking and skeleton parsing.
We show that our method improves performance by 3.7% on average across Spider, Bird, and KaggleDBQA compared with the baseline method.
arXiv Detail & Related papers (2024-08-16T14:43:15Z) - Improving Generalization in Semantic Parsing by Increasing Natural
Language Variation [67.13483734810852]
In this work, we use data augmentation to enhance the robustness of text-to-SQL semantic parsing.
We leverage the capabilities of large language models to generate more realistic and diverse questions.
Using only a few prompts, we achieve a two-fold increase in the number of questions in Spider.
arXiv Detail & Related papers (2024-02-13T18:48:23Z) - Evaluating the Data Model Robustness of Text-to-SQL Systems Based on Real User Queries [4.141402725050671]
This paper presents the first in-depth evaluation of the data model robustness of Text-to-SQL systems in practice.
It is based on a real-world deployment of FootballDB, a system that was deployed over a 9-month period in the context of the FIFA World Cup 2022.
All of our data is based on real user questions that were asked live to the system. We manually labeled and translated a subset of these questions for three different data models.
arXiv Detail & Related papers (2024-02-13T10:28:57Z) - Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - SPSQL: Step-by-step Parsing Based Framework for Text-to-SQL Generation [13.196264569882777]
The current mainstream end-to-end Text2SQL model is not only difficult to build due to its complex structure and high requirements for training data, but also difficult to adjust due to its massive parameters.
This paper proposes a pipeline method, SPSQL, to achieve the desired result.
We construct the dataset based on the marketing business data of the State Grid Corporation of China.
arXiv Detail & Related papers (2023-05-10T10:01:36Z) - Can LLM Already Serve as A Database Interface? A BIg Bench for
Large-Scale Database Grounded Text-to-SQLs [89.68522473384522]
We present Bird, a big benchmark for large-scale database grounded in text-to-SQL tasks.
Our emphasis on database values highlights the new challenges of dirty database contents.
Even the most effective text-to-SQL model, i.e., ChatGPT, achieves only 40.08% execution accuracy.
arXiv Detail & Related papers (2023-05-04T19:02:29Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural network based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)