Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with
Sample-aware Prompting and Dynamic Revision Chain
- URL: http://arxiv.org/abs/2307.05074v2
- Date: Mon, 4 Sep 2023 08:10:03 GMT
- Title: Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with
Sample-aware Prompting and Dynamic Revision Chain
- Authors: Chunxi Guo, Zhiliang Tian, Jintao Tang, Shasha Li, Zhihua Wen, Kaixuan
Wang and Ting Wang
- Abstract summary: We propose a Text-to-aware prompting framework, involving a sample and a dynamic revision chain.
Our approach incorporates sample demonstrations and fine-grained information related to the given question.
To generate executable and accuratesqls without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback.
- Score: 21.593701177605652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-SQL aims at generating SQL queries for the given natural language
questions and thus helping users to query databases. Prompt learning with large
language models (LLMs) has emerged as a recent approach, which designs prompts
to lead LLMs to understand the input question and generate the corresponding
SQL. However, it faces challenges with strict SQL syntax requirements. Existing
work prompts the LLMs with a list of demonstration examples (i.e. question-SQL
pairs) to generate SQL, but the fixed prompts can hardly handle the scenario
where the semantic gap between the retrieved demonstration and the input
question is large. In this paper, we propose a retrieval-augmented prompting
method for a LLM-based Text-to-SQL framework, involving sample-aware prompting
and a dynamic revision chain. Our approach incorporates sample-aware
demonstrations, which include the composition of SQL operators and fine-grained
information related to the given question. To retrieve questions sharing
similar intents with input questions, we propose two strategies for assisting
retrieval. Firstly, we leverage LLMs to simplify the original questions,
unifying the syntax and thereby clarifying the users' intentions. To generate
executable and accurate SQLs without human intervention, we design a dynamic
revision chain which iteratively adapts fine-grained feedback from the
previously generated SQL. Experimental results on three Text-to-SQL benchmarks
demonstrate the superiority of our method over strong baseline models.
Related papers
- RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved the performance of text-to- task.
We propose RB-, a novel retrieval-based framework for in-context prompt engineering.
Experiment results demonstrate that our model achieves better performance than several competitive baselines on public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z) - CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions [22.493487741249716]
Large Language Models (LLMs) have been demonstrated to possess impressive capabilities in a variety of domains and tasks.
We investigate the issue of prompt design in the multi-turn text-to- task and attempt to enhance the LLMs' reasoning capacity.
arXiv Detail & Related papers (2024-05-04T16:56:14Z) - PET-SQL: A Prompt-Enhanced Two-Round Refinement of Text-to-SQL with Cross-consistency [19.067737007347613]
Methods achieve new SOTA results on the Spider benchmark, with an execution accuracy of 87.6%.
Our methods achieve new SOTA results on the Spider benchmark, with an execution accuracy of 87.6%.
arXiv Detail & Related papers (2024-03-13T02:32:41Z) - Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM [15.888784472807775]
Existing methods rely on the comprehensive capability of large language models (LLMs) to generate queries.
We propose the Knowledge-to- Data Expert framework, which employs tailored knowledge for all text-to- models.
arXiv Detail & Related papers (2024-02-18T09:10:04Z) - SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data [54.69489315952524]
"Prompt" is designed to improve the few-shot prompting capabilities of Text-to-LLMs.
"Prompt" outperforms previous approaches for in-context learning with few labeled data by a large margin.
We show that emphPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin.
arXiv Detail & Related papers (2023-11-06T05:24:06Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs)
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-domain systems.
It is composed of publicly available text-to-domain datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z) - Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton
Retrieval [17.747079214502673]
Text-to- is a task that converts a natural language question into a structured query language () to retrieve information from a database.
In this paper, we propose an LLM-based framework for Text-to- which retrieves helpful demonstration examples to prompt LLMs.
We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
arXiv Detail & Related papers (2023-04-26T06:02:01Z) - Divide and Prompt: Chain of Thought Prompting for Text-to-SQL [0.03807314298073299]
Chain-of-thought (CoT) prompting combined with large language models (LLMs) have achieved encouraging results on complex reasoning tasks.
We propose Divide-and-Prompt, which first divides the task into subtasks, and then approach each subtask through CoT.
arXiv Detail & Related papers (2023-04-23T06:52:35Z) - A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future
Directions [102.8606542189429]
The goal of text-to-corpora parsing is to convert a natural language (NL) question to its corresponding structured query language () based on the evidences provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z) - Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesizesql queries.
Our results show that the weakly supervised models perform competitively with those trained on NL- benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.