Related papers: Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain

URL: http://arxiv.org/abs/2307.05074v2
Date: Mon, 4 Sep 2023 08:10:03 GMT
Title: Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain
Authors: Chunxi Guo, Zhiliang Tian, Jintao Tang, Shasha Li, Zhihua Wen, Kaixuan Wang and Ting Wang
Abstract summary: We propose a retrieval-augmented Text-to-SQL prompting framework involving sample-aware prompting and a dynamic revision chain. Our approach incorporates sample-aware demonstrations and fine-grained information related to the given question. To generate executable and accurate SQL without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback.
Score: 21.593701177605652
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-SQL aims at generating SQL queries for the given natural language questions and thus helping users to query databases. Prompt learning with large language models (LLMs) has emerged as a recent approach, which designs prompts to lead LLMs to understand the input question and generate the corresponding SQL. However, it faces challenges with strict SQL syntax requirements. Existing work prompts the LLMs with a list of demonstration examples (i.e. question-SQL pairs) to generate SQL, but the fixed prompts can hardly handle the scenario where the semantic gap between the retrieved demonstration and the input question is large. In this paper, we propose a retrieval-augmented prompting method for an LLM-based Text-to-SQL framework, involving sample-aware prompting and a dynamic revision chain. Our approach incorporates sample-aware demonstrations, which include the composition of SQL operators and fine-grained information related to the given question. To retrieve questions sharing similar intents with input questions, we propose two strategies for assisting retrieval. Firstly, we leverage LLMs to simplify the original questions, unifying the syntax and thereby clarifying the users' intentions. To generate executable and accurate SQLs without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback from the previously generated SQL. Experimental results on three Text-to-SQL benchmarks demonstrate the superiority of our method over strong baseline models.
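Illustrative sketch: the dynamic revision chain described in the abstract can be pictured as a loop that executes the generated SQL, passes the execution feedback back to the generator, and stops once the query runs and no longer changes. The code below is a minimal sketch under stated assumptions, not the authors' implementation: the names `execute_sql`, `revision_chain`, and `toy_generate`, the SQLite backend, and the stopping rule are all hypothetical, and `toy_generate` is a hard-coded stand-in for an LLM call.

```python
import sqlite3
from typing import Callable, Optional, Tuple

# Hypothetical generator signature: (question, previous_sql, feedback) -> sql
Generator = Callable[[str, Optional[str], Optional[str]], str]


def execute_sql(conn: sqlite3.Connection, sql: str) -> Tuple[bool, str]:
    """Run the query and return (succeeded, feedback text for the generator)."""
    try:
        rows = conn.execute(sql).fetchall()
        return True, f"rows: {rows[:3]}"
    except sqlite3.Error as exc:
        return False, f"error: {exc}"


def revision_chain(question: str, conn: sqlite3.Connection,
                   generate: Generator, max_rounds: int = 3) -> str:
    """Revise the SQL using execution feedback (assumed stopping rule:
    the query executes and the generator returns it unchanged)."""
    sql = generate(question, None, None)
    for _ in range(max_rounds):
        ok, feedback = execute_sql(conn, sql)
        revised = generate(question, sql, feedback)
        if ok and revised == sql:
            break
        sql = revised
    return sql


def toy_generate(question: str, prev_sql: Optional[str],
                 feedback: Optional[str]) -> str:
    """Stand-in for an LLM: the first draft has a typo, fixed after an error."""
    if prev_sql is None:
        return "SELECT nme FROM singer"
    if feedback is not None and feedback.startswith("error"):
        return "SELECT name FROM singer"
    return prev_sql


# Tiny in-memory database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT)")
conn.execute("INSERT INTO singer VALUES ('Adele')")

final_sql = revision_chain("List all singer names.", conn, toy_generate)
print(final_sql)
```

In a real system, `toy_generate` would be replaced by a prompt to the LLM that includes the retrieved demonstrations, the previous SQL, and the feedback text.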

Related papers

Weaver: Interweaving SQL and LLM for Table Reasoning [63.09519234853953]
Weaver generates a flexible, step-by-step plan that combines SQL for structured data retrieval with LLMs for semantic processing. Weaver consistently outperforms state-of-the-art methods across four TableQA datasets, reducing both API calls and error rates.
arXiv Detail & Related papers (2025-05-25T03:27:37Z)
RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved the performance of the text-to-SQL task. We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z)
CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions [22.493487741249716]
Large Language Models (LLMs) have been demonstrated to possess impressive capabilities in a variety of domains and tasks. We investigate the issue of prompt design in the multi-turn text-to-SQL task and attempt to enhance the LLMs' reasoning capacity.
arXiv Detail & Related papers (2024-05-04T16:56:14Z)
PET-SQL: A Prompt-Enhanced Two-Round Refinement of Text-to-SQL with Cross-consistency [19.067737007347613]
Our methods achieve new SOTA results on the Spider benchmark, with an execution accuracy of 87.6%.
arXiv Detail & Related papers (2024-03-13T02:32:41Z)
Structure Guided Large Language Model for SQL Generation [14.079764882536077]
We propose a novel structure-aware text-to-SQL framework (SGU-SQL). SGU-SQL consistently outperforms state-of-the-art text-to-SQL models.
arXiv Detail & Related papers (2024-02-19T09:07:59Z)
Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM [15.888784472807775]
Existing methods rely on the comprehensive capability of large language models (LLMs) to generate queries. We propose the Knowledge-to-SQL Data Expert framework, which employs tailored knowledge for all text-to-SQL models.
arXiv Detail & Related papers (2024-02-18T09:10:04Z)
SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data [54.69489315952524]
SQLPrompt is designed to improve the few-shot prompting capabilities of LLMs for Text-to-SQL. We show that SQLPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin.
arXiv Detail & Related papers (2023-11-06T05:24:06Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs). With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems. It is composed of publicly available text-to-SQL datasets and 29K databases. Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval [17.747079214502673]
Text-to-SQL is a task that converts a natural language question into a structured query language (SQL) to retrieve information from a database. In this paper, we propose an LLM-based framework for Text-to-SQL which retrieves helpful demonstration examples to prompt LLMs. We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
arXiv Detail & Related papers (2023-04-26T06:02:01Z)
Divide and Prompt: Chain of Thought Prompting for Text-to-SQL [0.03807314298073299]
Chain-of-thought (CoT) prompting combined with large language models (LLMs) has achieved encouraging results on complex reasoning tasks. We propose Divide-and-Prompt, which first divides the task into subtasks and then approaches each subtask through CoT.
arXiv Detail & Related papers (2023-04-23T06:52:35Z)
A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases. Deep neural networks have significantly advanced this task through neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR. Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries. Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.