Related papers: Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval

Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval

URL: http://arxiv.org/abs/2304.13301v2
Date: Thu, 31 Aug 2023 15:24:36 GMT
Title: Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval
Authors: Chunxi Guo, Zhiliang Tian, Jintao Tang, Pancheng Wang, Zhihua Wen, Kang Yang and Ting Wang
Abstract summary: Text-to- is a task that converts a natural language question into a structured query language () to retrieve information from a database. In this paper, we propose an LLM-based framework for Text-to- which retrieves helpful demonstration examples to prompt LLMs. We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
Score: 17.747079214502673
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-SQL is a task that converts a natural language question into a structured query language (SQL) to retrieve information from a database. Large language models (LLMs) work well in natural language generation tasks, but they are not specifically pre-trained to understand the syntax and semantics of SQL commands. In this paper, we propose an LLM-based framework for Text-to-SQL which retrieves helpful demonstration examples to prompt LLMs. However, questions with different database schemes can vary widely, even if the intentions behind them are similar and the corresponding SQL queries exhibit similarities. Consequently, it becomes crucial to identify the appropriate SQL demonstrations that align with our requirements. We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity. We also model the relationships between question tokens and database schema items (i.e., tables and columns) to filter out scheme-related information. Our framework adapts the range of the database schema in prompts to balance length and valuable information. A fallback mechanism allows for a more detailed schema to be provided if the generated SQL query fails. Ours outperforms state-of-the-art models and demonstrates strong generalization ability on three cross-domain Text-to-SQL benchmarks.

Related papers

V-SQL: A View-based Two-stage Text-to-SQL Framework [0.9719868595277401]
Text-to-coupling methods based on large language models (LLMs) have garnered significant attention. The core of mainstream text-to-coupling frameworks is schema linking, which aligns user queries with relevant tables and columns in the database. Previous methods focused on schema linking while to enhance LLMs' understanding of database schema.
arXiv Detail & Related papers (2024-12-17T02:27:50Z)
SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy [24.919119901664843]
This paper introduces a robust system integrating open-source Large Language Models (LLMs) with a suite of tools to enhance query accuracy and usability. demonstrated by its leading performance on the Spider Leaderboard and deployment by Ant Group.
arXiv Detail & Related papers (2024-07-19T06:01:57Z)
Schema-Aware Multi-Task Learning for Complex Text-to-SQL [4.913409359995421]
We present a schema-aware multi-task learning framework (named MT) for complicatedsql queries. Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings. On the decoder side, we define 6-type relationships to describe the connections between tables and columns.
arXiv Detail & Related papers (2024-03-09T01:13:37Z)
Structure Guided Large Language Model for SQL Generation [15.419227635308674]
We propose a structure-to- framework, which leverages the inherent structure information. SGU- links user queries and databases in a structure-enhanced manner. It then decomposes complicated structures with grammar trees to guide the LLM to generate the step by step.
arXiv Detail & Related papers (2024-02-19T09:07:59Z)
SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation [16.07396492960869]
We introduce a novel Transformer architecture specifically crafted to perform text-to-gressive translation tasks. Our model predicts queries as abstract syntax trees (ASTs) in an autore way, incorporating structural inductive bias in the executable and decoder layers.
arXiv Detail & Related papers (2023-10-27T00:13:59Z)
Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain [21.593701177605652]
We propose a Text-to-aware prompting framework, involving a sample and a dynamic revision chain. Our approach incorporates sample demonstrations and fine-grained information related to the given question. To generate executable and accuratesqls without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback.
arXiv Detail & Related papers (2023-07-11T07:16:22Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs) With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-domain systems. It is composed of publicly available text-to-domain datasets and 29K databases. Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to- parsing in stages. We show that our framework is effective in all scenarios and state-of-the-art performance on the Spider, SParC, and Co. datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-corpora parsing is to convert a natural language (NL) question to its corresponding structured query language () based on the evidences provided by databases. Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR. Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesizesql queries. Our results show that the weakly supervised models perform competitively with those trained on NL- benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of world's knowledge is stored in structured databases. query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.