A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?
- URL: http://arxiv.org/abs/2408.05109v3
- Date: Wed, 04 Dec 2024 04:57:04 GMT
- Title: A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?
- Authors: Xinyu Liu, Shuyu Shen, Boyan Li, Peixian Ma, Runzhi Jiang, Yuxin Zhang, Ju Fan, Guoliang Li, Nan Tang, Yuyu Luo
- Abstract summary: We provide a review of NL2SQL techniques powered by Large Language Models (LLMs). We discuss the research challenges and open problems of NL2SQL in the LLMs era.
- Score: 32.84561352339466
- Abstract: Translating users' natural language queries (NL) into SQL queries (i.e., NL2SQL, a.k.a., Text-to-SQL) can significantly reduce barriers to accessing relational databases and support various commercial applications. The performance of NL2SQL has been greatly enhanced with the emergence of Large Language Models (LLMs). In this survey, we provide a comprehensive review of NL2SQL techniques powered by LLMs, covering its entire lifecycle from the following four aspects: (1) Model: NL2SQL translation techniques that tackle not only NL ambiguity and under-specification, but also properly map NL with database schema and instances; (2) Data: From the collection of training data, data synthesis due to training data scarcity, to NL2SQL benchmarks; (3) Evaluation: Evaluating NL2SQL methods from multiple angles using different metrics and granularities; and (4) Error Analysis: analyzing NL2SQL errors to find the root cause and guiding NL2SQL models to evolve. Moreover, we provide a rule of thumb for developing NL2SQL solutions. Finally, we discuss the research challenges and open problems of NL2SQL in the LLMs era.
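To make the evaluation aspect concrete, below is a minimal sketch of execution accuracy (EX), one of the standard NL2SQL metrics such surveys cover: a predicted query counts as correct when its result set matches the gold query's on the same database. The helper names and the order-insensitive multiset comparison are illustrative assumptions, not the survey's reference implementation.

```python
import sqlite3
from collections import Counter

def execution_match(db_path: str, pred_sql: str, gold_sql: str) -> bool:
    """True iff the predicted and gold queries yield the same multiset of rows."""
    conn = sqlite3.connect(db_path)
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
        gold_rows = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return False  # unexecutable predictions count as wrong
    finally:
        conn.close()
    # Compare as multisets: row order is usually irrelevant unless the
    # gold query contains ORDER BY.
    return Counter(pred_rows) == Counter(gold_rows)

def execution_accuracy(db_path: str, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (predicted, gold) SQL pairs whose executions agree."""
    hits = sum(execution_match(db_path, pred, gold) for pred, gold in pairs)
    return hits / len(pairs) if pairs else 0.0
```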
Related papers
- Grounding Natural Language to SQL Translation with Data-Based Self-Explanations [7.4643285253289475]
CycleSQL is a framework designed for end-to-end translation models to autonomously generate the best output through self-evaluation.
The main idea is to introduce data-grounded NL explanations as self-provided feedback, and to use this feedback to validate the correctness of the translation.
The results show that the feedback loop introduced in CycleSQL can consistently improve the performance of existing models; in particular, applying CycleSQL to RESDSQL obtains a translation accuracy of 82.0% (+2.6%) on the validation set and 81.6% (+3.2%) on the test set of the benchmark.
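A minimal sketch of the feedback loop this entry describes, under assumptions: `translate`, `explain_with_data`, and `is_consistent` are hypothetical stand-ins for the paper's actual components, and the retry policy is illustrative.

```python
from typing import Callable

def cycle_translate(question: str,
                    translate: Callable[[str, int], str],
                    explain_with_data: Callable[[str], str],
                    is_consistent: Callable[[str, str], bool],
                    max_rounds: int = 3) -> str:
    """Accept a translation only once its data-grounded NL explanation
    validates against the original question."""
    sql = ""
    for attempt in range(max_rounds):
        sql = translate(question, attempt)        # candidate SQL translation
        explanation = explain_with_data(sql)      # explanation grounded in execution results
        if is_consistent(question, explanation):  # self-provided feedback
            return sql                            # validated output
    return sql                                    # fall back to the last candidate
```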
arXiv Detail & Related papers (2024-11-05T09:44:53Z)
- RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, a binary selection strategy, and multi-turn self-correction.
Experiments on the BIRD and Spider benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on Spider using GPT-4o.
Our approach outperforms a series of GPT-4-based Text-to-SQL systems when adopting DeepSeek (much cheaper) with the same intact prompts.
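Of the four components, multi-turn self-correction is the easiest to illustrate. Below is a minimal sketch, assuming an SQLite database and a hypothetical `llm` completion callable; the prompt wording is an assumption, not the paper's template.

```python
import sqlite3
from typing import Callable

def self_correct(db_path: str, question: str, sql: str,
                 llm: Callable[[str], str], max_turns: int = 3) -> str:
    """Iteratively repair a candidate query using execution errors as feedback."""
    conn = sqlite3.connect(db_path)
    try:
        for _ in range(max_turns):
            try:
                conn.execute(sql)  # probe executability only
                return sql         # executable: stop correcting
            except sqlite3.Error as err:
                prompt = (f"Question: {question}\nSQL: {sql}\n"
                          f"Execution error: {err}\n"
                          "Rewrite the SQL to fix the error:")
                sql = llm(prompt)  # ask the model for a revised query
        return sql
    finally:
        conn.close()
```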
arXiv Detail & Related papers (2024-10-31T16:22:26Z)
- PURPLE: Making a Large Language Model a Better SQL Writer [14.627323505405327]
We propose PURPLE, which improves accuracy by retrieving demonstrations containing the requisite logical operator composition for the NL2SQL task.
PURPLE achieves a new state-of-the-art performance of 80.5% exact-set match accuracy and 87.8% execution match accuracy on the validation set of the popular NL2SQL benchmark Spider.
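A minimal sketch of the retrieval idea described above: characterize each stored demonstration by the logical-operator composition of its SQL and return the demonstrations that best match the operators the target question is predicted to need. The operator inventory, the Jaccard score, and `predicted_ops` are illustrative assumptions.

```python
import re

# Illustrative operator inventory; the paper's taxonomy is richer.
OPERATORS = ["JOIN", "GROUP BY", "ORDER BY", "HAVING",
             "UNION", "INTERSECT", "EXCEPT", "LIMIT", "DISTINCT"]

def operator_profile(sql: str) -> frozenset[str]:
    """The set of logical operators appearing in a SQL query."""
    upper = sql.upper()
    ops = {op for op in OPERATORS if op in upper}
    if re.search(r"\(\s*SELECT", upper):
        ops.add("SUBQUERY")
    return frozenset(ops)

def retrieve(demos: list[tuple[str, str]], predicted_ops: frozenset[str],
             k: int = 4) -> list[tuple[str, str]]:
    """Return the k (question, SQL) demos whose operator profile overlaps most."""
    def score(demo: tuple[str, str]) -> float:
        ops = operator_profile(demo[1])
        union = ops | predicted_ops
        return len(ops & predicted_ops) / len(union) if union else 0.0
    return sorted(demos, key=score, reverse=True)[:k]
```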
arXiv Detail & Related papers (2024-03-29T07:01:29Z)
- Data Transformation to Construct a Dataset for Generating Entity-Relationship Model from Natural Language [39.53954130028595]
To reduce the manual cost of building ER models, recent approaches have been proposed to address the task of NL2ERM.
These approaches are typically rule-based, relying on rigid heuristics.
Despite generalizing better than rule-based approaches, deep-learning-based models are lacking for NL2ERM due to the absence of a large-scale dataset.
arXiv Detail & Related papers (2023-12-21T09:45:13Z)
- Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation [23.519727682763644]
Zero-shot NL2SQL is crucial for achieving natural language to SQL translation that is adaptive to new environments.
Existing approaches either fine-tune pre-trained language models (PLMs) on annotated data or use prompts to guide fixed large language models (LLMs) such as ChatGPT.
We propose a ZeroNL2SQL framework that combines the complementary advantages of PLMs and LLMs to support zero-shot NL2SQL.
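A minimal sketch of one plausible division of labor between the two model families, assuming a fine-tuned PLM that proposes a structural SQL sketch and an LLM that grounds it in the schema; both callables and the prompt are hypothetical placeholders rather than the paper's actual design.

```python
from typing import Callable

def zero_shot_nl2sql(question: str, schema: str,
                     plm_sketch: Callable[[str, str], str],
                     llm_complete: Callable[[str], str]) -> str:
    """PLM proposes the SQL structure; LLM fills in schema-specific details."""
    sketch = plm_sketch(question, schema)  # e.g. "SELECT _ FROM _ WHERE _ > _"
    prompt = (f"Schema: {schema}\n"
              f"Question: {question}\n"
              f"Complete this SQL sketch: {sketch}\n"
              "SQL:")
    return llm_complete(prompt)            # LLM grounds the sketch in the schema
```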
arXiv Detail & Related papers (2023-06-15T06:50:51Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error filtering.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of fine-tuned LLMs.
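A minimal sketch of consistency decoding with execution-based error filtering as summarized above: sample several candidates, discard those that fail to execute, and return a candidate whose result set is most common among the survivors. `sample_sql` is a hypothetical sampler, and the voting scheme is an assumption in the spirit of self-consistency.

```python
import sqlite3
from collections import Counter
from typing import Callable, Optional

def consistency_decode(db_path: str, question: str,
                       sample_sql: Callable[[str], str],
                       n: int = 8) -> Optional[str]:
    """Sample candidates, drop unexecutable ones, vote on execution results."""
    survivors: list[tuple[str, tuple]] = []
    conn = sqlite3.connect(db_path)
    try:
        for _ in range(n):
            sql = sample_sql(question)  # stochastic candidate generation
            try:
                rows = tuple(sorted(map(tuple, conn.execute(sql).fetchall())))
            except sqlite3.Error:
                continue                # execution-based error filtering
            survivors.append((sql, rows))
    finally:
        conn.close()
    if not survivors:
        return None
    # Majority vote over result sets, not over SQL strings.
    majority, _ = Counter(rows for _, rows in survivors).most_common(1)[0]
    return next(sql for sql, rows in survivors if rows == majority)
```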
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL Evaluation (UNITE).
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing [70.40401197026925]
In-context learning using large language models has recently shown surprising results for semantic parsing tasks.
This work introduces the XRICL framework, which learns to retrieve relevant English exemplars for a given query.
We also include global translation exemplars for a target language to facilitate the translation process for large language models.
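A minimal sketch of the prompt assembly this entry suggests: translation exemplars for the target language are concatenated with retrieved English (question, SQL) exemplars ahead of the test question. `retrieve_exemplars` and the prompt layout are illustrative assumptions.

```python
from typing import Callable

def build_xricl_prompt(question: str,
                       retrieve_exemplars: Callable[[str, int], list[tuple[str, str]]],
                       translation_exemplars: list[tuple[str, str]],
                       k: int = 4) -> str:
    """Assemble an in-context prompt for a non-English question."""
    parts: list[str] = []
    # Global translation exemplars: (target-language question, English question).
    for src, en in translation_exemplars:
        parts.append(f"Translate: {src} => {en}")
    # Retrieved English (question, SQL) exemplars relevant to the query.
    for q, sql in retrieve_exemplars(question, k):
        parts.append(f"Question: {q}\nSQL: {sql}")
    parts.append(f"Question: {question}\nSQL:")
    return "\n\n".join(parts)
```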
arXiv Detail & Related papers (2022-10-25T01:33:49Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
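A toy sketch of the synthesis direction described above, assuming a drastically simplified three-operator QDMR (select, filter, aggregate); real QDMR has a richer operator set and cross-step references.

```python
def qdmr_to_sql(steps: list[tuple[str, str]], table: str) -> str:
    """Compile a flat (operator, argument) QDMR step list into one SQL query."""
    column, where, agg = "*", [], None
    for op, arg in steps:
        if op == "select":        # e.g. ("select", "id")
            column = arg
        elif op == "filter":      # e.g. ("filter", "age > 30")
            where.append(arg)
        elif op == "aggregate":   # e.g. ("aggregate", "count")
            agg = arg
        else:
            raise ValueError(f"unsupported QDMR operator: {op}")
    expr = f"{agg.upper()}({column})" if agg else column
    sql = f"SELECT {expr} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql

# "How many employees are older than 30?"
# qdmr_to_sql([("select", "id"), ("filter", "age > 30"), ("aggregate", "count")],
#             "employees")
# -> "SELECT COUNT(id) FROM employees WHERE age > 30"
```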
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Relation Aware Semi-autoregressive Semantic Parsing for NL2SQL [17.605904256822786]
We present a Relation aware Semi-autoregressive Semantic Parsing (MODN) framework, which is more adaptable as an NL2SQL backbone.
Empirical results and a case study show our model's effectiveness in learning better word representations for NL2SQL.
arXiv Detail & Related papers (2021-08-02T12:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content and is not responsible for any consequences of its use.