PURPLE: Making a Large Language Model a Better SQL Writer
- URL: http://arxiv.org/abs/2403.20014v1
- Date: Fri, 29 Mar 2024 07:01:29 GMT
- Title: PURPLE: Making a Large Language Model a Better SQL Writer
- Authors: Tonghui Ren, Yuankai Fan, Zhenying He, Ren Huang, Jiaqi Dai, Can Huang, Yinan Jing, Kai Zhang, Yifan Yang, X. Sean Wang
- Abstract summary: We propose PURPLE, which improves accuracy by retrieving demonstrations containing the requisite logical operator composition for the NL2SQL task.
PURPLE achieves a new state-of-the-art performance of 80.5% exact-set match accuracy and 87.8% execution match accuracy on the validation set of the popular NL2SQL benchmark Spider.
- Score: 14.627323505405327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Model (LLM) techniques play an increasingly important role in Natural Language to SQL (NL2SQL) translation. LLMs trained on extensive corpora have strong natural language understanding and basic SQL generation abilities without additional tuning specific to NL2SQL tasks. Existing LLM-based NL2SQL approaches try to improve the translation by enhancing the LLMs with an emphasis on user intention understanding. However, LLMs sometimes fail to generate appropriate SQL due to their lack of knowledge in organizing complex logical operator composition. A promising method is to provide the LLMs with demonstrations, which include known NL2SQL translations from various databases. LLMs can learn to organize operator compositions from the input demonstrations for the given task. In this paper, we propose PURPLE (Pre-trained models Utilized to Retrieve Prompts for Logical Enhancement), which improves accuracy by retrieving demonstrations containing the requisite logical operator composition for the NL2SQL task at hand, thereby guiding LLMs to produce better SQL translation. PURPLE achieves a new state-of-the-art performance of 80.5% exact-set match accuracy and 87.8% execution match accuracy on the validation set of the popular NL2SQL benchmark Spider. PURPLE maintains high accuracy across diverse benchmarks, budgetary constraints, and various LLMs, showing robustness and cost-effectiveness.
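The abstract describes retrieving demonstrations whose SQL contains the operator composition needed for the task at hand. The following is a minimal sketch of that general idea, not PURPLE's actual pipeline, which this page does not detail. Everything here is an assumption made for illustration: the keyword list `SQL_OPERATORS`, the helper names, the use of `difflib.SequenceMatcher` as the similarity measure, and the premise that a target operator skeleton is already available (for example, guessed by some upstream model).

```python
import re
from difflib import SequenceMatcher

# Hypothetical keyword list approximating "logical operators" in a query.
SQL_OPERATORS = [
    "SELECT", "FROM", "JOIN", "WHERE", "GROUP BY", "HAVING", "ORDER BY",
    "LIMIT", "UNION", "INTERSECT", "EXCEPT", "NOT IN", "IN", "EXISTS",
    "COUNT", "SUM", "AVG", "MAX", "MIN", "DISTINCT",
]

# Longer keywords first so "NOT IN" is preferred over "IN" at the same position.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(
        op.replace(" ", r"\s+")
        for op in sorted(SQL_OPERATORS, key=len, reverse=True)
    )
    + r")\b"
)


def operator_skeleton(sql: str) -> list[str]:
    """Return the ordered sequence of operator keywords found in a SQL string."""
    return [re.sub(r"\s+", " ", m.group(0)) for m in _PATTERN.finditer(sql.upper())]


def skeleton_similarity(a: list[str], b: list[str]) -> float:
    """Similarity in [0, 1] between two operator sequences (illustrative choice)."""
    return SequenceMatcher(None, a, b).ratio()


def retrieve_demonstrations(target: list[str], pool: list[dict], k: int = 2) -> list[dict]:
    """Pick the k demonstrations whose SQL skeleton is closest to the target."""
    return sorted(
        pool,
        key=lambda d: skeleton_similarity(target, operator_skeleton(d["sql"])),
        reverse=True,
    )[:k]


def build_prompt(question: str, demos: list[dict]) -> str:
    """Assemble a few-shot prompt: retrieved demonstrations, then the new question."""
    parts = [f"Question: {d['question']}\nSQL: {d['sql']}" for d in demos]
    parts.append(f"Question: {question}\nSQL:")
    return "\n\n".join(parts)


# Toy demonstration pool (made-up examples, not taken from any benchmark).
pool = [
    {"question": "Who is the youngest singer?",
     "sql": "SELECT name FROM singer ORDER BY age LIMIT 1"},
    {"question": "Which students have no pets?",
     "sql": "SELECT name FROM student WHERE id NOT IN (SELECT student_id FROM pet)"},
]
target = ["SELECT", "FROM", "WHERE", "NOT IN", "SELECT", "FROM"]
demos = retrieve_demonstrations(target, pool, k=1)
prompt = build_prompt("Which authors have no papers?", demos)
```

In this toy run the nested `NOT IN` demonstration is ranked first because its operator sequence matches the target exactly. A real system would need a far more careful notion of operator composition than a flat keyword sequence.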
Related papers
- Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
They can only incorporate new knowledge through training or supervised fine-tuning processes.
This precise, up-to-date, and private information is typically stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z)
- MindMerger: Efficient Boosting LLM Reasoning in non-English Languages [26.334092384176518]
Reasoning capabilities are crucial for Large Language Models (LLMs).
We propose MindMerger, which merges LLMs with the external language understanding capabilities from multilingual models.
MindMerger consistently outperforms all baselines, especially in low-resource languages.
arXiv Detail & Related papers (2024-05-27T17:41:54Z)
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on the question translation data (i.e. without annotated answers) are able to encourage the alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- PET-SQL: A Prompt-Enhanced Two-Round Refinement of Text-to-SQL with Cross-consistency [19.067737007347613]
Our methods achieve new SOTA results on the Spider benchmark, with an execution accuracy of 87.6%.
arXiv Detail & Related papers (2024-03-13T02:32:41Z)
- PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion [96.47420221442397]
We construct adversarial user instructions by attacking user instructions at sentence, semantic, and multi-language levels.
We test 3 closed-source and 4 open-source LLMs using a benchmark that incorporates robustness settings.
We find that GPT-4 exhibits the highest performance and strong robustness in our benchmark.
arXiv Detail & Related papers (2024-03-06T15:33:32Z)
- Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM [15.888784472807775]
Existing methods rely on the comprehensive capability of large language models (LLMs) to generate queries.
We propose the Knowledge-to-SQL Data Expert framework, which employs tailored knowledge for all text-to-SQL models.
arXiv Detail & Related papers (2024-02-18T09:10:04Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z)
- Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation [23.519727682763644]
Zero-shot NL2SQL is crucial in achieving natural language to SQL translation that is adaptive to new environments.
Existing approaches either fine-tune pretrained language models (PLMs) based on data or use prompts to guide fixed large language models (LLMs) such as ChatGPT.
We propose a ZeroNL2SQL framework that combines the complementary advantages of PLMs and LLMs for supporting zero-shot NL2SQL.
arXiv Detail & Related papers (2023-06-15T06:50:51Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL filtering using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)