Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A
Study on Prompt Design Strategies
- URL: http://arxiv.org/abs/2305.12586v1
- Date: Sun, 21 May 2023 22:44:25 GMT
- Title: Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A
Study on Prompt Design Strategies
- Authors: Linyong Nan, Yilun Zhao, Weijin Zou, Narutatsu Ri, Jaesung Tae, Ellen
Zhang, Arman Cohan, Dragomir Radev
- Abstract summary: In-context learning (ICL) has emerged as a new approach to various natural language processing tasks.
In this paper, we aim to extend this method to question answering tasks that utilize structured knowledge sources.
- Score: 20.15851744895469
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In-context learning (ICL) has emerged as a new approach to various natural
language processing tasks, utilizing large language models (LLMs) to make
predictions based on context that has been supplemented with a few examples or
task-specific instructions. In this paper, we aim to extend this method to
question answering tasks that utilize structured knowledge sources, and improve
Text-to-SQL systems by exploring various prompt design strategies for employing
LLMs. We conduct a systematic investigation into different demonstration
selection methods and optimal instruction formats for prompting LLMs in the
Text-to-SQL task. Our approach involves leveraging the syntactic structure of
an example's SQL query to retrieve demonstrations, and we demonstrate that
pursuing both diversity and similarity in demonstration selection leads to
enhanced performance. Furthermore, we show that LLMs benefit from
database-related knowledge augmentations. Our most effective strategy
outperforms the state-of-the-art system by 2.5 points (Execution Accuracy) and
the best fine-tuned system by 5.1 points on the Spider dataset. These results
highlight the effectiveness of our approach in adapting LLMs to the Text-to-SQL
task, and we present an analysis of the factors contributing to the success of
our strategy.
Related papers
- Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors to achieve success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z) - QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tune a small pretrained language model to generate optimal prompts tailored to the input queries.
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z) - Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language models (LLMs)
This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z) - Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving planning capabilities of large language models (LLMs)
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - Benchmarking the Text-to-SQL Capability of Large Language Models: A
Comprehensive Evaluation [33.41556606816004]
Large Language Models (LLMs) have emerged as a powerful tool in advancing the Text-to- task.
There is still no consensus on the optimal prompt templates and design frameworks.
Existing benchmarks inadequately explore the performance of LLMs across the various sub-tasks of the Text-to- process.
arXiv Detail & Related papers (2024-03-05T13:23:48Z) - Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs)
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain,
and Cross-domain Settings [12.288808992805494]
Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to- task.
Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance their performance.
arXiv Detail & Related papers (2023-05-19T17:43:58Z) - ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for
Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.