Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A
Study on Prompt Design Strategies
- URL: http://arxiv.org/abs/2305.12586v1
- Date: Sun, 21 May 2023 22:44:25 GMT
- Title: Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A
Study on Prompt Design Strategies
- Authors: Linyong Nan, Yilun Zhao, Weijin Zou, Narutatsu Ri, Jaesung Tae, Ellen
Zhang, Arman Cohan, Dragomir Radev
- Abstract summary: In-context learning (ICL) has emerged as a new approach to various natural language processing tasks.
In this paper, we aim to extend this method to question answering tasks that utilize structured knowledge sources.
- Score: 20.15851744895469
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In-context learning (ICL) has emerged as a new approach to various natural
language processing tasks, utilizing large language models (LLMs) to make
predictions based on context that has been supplemented with a few examples or
task-specific instructions. In this paper, we aim to extend this method to
question answering tasks that utilize structured knowledge sources, and improve
Text-to-SQL systems by exploring various prompt design strategies for employing
LLMs. We conduct a systematic investigation into different demonstration
selection methods and optimal instruction formats for prompting LLMs in the
Text-to-SQL task. Our approach involves leveraging the syntactic structure of
an example's SQL query to retrieve demonstrations, and we demonstrate that
pursuing both diversity and similarity in demonstration selection leads to
enhanced performance. Furthermore, we show that LLMs benefit from
database-related knowledge augmentations. Our most effective strategy
outperforms the state-of-the-art system by 2.5 points (Execution Accuracy) and
the best fine-tuned system by 5.1 points on the Spider dataset. These results
highlight the effectiveness of our approach in adapting LLMs to the Text-to-SQL
task, and we present an analysis of the factors contributing to the success of
our strategy.
Related papers
- Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL)
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z) - How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? [11.374310255084753]
We introduce a novel supervised MLLM-retriever MSIER that employs a neural network to select examples that enhance multimodal in-context learning efficiency.
This approach is validated through extensive testing across three distinct tasks, demonstrating the method's effectiveness.
This exploration paves the way for future advancements, highlighting the potential for refined in-context learning in MLLMs through the strategic use of multimodal data.
arXiv Detail & Related papers (2024-04-19T13:05:37Z) - Benchmarking the Text-to-SQL Capability of Large Language Models: A
Comprehensive Evaluation [33.41556606816004]
Large Language Models (LLMs) have emerged as a powerful tool in advancing the Text-to- task.
There is still no consensus on the optimal prompt templates and design frameworks.
Existing benchmarks inadequately explore the performance of LLMs across the various sub-tasks of the Text-to- process.
arXiv Detail & Related papers (2024-03-05T13:23:48Z) - Meta-Task Prompting Elicits Embeddings from Large Language Models [54.757445048329735]
We introduce a new unsupervised text embedding method, Meta-Task Prompting with Explicit One-Word Limitation.
We generate high-quality sentence embeddings from Large Language Models without the need for model fine-tuning.
Our findings suggest a new scaling law, offering a versatile and resource-efficient approach for embedding generation across diverse scenarios.
arXiv Detail & Related papers (2024-02-28T16:35:52Z) - Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs)
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain,
and Cross-domain Settings [12.288808992805494]
Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to- task.
Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance their performance.
arXiv Detail & Related papers (2023-05-19T17:43:58Z) - ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for
Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z) - CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented
Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.