Fine-Tuning Language Models for Context-Specific SQL Query Generation
- URL: http://arxiv.org/abs/2312.02251v1
- Date: Mon, 4 Dec 2023 18:04:27 GMT
- Title: Fine-Tuning Language Models for Context-Specific SQL Query Generation
- Authors: Amine Rebei
- Abstract summary: This paper presents a novel approach to fine-tuning open-source large language models (LLMs) for the task of transforming natural language into SQL queries.
We introduce models specialized in generating SQL queries, trained on synthetic datasets tailored to the Snowflake SQL and GoogleSQL dialects.
Our methodology involves generating a context-specific dataset using GPT-4, then fine-tuning three open-source LLMs (Starcoder Plus, Code-Llama, and Mistral) employing the LoRA technique to optimize for resource constraints.
The fine-tuned models demonstrate superior performance in zero-shot settings compared to the baseline GPT-4.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The ability to generate SQL queries from natural language has significant
implications for making data accessible to non-specialists. This paper presents
a novel approach to fine-tuning open-source large language models (LLMs) for
the task of transforming natural language into SQL queries within the retail
domain. We introduce models specialized in generating SQL queries, trained on
synthetic datasets tailored to the Snowflake SQL and GoogleSQL dialects. Our
methodology involves generating a context-specific dataset using GPT-4, then
fine-tuning three open-source LLMs (Starcoder Plus, Code-Llama, and Mistral)
employing the LoRA technique to optimize for resource constraints. The
fine-tuned models demonstrate superior performance in zero-shot settings
compared to the baseline GPT-4, with Code-Llama achieving the highest accuracy
rates, at 81.58% for Snowflake SQL and 82.66% for GoogleSQL. These results
underscore the effectiveness of fine-tuning LLMs on domain-specific tasks and
suggest a promising direction for enhancing the accessibility of relational
databases through natural language interfaces.
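To make the methodology concrete, the following is a minimal sketch of LoRA fine-tuning for text-to-SQL, assuming the Hugging Face transformers, peft, and datasets libraries; the base checkpoint, the JSONL dataset file, and all hyperparameters are illustrative stand-ins rather than the paper's reported configuration.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

# Illustrative base model; the paper fine-tunes Starcoder Plus,
# Code-Llama, and Mistral.
model_name = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA trains small low-rank adapter matrices instead of all weights,
# which is what keeps fine-tuning within tight resource constraints.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights

def tokenize(example):
    # Hypothetical record layout: a natural-language question paired
    # with its target SQL query.
    text = f"-- Question: {example['question']}\n{example['sql']}"
    out = tokenizer(text, truncation=True, padding="max_length", max_length=512)
    out["labels"] = out["input_ids"].copy()
    return out

# Hypothetical GPT-4-generated synthetic dataset in JSONL form.
dataset = load_dataset("json", data_files="synthetic_sql_pairs.jsonl")["train"]
dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-sql",
                           per_device_train_batch_size=4,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=dataset,
)
trainer.train()
```

Only the small adapter weights are trained and saved; they can later be merged back into the base model, which is the main practical appeal of LoRA under GPU memory limits.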
Related papers
- Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement [1.392448435105643]
Text-to-SQL enables non-expert users to effortlessly retrieve desired information from databases using natural language queries.
Current state-of-the-art (SOTA) models like GPT-4 and T5 have shown impressive performance on large-scale benchmarks like BIRD.
This paper proposes a novel approach that needs only SQL quality measurement to enhance Text-to-SQL performance.
arXiv Detail & Related papers (2024-10-02T17:21:51Z) - DataGpt-SQL-7B: An Open-Source Language Model for Text-to-SQL [7.76068876576964]
We propose a suite of compact, fine-tuned models and self-refine mechanisms to democratize data access and analysis for non-expert users.
Our system, DataGpt-SQL, achieved 87.2% accuracy on the Spider-dev benchmark.
arXiv Detail & Related papers (2024-09-24T11:38:08Z) - SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging [30.306023265985658]
We introduce a framework for generating high-quality synthetic training data for any dialect.
We propose a novel Mixture-of-Experts (MoE) that leverages the shared knowledge across dialects.
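As a rough illustration of the synthetic-data step, the sketch below prompts an LLM for dialect-specific question/SQL pairs over a given schema; the prompt wording, the OpenAI client, and the synthesize_pairs helper are assumptions for illustration, not the paper's actual pipeline.

```python
import json
from openai import OpenAI  # assumed generator backend; any LLM client works

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = """You write training data for text-to-SQL models.
Target dialect: {dialect}
Schema: {schema}
Return a JSON list of {n} objects with keys "question" and "sql".
Use only syntax that is legal in the target dialect."""

def synthesize_pairs(schema, dialect="Snowflake SQL", n=5):
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": PROMPT.format(dialect=dialect,
                                            schema=schema, n=n)}],
    )
    # A real pipeline would validate here: parse each query and execute
    # it against a sample database before keeping the pair.
    return json.loads(resp.choices[0].message.content)

pairs = synthesize_pairs("orders(id INT, customer TEXT, total FLOAT)")
```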
arXiv Detail & Related papers (2024-08-22T20:50:48Z) - Synthesizing Text-to-SQL Data from Weak and Strong LLMs [68.69270834311259]
The capability gap between open-source and closed-source large language models (LLMs) remains a challenge in text-to-SQL tasks.
We introduce a synthetic data approach that combines data produced by larger, more powerful models with error-information data generated by smaller, weakly aligned models.
arXiv Detail & Related papers (2024-08-06T15:40:32Z) - Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
However, they can only incorporate new knowledge through training or supervised fine-tuning processes.
Much of the precise, up-to-date, and private information they need is instead stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z) - SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy [24.919119901664843]
This paper introduces a robust system integrating open-source Large Language Models (LLMs) with a suite of tools to enhance query accuracy and usability,
as demonstrated by its leading performance on the Spider Leaderboard and deployment by Ant Group.
arXiv Detail & Related papers (2024-07-19T06:01:57Z) - RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved the performance of text-to-SQL tasks.
We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering.
Experiment results demonstrate that our model achieves better performance than several competitive baselines on public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z) - Optimizing LLM Queries in Relational Workloads [58.254894049950366]
We show how to optimize Large Language Model (LLM) inference for analytical workloads that invoke LLMs within relational queries.
We implement these optimizations in Apache Spark, with vLLM as the model serving backend.
We achieve up to 4.4x improvement in end-to-end latency on a benchmark of diverse LLM-based queries on real datasets.
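For intuition about what is being optimized, here is a minimal sketch of the naive pattern: a Spark UDF that calls an LLM once per row through a vLLM OpenAI-compatible endpoint. The endpoint URL, model name, and prompt are assumptions; the paper's system adds optimizations such as batching and caching on top of this per-row loop.

```python
import requests
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("llm-in-sql").getOrCreate()

def classify_review(text):
    # Call a locally served model (assumed vLLM OpenAI-compatible server).
    resp = requests.post(
        "http://localhost:8000/v1/completions",  # illustrative endpoint
        json={"model": "mistral-7b",  # hypothetical served model name
              "prompt": f"Sentiment (positive/negative): {text}\nAnswer:",
              "max_tokens": 3},
    )
    return resp.json()["choices"][0]["text"].strip()

llm_udf = udf(classify_review, StringType())
reviews = spark.createDataFrame(
    [("r1", "Great product!"), ("r2", "Terrible support.")],
    ["id", "review"],
)
# Each row triggers one LLM call; batching, deduplication, and cache
# reuse target exactly this per-row cost.
reviews.withColumn("sentiment", llm_udf("review")).show()
```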
arXiv Detail & Related papers (2024-03-09T07:01:44Z) - SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data [54.69489315952524]
"Prompt" is designed to improve the few-shot prompting capabilities of Text-to-LLMs.
"Prompt" outperforms previous approaches for in-context learning with few labeled data by a large margin.
We show that emphPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin.
arXiv Detail & Related papers (2023-11-06T05:24:06Z) - Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation [23.519727682763644]
Zero-shot NL2SQL is crucial for achieving natural language to SQL generation that is adaptive to new environments.
Existing approaches either fine-tune pretrained language models (PLMs) based on data or use prompts to guide fixed large language models (LLMs) such as ChatGPT.
We propose ZeroNL2SQL, a framework that combines the complementary advantages of PLMs and LLMs to support zero-shot NL2SQL.
arXiv Detail & Related papers (2023-06-15T06:50:51Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
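For intuition, here is a minimal sketch of execution-based consistency decoding for text-to-SQL: sample several candidate queries, execute them, and keep a query whose result wins a majority vote. The sample_candidate_queries helper is a hypothetical stand-in for temperature-based sampling from an LLM, and the sketch shows the general idea rather than SQL-PaLM's exact procedure.

```python
import sqlite3
from collections import Counter

def sample_candidate_queries(question, n=5):
    # Hypothetical stand-in: in practice, sample n SQL candidates from
    # the LLM with temperature > 0. Hard-coded to keep the sketch
    # self-contained.
    return ["SELECT COUNT(*) FROM users WHERE active = 1"] * 3 + \
           ["SELECT COUNT(active) FROM users WHERE active = 1"] * 2

def consistency_decode(question, db_path):
    outcomes = []  # (execution result, query) pairs for valid candidates
    for sql in sample_candidate_queries(question):
        try:
            with sqlite3.connect(db_path) as conn:
                rows = conn.execute(sql).fetchall()
            outcomes.append((repr(rows), sql))
        except sqlite3.Error:
            continue  # execution errors prune invalid candidates
    if not outcomes:
        return None
    # Majority vote over execution results, then return a query that
    # produced the winning result.
    winner = Counter(result for result, _ in outcomes).most_common(1)[0][0]
    return next(sql for result, sql in outcomes if result == winner)
```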
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.