LR-SQL: A Supervised Fine-Tuning Method for Text2SQL Tasks under Low-Resource Scenarios
- URL: http://arxiv.org/abs/2410.11457v1
- Date: Tue, 15 Oct 2024 10:02:55 GMT
- Title: LR-SQL: A Supervised Fine-Tuning Method for Text2SQL Tasks under Low-Resource Scenarios
- Authors: Wen Wuzhenghong, Zhang Yongpan, Pan Su, Sun Yuwei, Lu Pengwei, Ding Cheng
- Abstract summary: Large language models revolutionize Text2SQL through supervised fine-tuning.
Yet a crucial limitation is overlooked: the complexity of databases leads to an increased context length.
We propose LR-SQL, which reduces total GPU memory usage by 40% compared to existing fine-tuning methods.
- Score: 1.4387218083918762
- Abstract: Large language models revolutionize Text2SQL through supervised fine-tuning, yet a crucial limitation is overlooked: the complexity of databases leads to an increased context length, consequently resulting in higher GPU memory demands for model fine-tuning. To address this issue, we propose LR-SQL. LR-SQL comprises two supervised fine-tuning models: the schema_link model and the SQL_generation model, with the schema_link model serving as the focal point for streamlining the overall process. During the fine-tuning of the schema_link model, LR-SQL breaks down the complete database into flexible combinations of tables with adjustable quantities, enabling the model to learn the relationships within the entire database from these dispersed slices. Furthermore, to enhance the model's ability to perceive the relationships among various discrete slices during inference, LR-SQL trains the model's Chain-of-Thought capability for this task. Experimental results demonstrate that LR-SQL can reduce total GPU memory usage by 40% compared to existing fine-tuning methods, while losing only 2% of table prediction accuracy on the schema_link task. For the overall Text2SQL task, Execution Accuracy decreases by only 0.6%. Our project is available at https://github.com/hongWin/LR-SQL
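As a rough sketch of the slicing idea described above (the function, its parameters, and the sampling scheme are illustrative assumptions, not the released code), schema_link training examples could be built from small table combinations like this:

```python
import random

def build_schema_slices(tables, gold_tables, slice_size=4, n_samples=8):
    """Split a full database schema into smaller table combinations
    ("slices") so each fine-tuning example fits a short context.

    tables      : list of all table names in the database
    gold_tables : tables actually referenced by the target SQL
    slice_size  : max tables per slice (adjustable, per the paper's idea)
    """
    slices = []
    others = [t for t in tables if t not in gold_tables]
    for _ in range(n_samples):
        # Each slice mixes some gold tables with random distractors,
        # so the model learns relevance judgments on partial views.
        k_gold = random.randint(1, min(len(gold_tables), slice_size))
        chosen = random.sample(gold_tables, k_gold)
        chosen += random.sample(others, min(slice_size - k_gold, len(others)))
        random.shuffle(chosen)
        # Label: which tables in this slice are actually relevant.
        labels = [t for t in chosen if t in gold_tables]
        slices.append({"input_tables": chosen, "relevant": labels})
    return slices

# Example: a 7-table database where the question touches 2 tables.
for s in build_schema_slices([f"t{i}" for i in range(7)], ["t1", "t4"])[:2]:
    print(s)
```

At inference time, the paper's Chain-of-Thought step would reconcile relevance judgments across slices; that step is not shown here.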
Related papers
- RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, a binary selection strategy, and multi-turn self-correction.
Benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on Spider using GPT-4o.
Our approach outperforms a series of GPT-4-based Text-to-SQL systems when adopting DeepSeek (much cheaper) with the same intact prompts.
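A minimal sketch of the bidirectional idea (the helper names and toy linker are hypothetical stand-ins, not RSL-SQL's API): link forward from the question, link backward from a draft SQL written over the full schema, and take the union so fewer relevant tables are missed.

```python
def bidirectional_link(question, schema, draft_sql_tables, forward_linker):
    """Union of two linking directions, as a rough sketch.

    forward_linker(question, schema) -> tables the question seems to need
    draft_sql_tables -> tables mentioned by a preliminary SQL drafted
                        over the full schema (the "backward" direction)
    """
    forward = set(forward_linker(question, schema))
    backward = set(draft_sql_tables)
    # Union keeps recall high; a later binary selection / self-correction
    # pass (not shown) can prune false positives.
    return sorted(forward | backward)

# Toy usage with a trivial keyword-based forward linker.
schema = {"orders": ["id", "user_id", "total"],
          "users": ["id", "name"],
          "logs": ["id", "event"]}
linker = lambda q, s: [t for t in s if t.rstrip("s") in q.lower()]
print(bidirectional_link("total spent per user", schema,
                         draft_sql_tables=["orders", "users"],
                         forward_linker=linker))  # ['orders', 'users']
```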
arXiv Detail & Related papers (2024-10-31T16:22:26Z)
- MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL [15.824894030016187]
Recent In-Context Learning based methods have achieved remarkable success on the Text-to-SQL task.
There is still a large gap between the performance of these models and human performance on datasets with complex database schemas and difficult questions, such as BIRD.
In our framework, an entity-based method with table summaries is used to select the columns in the database, and a novel targets-conditions decomposition method is introduced to decompose complex questions.
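A loose illustration of a targets-conditions split (the regex heuristic below is a made-up stand-in for the paper's LLM agents; only the output shape is the point): the question is separated into what to SELECT (targets) and what to filter on (conditions), and each part can then drive a simpler Sub-SQL step.

```python
import re

def decompose(question):
    """Naive targets/conditions split: everything before the first
    condition keyword is treated as the target clause. A real system
    would use an LLM agent; this only illustrates the output shape."""
    m = re.search(r"\b(where|whose|with|that|for)\b", question, re.I)
    if m:
        return {"targets": question[:m.start()].strip(),
                "conditions": question[m.start():].strip()}
    return {"targets": question.strip(), "conditions": ""}

print(decompose("List the names of customers whose total spend exceeds 1000"))
# {'targets': 'List the names of customers',
#  'conditions': 'whose total spend exceeds 1000'}
```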
arXiv Detail & Related papers (2024-08-15T04:57:55Z)
- The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models [0.9149661171430259]
We revisit schema linking when using the latest generation of large language models (LLMs).
We find empirically that newer models are adept at utilizing relevant schema elements during generation even in the presence of large numbers of irrelevant ones.
Instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-SQL pipeline.
arXiv Detail & Related papers (2024-08-14T17:59:04Z)
- Synthesizing Text-to-SQL Data from Weak and Strong LLMs [68.69270834311259]
The capability gap between open-source and closed-source large language models (LLMs) remains a challenge in text-to-SQL tasks.
We introduce a synthetic data approach that combines data produced by larger, more powerful models with error information data generated by smaller, not well-aligned models.
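A hedged sketch of how such a mixed corpus might be assembled (the field names, the executor, and the preference-pair format are assumptions, not the paper's pipeline): strong-model outputs serve as SFT targets, while a weak model's failing queries paired with gold queries provide the error signal.

```python
def build_mixed_corpus(strong_samples, weak_samples, gold, execute):
    """strong_samples: {question: sql} from a powerful model (SFT data)
    weak_samples:   {question: sql} from a smaller, weaker model
    gold:           {question: reference sql}
    execute(sql) -> bool, True if the query runs successfully
    """
    sft = [{"prompt": q, "completion": s} for q, s in strong_samples.items()]
    prefs = []
    for q, s in weak_samples.items():
        if not execute(s) and q in gold:
            # A weak model's failing query plus the gold query makes a
            # (chosen, rejected) pair for preference optimization.
            prefs.append({"prompt": q, "chosen": gold[q], "rejected": s})
    return sft, prefs

# Tiny demo with a stub executor that rejects queries missing FROM.
sft, prefs = build_mixed_corpus(
    strong_samples={"count users": "SELECT COUNT(*) FROM users"},
    weak_samples={"count users": "SELECT COUNT(*)"},
    gold={"count users": "SELECT COUNT(*) FROM users"},
    execute=lambda sql: "FROM" in sql,
)
print(len(sft), len(prefs))  # 1 1
```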
arXiv Detail & Related papers (2024-08-06T15:40:32Z)
- RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved performance on the text-to-SQL task.
We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering.
Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public BIRD and Spider datasets.
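In this retrieval-based spirit, a minimal sketch (the cosine retriever and the prompt template are placeholders, not RB-SQL's exact pipeline): embed the question, retrieve the most similar tables and demonstrations, and assemble a compact prompt.

```python
import numpy as np

def top_k(query_vec, item_vecs, k):
    """Cosine-similarity retrieval over pre-embedded items."""
    q = query_vec / np.linalg.norm(query_vec)
    m = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    return np.argsort(m @ q)[::-1][:k]

def build_prompt(question, q_vec, tables, t_vecs, demos, d_vecs):
    """tables: list of table DDL strings; demos: (question, sql) pairs.
    Only the k most relevant of each go into the prompt."""
    t_idx = top_k(q_vec, t_vecs, k=3)
    d_idx = top_k(q_vec, d_vecs, k=2)
    parts = ["-- Relevant schema:"]
    parts += [tables[i] for i in t_idx]
    parts += ["-- Examples:"]
    parts += [f"Q: {demos[i][0]}\nSQL: {demos[i][1]}" for i in d_idx]
    parts += [f"Q: {question}\nSQL:"]
    return "\n".join(parts)
```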
arXiv Detail & Related papers (2024-07-11T08:19:58Z)
- Blar-SQL: Faster, Stronger, Smaller NL2SQL [0.0]
We show how task decomposition can greatly benefit Large Language Models (LLMs) in database understanding and query generation.
We propose a new framework to divide the schema into chunks in order to fit more information into a limited context.
Our results are comparable with those obtained by GPT-4, while our model is 135 times smaller, 90 times faster, and more than 100 times cheaper than GPT-4.
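One way to read "divide the schema into chunks" (the whitespace token proxy and the budget value below are arbitrary assumptions): greedily pack table definitions into context-sized groups, query the model once per chunk, and merge the candidates afterwards.

```python
def chunk_schema(table_ddls, budget_tokens=1500):
    """Greedily pack table DDL strings into chunks that each fit a
    token budget. Whitespace-split length is a rough token proxy."""
    chunks, current, used = [], [], 0
    for ddl in table_ddls:
        cost = len(ddl.split())
        if current and used + cost > budget_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(ddl)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```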
arXiv Detail & Related papers (2024-01-04T16:50:52Z)
- Wav2SQL: Direct Generalizable Speech-To-SQL Parsing [55.10009651476589]
Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries given databases.
We propose the first direct speech-to-SQL parsing model, Wav2SQL, which avoids error compounding across cascaded systems.
Experimental results demonstrate that Wav2SQL avoids error compounding and achieves state-of-the-art results with up to 2.5% accuracy improvement over the baseline.
arXiv Detail & Related papers (2023-05-21T19:26:46Z)
- Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric.
Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences.
Our framework sets new state-of-the-art performance on three benchmarks.
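The Poincaré distance the probe relies on has a closed form; a direct transcription follows (the dummy vectors are illustrative, and Proton's probing procedure around the metric is not shown).

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Distance between points u, v inside the unit ball (Poincare
    model): arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))."""
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    duv = np.dot(u - v, u - v)
    x = 1.0 + 2.0 * duv / max((1.0 - uu) * (1.0 - vv), eps)
    return np.arccosh(x)

# Two dummy embeddings inside the unit ball.
print(poincare_distance(np.array([0.1, 0.2]), np.array([0.4, -0.3])))
```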
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
- Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning [25.69875174742935]
Single-table text-to-SQL aims to transform a natural language question into a SQL query according to one single table.
We propose a new approach for the zero-shot text-to-SQL task which does not rely on any additional manual annotations.
We conduct extensive experiments on a public open-domain text-to-SQL dataset and a domain-specific dataset, ESQL.
arXiv Detail & Related papers (2021-09-12T01:01:28Z)
- IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation [61.09660709356527]
We propose a database schema interaction graph encoder to utilize historical information of database schema items.
We evaluate our model on the benchmark SParC and CoSQL datasets.
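A rough sketch of the kind of graph such an encoder could consume (the edge types and adjacency format are illustrative assumptions, not IGSQL's implementation): schema items become nodes, foreign keys become edges, and items mentioned in earlier turns receive history edges.

```python
def build_interaction_graph(columns, foreign_keys, history_mentions):
    """columns: ['table.column', ...]; foreign_keys: [(src, dst), ...];
    history_mentions: schema items used in earlier turns of the dialog.
    Returns an adjacency dict with typed edges."""
    graph = {c: [] for c in columns}
    for src, dst in foreign_keys:
        graph[src].append((dst, "foreign_key"))
        graph[dst].append((src, "foreign_key"))
    for item in history_mentions:
        # Link previously-mentioned items to their table-mates so the
        # encoder can propagate cross-turn context.
        table = item.split(".")[0]
        for c in columns:
            if c != item and c.startswith(table + "."):
                graph[item].append((c, "history"))
    return graph

g = build_interaction_graph(
    columns=["user.id", "user.name", "order.uid"],
    foreign_keys=[("order.uid", "user.id")],
    history_mentions=["user.id"],
)
print(g["user.id"])  # [('order.uid', 'foreign_key'), ('user.name', 'history')]
```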
arXiv Detail & Related papers (2020-11-11T12:56:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.