DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction
- URL: http://arxiv.org/abs/2509.14507v1
- Date: Thu, 18 Sep 2025 00:47:56 GMT
- Title: DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction
- Authors: Jian Chen, Zhenyan Chen, Xuming Hu, Peilin Zhou, Yining Hua, Han Fang, Cissy Hing Yee Choy, Xinmei Ke, Jingfeng Luo, Zixuan Yuan
- Abstract summary: We present DeKeyNLU, a novel dataset which contains 1,500 meticulously annotated QA pairs. We propose DeKeySQL, a RAG-based NL2SQL pipeline that employs three separate modules for user question understanding, entity retrieval, and generation.
- Score: 46.422626657078666
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural Language to SQL (NL2SQL) provides a new model-centric paradigm that simplifies database access for non-technical users by converting natural language queries into SQL commands. Recent advancements, particularly those integrating Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) reasoning, have made significant strides in enhancing NL2SQL performance. However, challenges such as inaccurate task decomposition and keyword extraction by LLMs remain major bottlenecks, often leading to errors in SQL generation. While existing datasets aim to mitigate these issues by fine-tuning models, they struggle with over-fragmentation of tasks and lack of domain-specific keyword annotations, limiting their effectiveness. To address these limitations, we present DeKeyNLU, a novel dataset which contains 1,500 meticulously annotated QA pairs aimed at refining task decomposition and enhancing keyword extraction precision for the RAG pipeline. Fine-tuned with DeKeyNLU, we propose DeKeySQL, a RAG-based NL2SQL pipeline that employs three distinct modules for user question understanding, entity retrieval, and generation to improve SQL generation accuracy. We benchmarked multiple model configurations within the DeKeySQL RAG pipeline. Experimental results demonstrate that fine-tuning with DeKeyNLU significantly improves SQL generation accuracy on both BIRD (62.31% to 69.10%) and Spider (84.2% to 88.7%) dev datasets.
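The three-module flow named in the abstract (question understanding, entity retrieval, generation) can be pictured with a minimal Python sketch. This is not the DeKeySQL implementation: every name here (`Understanding`, `understand`, `retrieve_entities`, `generate_sql`, `run_pipeline`) is hypothetical, and simple string heuristics stand in for the fine-tuned LLM and the retrieval component, purely to show how decomposed tasks and extracted keywords might feed the later stages.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Understanding:
    """Hypothetical output of the question-understanding module."""
    main_task: str
    sub_tasks: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)


# Tiny stop-word list used by the placeholder keyword extractor (assumption).
STOP_WORDS = {"the", "of", "in", "with", "what", "is", "a", "an", "for", "and"}


def understand(question: str) -> Understanding:
    """Placeholder for the fine-tuned LLM: split words and drop stop words."""
    words = [w.strip("?.,").lower() for w in question.split()]
    keywords = [w for w in words if w and w not in STOP_WORDS]
    # A real system would decompose the question; here it is a single sub-task.
    return Understanding(main_task=question, sub_tasks=[question], keywords=keywords)


def retrieve_entities(keywords: list[str], schema: dict[str, list[str]]) -> dict[str, list[str]]:
    """Placeholder retrieval: keep columns whose name contains a keyword."""
    hits: dict[str, list[str]] = {}
    for table, columns in schema.items():
        matched = [c for c in columns if any(k in c.lower() for k in keywords)]
        if matched:
            hits[table] = matched
    return hits


def generate_sql(entities: dict[str, list[str]]) -> str | None:
    """Placeholder generation: a plain SELECT over the first matched table."""
    if not entities:
        return None
    table, columns = next(iter(entities.items()))
    return f"SELECT {', '.join(columns)} FROM {table};"


def run_pipeline(question: str, schema: dict[str, list[str]]) -> str | None:
    """Chain the three placeholder modules in order."""
    understanding = understand(question)
    entities = retrieve_entities(understanding.keywords, schema)
    return generate_sql(entities)


if __name__ == "__main__":
    demo_schema = {
        "schools": ["name", "city", "enrollment"],
        "scores": ["school_id", "avg_math"],
    }
    print(run_pipeline("What is the enrollment of schools in each city?", demo_schema))
```

In this sketch the keywords are the only link between the question and the schema, which is one way to see why the abstract treats keyword extraction quality as a bottleneck: a missed keyword means the relevant column never reaches the generation step.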
Related papers
- LLM-Based SQL Generation: Prompting, Self-Refinement, and Adaptive Weighted Majority Voting [7.590911146338215]
We propose a Single-Agent Self-Refinement with Ensemble Voting (SSEV) pipeline. We build on insights from the SSEV pipeline to address the growing complexity of enterprise databases and real-world Text-to-SQL tasks. ReCAPAgent-SQL integrates specialized agents for planning, external knowledge retrieval, critique, action generation, self-refinement, schema linking, and result validation.
arXiv Detail & Related papers (2026-01-25T18:38:58Z)
- Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation [54.53145282349042]
We introduce DSR-SQL, a Dual-State Reasoning framework that models Text-to-SQL as an interaction between an adaptive context state and a progressive generation state. Without any post-training or in-context examples, DSR-SQL achieves competitive performance, reaching 35.28% execution accuracy on Spider 2.0-Snow and 68.32% on the BIRD development set.
arXiv Detail & Related papers (2025-11-26T13:52:50Z)
- CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation [1.169202600932732]
We introduce CogniSQL-R1-Zero, a reinforcement learning (RL) framework and model. We use a lightweight reward signal based on execution correctness and format-tag compliance. Our method achieves state-of-the-art execution accuracy on a Text2SQL benchmark. To support further research in efficient and interpretable Text-to-SQL modeling, we release two curated datasets.
arXiv Detail & Related papers (2025-07-08T14:17:07Z)
- RetrySQL: text-to-SQL training with retry data for self-correcting query generation [1.6707278580444538]
We introduce RetrySQL, a new approach to training text-to-SQL generation models. We demonstrate that retry steps yield an improvement of up to 4 percentage points in both overall and challenging execution accuracy metrics.
arXiv Detail & Related papers (2025-07-03T11:00:49Z)
- RAISE: Reasoning Agent for Interactive SQL Exploration [47.77323087050061]
We propose a novel framework that unifies schema linking, query generation, and iterative refinement within a single, end-to-end component. Our method emulates how humans answer questions when working with unfamiliar databases.
arXiv Detail & Related papers (2025-06-02T03:07:08Z)
- ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback [49.21833666405111]
Large language models (LLMs) excel in many reasoning tasks, but their ability to leverage Chain-of-Thought (CoT) reasoning remains underexplored. We propose ExCoT, a novel framework that iteratively optimizes open-source LLMs by combining CoT reasoning with off-policy and on-policy DPO.
arXiv Detail & Related papers (2025-03-25T18:17:36Z)
- RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction.
Experiments on the BIRD and Spider benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on Spider using GPT-4o.
Our approach outperforms a series of GPT-4 based Text-to-SQL systems when adopting DeepSeek (much cheaper) with the same intact prompts.
arXiv Detail & Related papers (2024-10-31T16:22:26Z)
- E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL [1.187832944550453]
We introduce E-SQL, a novel pipeline specifically designed to address these challenges through direct schema linking and candidate predicate augmentation. E-SQL enhances the natural language query by incorporating relevant database items (i.e., tables, columns, and values) and conditions directly into the question and SQL construction plan, bridging the gap between the query and the database structure. Comprehensive evaluations illustrate that E-SQL achieves competitive performance, particularly excelling in complex queries with a 66.29% execution accuracy on the test set.
arXiv Detail & Related papers (2024-09-25T09:02:48Z)
- SelECT-SQL: Self-correcting ensemble Chain-of-Thought for Text-to-SQL [3.422309388045878]
We introduce SelECT-SQL, a novel in-context learning solution that uses an algorithmic combination of chain-of-thought, self-correction, and ensemble methods.
Specifically, when configured using GPT-3.5-Turbo as the base LLM, SelECT-SQL achieves 84.2% execution accuracy on the Spider leaderboard's development set.
arXiv Detail & Related papers (2024-09-16T05:40:18Z)
- Fine-Tuning Language Models for Context-Specific SQL Query Generation [0.0]
This paper presents a novel approach to fine-tuning open-source large language models (LLMs) for the task of transforming natural language into SQL queries.
We introduce models specialized in generating SQL queries, trained on synthetic datasets tailored to the Snowflake SQL and GoogleSQL dialects.
Our methodology involves generating a context-specific dataset using GPT-4, then fine-tuning three open-source LLMs (Starcoder Plus, Code-Llama, and Mistral) employing the LoRA technique to optimize for resource constraints.
The fine-tuned models demonstrate superior performance in zero-shot settings compared to the baseline GP
arXiv Detail & Related papers (2023-12-04T18:04:27Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- Wav2SQL: Direct Generalizable Speech-To-SQL Parsing [55.10009651476589]
Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries given databases.
We propose the first direct speech-to-SQL parsing model, Wav2SQL, which avoids error compounding across cascaded systems.
Experimental results demonstrate that Wav2SQL avoids error compounding and achieves state-of-the-art results by up to 2.5% accuracy improvement over the baseline.
arXiv Detail & Related papers (2023-05-21T19:26:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.