ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL
- URL: http://arxiv.org/abs/2505.12768v2
- Date: Tue, 20 May 2025 01:40:58 GMT
- Title: ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL
- Authors: Yaxun Dai, Wenxuan Xie, Xialie Zhuang, Tianyu Yang, Yiying Yang, Haiqin Yang, Yuhang Zhao, Pingfu Chao, Wenhao Jiang,
- Abstract summary: Execution feedback is essential for guiding large language models to reason accurately and generate reliable queries.<n>Existing methods treat execution feedback solely as a post-hoc signal for correction or selection, failing to integrate it into the generation process.<n>We propose ReEx-Reasoning, a framework for Text-to-Reason that enables models to interact with the database during decoding and dynamically adjust their reasoning based on execution feedback.
- Score: 21.29343406227956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Text-to-SQL, execution feedback is essential for guiding large language models (LLMs) to reason accurately and generate reliable SQL queries. However, existing methods treat execution feedback solely as a post-hoc signal for correction or selection, failing to integrate it into the generation process. This limitation hinders their ability to address reasoning errors as they occur, ultimately reducing query accuracy and robustness. To address this issue, we propose ReEx-SQL (Reasoning with Execution-Aware Reinforcement Learning), a framework for Text-to-SQL that enables models to interact with the database during decoding and dynamically adjust their reasoning based on execution feedback. ReEx-SQL introduces an execution-aware reasoning paradigm that interleaves intermediate SQL execution into reasoning paths, facilitating context-sensitive revisions. It achieves this through structured prompts with markup tags and a stepwise rollout strategy that integrates execution feedback into each stage of generation. To supervise policy learning, we develop a composite reward function that includes an exploration reward, explicitly encouraging effective database interaction. Additionally, ReEx-SQL adopts a tree-based decoding strategy to support exploratory reasoning, enabling dynamic expansion of alternative reasoning paths. Notably, ReEx-SQL achieves 88.8% on Spider and 64.9% on BIRD at the 7B scale, surpassing the standard reasoning baseline by 2.7% and 2.6%, respectively. It also shows robustness, achieving 85.2% on Spider-Realistic with leading performance. In addition, its tree-structured decoding improves efficiency and performance over linear decoding, reducing inference time by 51.9% on the BIRD development set.
Related papers
- CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation [1.169202600932732]
We introduce Cogni-R1-Zero, a reinforcement learning (RL) framework and model.<n>We use a lightweight reward signal based on execution correctness and format-tag compliance.<n>Our method achieves state-of-the-art execution accuracy on Text2 benchmark.<n>To support further research in efficient and interpretable Text-to-code modeling, we release two curated datasets.
arXiv Detail & Related papers (2025-07-08T14:17:07Z) - RAISE: Reasoning Agent for Interactive SQL Exploration [47.77323087050061]
We propose a novel framework that unifies schema linking, query generation, and iterative refinement within a single, end-to-end component.<n>Our method emulates how humans answer questions when working with unfamiliar databases.
arXiv Detail & Related papers (2025-06-02T03:07:08Z) - Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards [25.810871864483076]
External Process Reward Models (PRMs) can be introduced during training to provide fine-grained supervision.<n>We propose Reward-BIRD, a framework that explores how to incorporate PRMs into the Text-to-the- reasoning process effectively.
arXiv Detail & Related papers (2025-05-07T08:32:22Z) - ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback [49.21833666405111]
Large language models (LLMs) excel in many reasoning tasks, but their ability to leverage Chain-of-Thought (CoT) reasoning remains underexplored.<n>We propose ExCoT, a novel framework that iteratively optimize open-source LLMs by combining CoT reasoning with off-policy and on-policy DPO.
arXiv Detail & Related papers (2025-03-25T18:17:36Z) - STaR-SQL: Self-Taught Reasoner for Text-to-SQL [20.719165038519744]
"chain-of-thought" rationales have proven effective for improving the performance of large language models on complex reasoning tasks.<n>Applying such techniques to structured tasks, such as text-to-driven, remains largely unexplored.<n>In this paper, we introduce Self-Taughter for text-to-driven (STaR-), a novel approach that reframes query generation as a reasoning process.<n> Experimental results on the challenging Spider benchmark demonstrate that STaR- significantly improves text-to-performance, achieving an execution accuracy of 86.6%.<n>These findings underscore the potential of reasoning-augmented training for
arXiv Detail & Related papers (2025-02-19T08:58:44Z) - OpenSearch-SQL: Enhancing Text-to-SQL with Dynamic Few-shot and Consistency Alignment [6.2089733671434875]
We propose OpenSearch-, which divides the Text-to-agent task into four main modules: Preprocessing, Extraction, Generation, and Refinement, along with an Alignment module based on consistency alignment mechanism.<n>These methods have significantly improved the performance of LLMs in the Text-to-agent task.<n> Experimental results show that OpenSearch- achieves an execution accuracy(EX) of 69.3% on the BIRD development set, 72.28% on the test set, and a reward-based efficiency score (R-VES) of 69.3, with all three metrics ranking first at the time of submission.
arXiv Detail & Related papers (2025-02-19T07:51:50Z) - RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL- that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction.
benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on GPT-4ocorrection.
Our approach outperforms a series of GPT-4 based Text-to-Seek systems when adopting DeepSeek (much cheaper) with same intact prompts.
arXiv Detail & Related papers (2024-10-31T16:22:26Z) - Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
Large language models (LLMs) have emerged as a new paradigm for Text-to- task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs)
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic
Parsing [110.97778888305506]
BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question.
BRIDGE attained state-of-the-art performance on popular cross-DB text-to- relational benchmarks.
Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
arXiv Detail & Related papers (2020-12-23T12:33:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.