Related papers: The Interpretability Analysis of the Model Can Bring Improvements to the Text-to-SQL Task

URL: http://arxiv.org/abs/2508.13178v1
Date: Tue, 12 Aug 2025 11:24:16 GMT
Title: The Interpretability Analysis of the Model Can Bring Improvements to the Text-to-SQL Task
Authors: Cong Zhang,
Abstract summary: We integrate model interpretability analysis with execution-guided strategy for semantic parsing of WHERE clauses. Our model excels on the WikiSQL dataset, which is emblematic of single-table database query tasks. Our hope is that this endeavor to enhance accuracy in processing basic database queries will offer fresh perspectives for research into handling complex queries.
Score: 3.890033714780255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: To elevate the foundational capabilities and generalization prowess of the text-to-SQL model in real-world applications, we integrate model interpretability analysis with execution-guided strategy for semantic parsing of WHERE clauses in SQL queries. Furthermore, we augment this approach with filtering adjustments, logical correlation refinements, and model fusion, culminating in the design of the CESQL model that facilitates conditional enhancement. Our model excels on the WikiSQL dataset, which is emblematic of single-table database query tasks, markedly boosting the accuracy of prediction outcomes. When predicting conditional values in WHERE clauses, we have not only minimized our dependence on data within the condition columns of tables but also circumvented the impact of manually labeled training data. Our hope is that this endeavor to enhance accuracy in processing basic database queries will offer fresh perspectives for research into handling complex queries and scenarios featuring irregular data in real-world database environments.
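Illustration: the abstract mentions an execution-guided strategy for WHERE clauses. The following is a minimal sketch of the general idea of execution-guided filtering, not the CESQL implementation. It assumes a model has already produced a ranked list of candidate conditions; the function name `execution_guided_pick`, the `players` table, and the candidate list are all hypothetical. Candidates that reference unknown columns, fail to execute, or return no rows are skipped.

```python
import sqlite3


def execution_guided_pick(conn, table, select_col, candidates):
    """Return the first candidate condition whose query executes and yields rows.

    candidates: (column, operator, value) tuples, best-scored first (assumed
    to come from some upstream model; hypothetical input format).
    """
    allowed_ops = {"=", ">", "<"}
    # Read the real column names so candidates can be checked against the schema.
    known_cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for column, op, value in candidates:
        if column not in known_cols or op not in allowed_ops:
            continue  # candidate cannot form a valid query
        sql = f'SELECT "{select_col}" FROM "{table}" WHERE "{column}" {op} ?'
        try:
            rows = conn.execute(sql, (value,)).fetchall()
        except sqlite3.Error:
            continue  # execution failed, try the next candidate
        if rows:
            return (column, op, value), rows
    return None, []


# Toy single-table database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 10), ("Bob", "Blue", 7)],
)

# Ranked candidates: unknown column, empty result, then a usable condition.
candidates = [("city", "=", "Red"), ("team", "=", "Green"), ("team", "=", "Red")]
picked, rows = execution_guided_pick(conn, "players", "name", candidates)
print(picked, rows)
```

Treating an empty result as a failed candidate is a design choice of this sketch; a legitimate query can also return no rows, so a real system would likely weigh that signal rather than apply it as a hard rule.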

Related papers

APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL [39.76924093980244]
APEX-SQL is a framework that shifts the paradigm from passive translation to agentic exploration. Our framework employs a hypothesis-verification loop to ground model reasoning in real data.
arXiv Detail & Related papers (2026-02-11T07:50:47Z)
Companion Agents: A Table-Information Mining Paradigm for Text-to-SQL [8.159121916366727]
Large-scale Text-to-SQL benchmarks such as BIRD typically assume complete and accurate database annotations as well as available external knowledge. This mismatch substantially limits the real-world applicability of state-of-the-art Text-to-SQL systems. We propose a database-centric approach that leverages intrinsic, fine-grained information residing in relational databases to construct missing evidence.
arXiv Detail & Related papers (2025-12-17T07:11:55Z)
Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation [54.53145282349042]
We introduce DSR-SQL, a Dual-State Reasoning framework that models Text-to-SQL as an interaction between an adaptive context state and a progressive generation state. Without any post-training or in-context examples, DSR-SQL achieves competitive performance, reaching 35.28% execution accuracy on Spider 2.0-Snow and 68.32% on the BIRD development set.
arXiv Detail & Related papers (2025-11-26T13:52:50Z)
Same Content, Different Representations: A Controlled Study for Table QA [15.896655757672441]
Table Question Answering (Table QA) in real-world settings must operate over both structured databases and semi-structured tables containing textual fields. Existing benchmarks are tied to fixed data formats and have not systematically examined how representation itself affects model performance. We present the first controlled study that isolates the role of table representation by holding content constant while varying structure.
arXiv Detail & Related papers (2025-09-26T22:33:19Z)
Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning [0.12289361708127876]
This work reframes the Text-to-SQL task as a pathway for teaching large language models (LLMs) to reason over and manipulate data. We propose a two-stage framework that teaches a model how to traverse, filter, and aggregate table fields. Empirically, our approach achieves substantial gains on reasoning-intensive datasets such as BIRD and CRT-QA.
arXiv Detail & Related papers (2025-04-23T19:02:04Z)
Rationalization Models for Text-to-SQL [13.792561265515003]
We introduce a framework for generating Chain-of-Thought (CoT) rationales to enhance text-to-SQL model fine-tuning. The process begins with manually annotating a small set of examples, which are then used to prompt a large language model. A rationalization model is subsequently trained on the validated queries, enabling extensive synthetic CoT annotations.
arXiv Detail & Related papers (2025-02-10T18:38:57Z)
Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning [10.731045939849125]
We focus on Text-to-SQL semantic parsing from the perspective of retrieval-augmented generation. Motivated by challenges related to the size of commercial database schemata and the deployability of business intelligence solutions, we propose ASTReS, which dynamically retrieves input database information.
arXiv Detail & Related papers (2024-07-03T15:55:14Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
CHESS: Contextual Harnessing for Efficient SQL Synthesis [1.9506402593665235]
We introduce CHESS, a framework for efficient and scalable text-to-SQL queries. It comprises four specialized agents, each targeting one of the aforementioned challenges. Our framework offers features that adapt to various deployment constraints.
arXiv Detail & Related papers (2024-05-27T01:54:16Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs). With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding. Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric. Compared with commonly used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences. Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar. To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.