AmbiSQL: Interactive Ambiguity Detection and Resolution for Text-to-SQL
- URL: http://arxiv.org/abs/2508.15276v1
- Date: Thu, 21 Aug 2025 06:10:28 GMT
- Authors: Zhongjun Ding, Yin Lin, Tianjing Zeng,
- Abstract summary: We demonstrate AmbiSQL, an interactive system that automatically detects query ambiguities and guides users through multiple-choice questions to clarify their intent. AmbiSQL achieves 87.2% precision in ambiguity detection and improves SQL exact match accuracy by 50% when integrated with Text-to-SQL systems.
- Score: 0.9217021281095907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-SQL systems translate natural language questions into SQL queries, providing substantial value for non-expert users. While large language models (LLMs) show promising results for this task, they remain error-prone. Query ambiguity has been recognized as a major obstacle for LLM-based Text-to-SQL systems, leading to misinterpretation of user intent and inaccurate SQL generation. We demonstrate AmbiSQL, an interactive system that automatically detects query ambiguities and guides users through intuitive multiple-choice questions to clarify their intent. Our approach introduces a fine-grained ambiguity taxonomy for identifying ambiguities that affect database element mapping and LLM reasoning, then incorporates user feedback to rewrite ambiguous questions. Evaluation on an ambiguous query dataset shows that AmbiSQL achieves 87.2% precision in ambiguity detection and improves SQL exact match accuracy by 50% when integrated with Text-to-SQL systems. Our demonstration showcases the significant performance gains and highlights the system's practical usability. Code repo and demonstration are available at: https://github.com/JustinzjDing/AmbiSQL.
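The detect, ask, and rewrite loop described in the abstract can be illustrated with a minimal Python sketch. This is not AmbiSQL's implementation: the phrase table, the `Clarification` class, and both function names are hypothetical, and a simple keyword lookup stands in for the taxonomy-driven, LLM-based detection the paper describes.

```python
from dataclasses import dataclass


@dataclass
class Clarification:
    """One multiple-choice question about an ambiguous phrase (hypothetical structure)."""
    phrase: str
    prompt: str
    options: list[str]


# Hypothetical table of vague phrases and candidate readings.
# A real system would derive these from the database schema and an LLM.
AMBIGUOUS_PHRASES = {
    "top": ("How many rows should 'top' return?",
            ["the 5 highest-ranked rows", "the 10 highest-ranked rows"]),
    "recent": ("What time window does 'recent' cover?",
               ["the last 7 days", "the last 30 days"]),
}


def detect_ambiguities(question: str) -> list[Clarification]:
    """Return one clarification per known vague phrase found in the question."""
    words = {w.strip(".,?!").lower() for w in question.split()}
    return [
        Clarification(phrase, prompt, options)
        for phrase, (prompt, options) in AMBIGUOUS_PHRASES.items()
        if phrase in words
    ]


def rewrite_question(question: str, answers: dict[str, str]) -> str:
    """Append the user's chosen readings so a downstream Text-to-SQL model sees them."""
    if not answers:
        return question
    notes = "; ".join(f"'{phrase}' means {choice}" for phrase, choice in answers.items())
    return f"{question} ({notes})"


if __name__ == "__main__":
    question = "List the top customers with recent orders"
    pending = detect_ambiguities(question)
    # Simulate a user who always picks the first option.
    answers = {c.phrase: c.options[0] for c in pending}
    print(rewrite_question(question, answers))
```

The rewritten question, rather than the original, would then be passed to whatever Text-to-SQL system is in use, which is how the abstract describes user feedback being incorporated.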
Related papers
- Query Carefully: Detecting the Unanswerables in Text-to-SQL Tasks [1.7781743265224403]
Text-to-SQL systems allow non-experts to interact with databases using natural language.
Their tendency to generate executable SQL for ambiguous, out-of-scope, or unanswerable queries introduces a hidden risk, as outputs may be misinterpreted as correct.
We present Query Carefully, a pipeline that integrates SQL generation with explicit handling of ambiguous and unanswerable inputs.
arXiv Detail & Related papers (2025-12-19T12:22:27Z)
- SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL [20.93676525997898]
We propose an end-to-end framework for fine-grained detection and correction of semantic errors in SQL generated by large language model (LLM)-based text-to-SQL systems.
We show that our framework outperforms the best LLM-based self-evaluation method by 25.78% in F1 for error detection, and improves execution accuracy of out-of-the-box text-to-SQL systems by up to 20%.
arXiv Detail & Related papers (2025-06-04T22:25:47Z)
- PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries [32.40808001281668]
Real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack of relevant data.
In this work, we construct a practical conversational text-to-SQL dataset.
We generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user's clarification, and the assistant's clarified SQL response.
arXiv Detail & Related papers (2024-10-14T20:36:35Z)
- E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL [1.187832944550453]
We introduce E-SQL, a novel pipeline specifically designed to address these challenges through direct schema linking and candidate predicate augmentation.
E-SQL enhances the natural language query by incorporating relevant database items (i.e., tables, columns, and values) and conditions directly into the question and SQL construction plan, bridging the gap between the query and the database structure.
Comprehensive evaluations illustrate that E-SQL achieves competitive performance, particularly excelling in complex queries with a 66.29% execution accuracy on the test set.
arXiv Detail & Related papers (2024-09-25T09:02:48Z)
- AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries [56.82807063333088]
We introduce a new benchmark, AMBROSIA, which we hope will inform and inspire the development of text-to-SQL parsers.
Our dataset contains questions showcasing three different types of ambiguity (scope ambiguity, attachment ambiguity, and vagueness).
In each case, the ambiguity persists even when the database context is provided.
This is achieved through a novel approach that involves controlled generation of databases from scratch.
arXiv Detail & Related papers (2024-06-27T10:43:04Z)
- SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data [54.69489315952524]
SQLPrompt is designed to improve the few-shot prompting capabilities of Text-to-SQL for LLMs.
We show that SQLPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin.
arXiv Detail & Related papers (2023-11-06T05:24:06Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we examine the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL [36.5089235153207]
Existing text-to-SQL parsers generate a "plausible" query for an arbitrary user question.
We propose a simple yet effective generation approach that automatically produces ambiguous and unanswerable examples.
Experimental results show that our model achieves the best result on both real-world examples and generated examples.
arXiv Detail & Related papers (2022-12-17T15:32:00Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Photon: A Robust Cross-Domain Text-to-SQL System [189.1405317853752]
We present Photon, a robust, modular, cross-domain NLIDB that can flag natural language input to which a mapping cannot be immediately determined.
The proposed method effectively improves the robustness of text-to-SQL systems against untranslatable user input.
arXiv Detail & Related papers (2020-07-30T07:44:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.