Error Detection for Text-to-SQL Semantic Parsing
- URL: http://arxiv.org/abs/2305.13683v2
- Date: Wed, 6 Dec 2023 14:09:51 GMT
- Title: Error Detection for Text-to-SQL Semantic Parsing
- Authors: Shijie Chen, Ziru Chen, Huan Sun, Yu Su
- Abstract summary: Modern text-to-SQL parsers are often over-confident, casting doubt on their trustworthiness when deployed for real use.
We propose a parser-independent error detection model for text-to-SQL semantic parsing.
- Score: 18.068244400731366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite remarkable progress in text-to-SQL semantic parsing in recent years,
the performance of existing parsers is still far from perfect. Specifically,
modern text-to-SQL parsers based on deep learning are often over-confident,
thus casting doubt on their trustworthiness when deployed for real use. In this
paper, we propose a parser-independent error detection model for text-to-SQL
semantic parsing. Using a language model of code as its bedrock, we enhance our
error detection model with graph neural networks that learn structural features
of both natural language questions and SQL queries. We train our model on
realistic parsing errors collected from a cross-domain setting, which leads to
stronger generalization ability. Experiments with three strong text-to-SQL
parsers featuring different decoding mechanisms show that our approach
outperforms parser-dependent uncertainty metrics. Our model could also
effectively improve the performance and usability of text-to-SQL semantic
parsers regardless of their architectures. (Our implementation is available at
https://github.com/OSU-NLP-Group/Text2SQL-Error-Detection)
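
One plausible way a parser-independent detector could improve a parser's usability is to score each candidate in the parser's beam and either pick the most trusted one or abstain. The sketch below is only an illustration under stated assumptions: `Candidate`, `rerank_or_abstain`, `toy_detector`, and the 0.5 threshold are hypothetical names and values, and the toy heuristic stands in for a learned model; it is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    sql: str
    parser_score: float  # the parser's own confidence, which may be over-confident


def rerank_or_abstain(
    question: str,
    beam: List[Candidate],
    p_correct: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Optional[Candidate]:
    """Return the candidate the detector trusts most, or None to abstain."""
    if not beam:
        return None
    # The detector only sees (question, SQL), so it works with any parser.
    scored = [(p_correct(question, c.sql), c) for c in beam]
    best_prob, best = max(scored, key=lambda pair: pair[0])
    return best if best_prob >= threshold else None


def toy_detector(question: str, sql: str) -> float:
    """Stand-in heuristic (an assumption for this sketch, not a learned model):
    a 'how many' question should map to a COUNT query and vice versa."""
    wants_count = "how many" in question.lower()
    has_count = "count(" in sql.lower()
    return 0.9 if wants_count == has_count else 0.1


beam = [
    Candidate("SELECT name FROM singer", parser_score=0.95),
    Candidate("SELECT count(*) FROM singer", parser_score=0.60),
]
chosen = rerank_or_abstain("How many singers are there?", beam, toy_detector)
```

Here the detector overrides the parser's higher-confidence first candidate; when no candidate clears the threshold, the function returns `None`, which a system could treat as a signal to ask the user for clarification.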
Related papers
- TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring [11.78795632771211]
We introduce a novel benchmark designed to evaluate text-to-SQL reliability as a model's ability to correctly handle any type of input question.
We evaluate existing methods using a novel penalty-based scoring metric with two modeling approaches.
arXiv Detail & Related papers (2024-03-23T16:12:52Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- Text-to-SQL Error Correction with Language Models of Code [24.743066730684742]
In this paper, we investigate how to build automatic text-to-SQL error correction models.
Noticing that token-level edits are out of context and sometimes ambiguous, we propose building clause-level edit models instead.
arXiv Detail & Related papers (2023-05-22T14:42:39Z)
- Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
- SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural network based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Photon: A Robust Cross-Domain Text-to-SQL System [189.1405317853752]
We present Photon, a robust, modular, cross-domain NLIDB that can flag natural language input to which a mapping cannot be immediately determined.
The proposed method effectively improves the robustness of the text-to-SQL system against untranslatable user input.
arXiv Detail & Related papers (2020-07-30T07:44:48Z)