MT-Teql: Evaluating and Augmenting Consistency of Text-to-SQL Models
with Metamorphic Testing
- URL: http://arxiv.org/abs/2012.11163v1
- Date: Mon, 21 Dec 2020 07:43:31 GMT
- Title: MT-Teql: Evaluating and Augmenting Consistency of Text-to-SQL Models
with Metamorphic Testing
- Authors: Pingchuan Ma and Shuai Wang
- Abstract summary: We propose MT-Teql, a Metamorphic Testing-based framework for evaluating and augmenting the consistency of text-to-SQL models.
Our framework exposes thousands of prediction errors from SOTA models and enriches existing datasets by an order of magnitude, eliminating over 40% of inconsistency errors without compromising standard accuracy.
- Score: 11.566463879334862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-SQL is a task to generate SQL queries from human utterances. However,
due to the variation of natural language, two semantically equivalent
utterances may differ at the lexical level. Likewise, user
preferences (e.g., the choice of normal forms) can lead to dramatic changes in
table structures when expressing conceptually identical schemas. Envisioning
the general difficulty for text-to-SQL models to preserve prediction
consistency against linguistic and schema variations, we propose MT-Teql, a
Metamorphic Testing-based framework for systematically evaluating and
augmenting the consistency of TExt-to-SQL models. Inspired by the principles of
software metamorphic testing, MT-Teql delivers a model-agnostic framework which
implements a comprehensive set of metamorphic relations (MRs) to conduct
semantics-preserving transformations on utterances and schemas. Model
inconsistency is exposed when the original and transformed inputs induce
different SQL queries. In addition, we leverage the transformed inputs to
retrain models to further boost their robustness. Our experiments show that
our framework exposes thousands of prediction errors from SOTA models and
enriches existing datasets by an order of magnitude, eliminating over 40% of
inconsistency errors without compromising standard accuracy.
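
The consistency check described in the abstract can be sketched as follows. This is a minimal illustration, not the MT-Teql implementation: the two transformations (`add_polite_prefix`, `reverse_column_order`), the `toy_model` stand-in, and the string-based `normalize` comparison are all hypothetical and chosen only to show the idea that a semantics-preserving change to the utterance or schema should not change the predicted SQL.

```python
from typing import Callable, Dict, List, Tuple

Schema = Dict[str, List[str]]            # table name -> ordered column names
Model = Callable[[str, Schema], str]     # (utterance, schema) -> SQL string
Transform = Callable[[str, Schema], Tuple[str, Schema]]


def add_polite_prefix(utterance: str, schema: Schema) -> Tuple[str, Schema]:
    """Hypothetical utterance-level relation: a polite prefix keeps the meaning."""
    return "Please tell me: " + utterance, schema


def reverse_column_order(utterance: str, schema: Schema) -> Tuple[str, Schema]:
    """Hypothetical schema-level relation: column order carries no meaning."""
    return utterance, {t: list(reversed(cols)) for t, cols in schema.items()}


def normalize(sql: str) -> str:
    """Crude comparison key (an assumption): lowercase and collapse whitespace."""
    return " ".join(sql.lower().split())


def find_inconsistencies(
    model: Model, utterance: str, schema: Schema, transforms: List[Transform]
) -> List[str]:
    """Return the names of transformations under which the prediction changes."""
    baseline = normalize(model(utterance, schema))
    failures = []
    for transform in transforms:
        new_utterance, new_schema = transform(utterance, schema)
        if normalize(model(new_utterance, new_schema)) != baseline:
            failures.append(transform.__name__)
    return failures


def toy_model(utterance: str, schema: Schema) -> str:
    """Deliberately brittle stand-in: selects the first column of the first table."""
    table = next(iter(schema))
    return f"SELECT {schema[table][0]} FROM {table}"


schema = {"singer": ["name", "age"]}
failures = find_inconsistencies(
    toy_model, "List singer names", schema,
    [add_polite_prefix, reverse_column_order],
)
print(failures)  # ['reverse_column_order']
```

The toy model ignores the utterance, so the prefix leaves its output unchanged, while reversing the column order flips the selected column and is reported as an inconsistency. Inputs that trigger such failures are the kind of transformed examples the abstract describes feeding back into retraining.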
Related papers
- Exploring the Compositional Generalization in Context Dependent
Text-to-SQL Parsing [14.644212594593919]
This work is the first exploration of compositional generalization in context-dependent Text-to-SQL scenarios.
Experiments show that all current models struggle on our proposed benchmarks.
We propose a method named p-align to improve the compositional generalization of Text-to-SQL models.
arXiv Detail & Related papers (2023-05-29T12:36:56Z) - Conversational Text-to-SQL: An Odyssey into State-of-the-Art and
Challenges Ahead [6.966624873109535]
State-of-the-art (SOTA) systems use large, pre-trained and finetuned language models, such as the T5-family.
With multi-tasking (MT) over coherent tasks with discrete prompts during training, we improve over specialized text-to-SQL models.
We conduct studies to tease apart errors attributable to domain and compositional generalization.
arXiv Detail & Related papers (2023-02-21T23:15:33Z) - Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL
Robustness [115.66421993459663]
Recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations.
We propose a comprehensive robustness benchmark based on Spider to diagnose the model.
We conduct a diagnostic study of the state-of-the-art models on the set.
arXiv Detail & Related papers (2023-01-21T03:57:18Z) - Improving Text-to-SQL Semantic Parsing with Fine-grained Query
Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural-network-based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z) - Towards Robustness of Text-to-SQL Models against Synonym Substitution [15.047104267689052]
We introduce Spider-Syn, a dataset based on the Spider benchmark for text-to-SQL translation.
We observe that the accuracy dramatically drops by eliminating explicit correspondence between NL questions and table schemas.
We present two categories of approaches to improve the model robustness.
arXiv Detail & Related papers (2021-06-02T10:36:23Z) - Learning Contextual Representations for Semantic Parsing with
Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), which jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data.
Based on experimental results, neural semantic parsers that leverage GAP MODEL obtain new state-of-the-art results on both SPIDER and CRITERIA-TO-SQL benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z) - Explicitly Modeling Syntax in Language Models with Incremental Parsing
and a Dynamic Oracle [88.65264818967489]
We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM)
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
arXiv Detail & Related papers (2020-10-21T17:39:15Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)