Towards Robustness of Text-to-SQL Models Against Natural and Realistic
Adversarial Table Perturbation
- URL: http://arxiv.org/abs/2212.09994v1
- Date: Tue, 20 Dec 2022 04:38:23 GMT
- Title: Towards Robustness of Text-to-SQL Models Against Natural and Realistic
Adversarial Table Perturbation
- Authors: Xinyu Pi, Bing Wang, Yan Gao, Jiaqi Guo, Zhoujun Li, Jian-Guang Lou
- Abstract summary: We introduce the Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure the robustness of Text-to-SQL models.
We build a systematic adversarial training example generation framework for better contextualization of tabular data.
Experiments show that our approach not only brings the best improvement against table-side perturbations but also substantially empowers models against NL-side perturbations.
- Score: 38.00832631674398
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The robustness of Text-to-SQL parsers against adversarial perturbations plays
a crucial role in delivering highly reliable applications. Previous studies
along this line primarily focused on perturbations in the natural language
question side, neglecting the variability of tables. Motivated by this, we
propose the Adversarial Table Perturbation (ATP) as a new attacking paradigm to
measure the robustness of Text-to-SQL models. Following this proposition, we
curate ADVETA, the first robustness evaluation benchmark featuring natural and
realistic ATPs. All tested state-of-the-art models experience dramatic
performance drops on ADVETA, revealing models' vulnerability in real-world
practices. To defend against ATP, we build a systematic adversarial training
example generation framework tailored for better contextualization of tabular
data. Experiments show that our approach not only brings the best robustness
improvement against table-side perturbations but also substantially empowers
models against NL-side perturbations. We release our benchmark and code at:
https://github.com/microsoft/ContextualSP.
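The table-side attack idea can be illustrated with a minimal sketch (the synonym map, column names, and query are hypothetical, and this is not the paper's actual perturbation algorithm): renaming schema columns to natural synonyms leaves the NL question untouched but breaks the correspondence a parser relies on between the question and the table schema.

```python
# Minimal sketch of a table-side adversarial perturbation: rename schema
# columns to natural synonyms. The synonym map and column names are
# illustrative, not taken from the ADVETA benchmark.

SYNONYMS = {"name": "full_name", "salary": "wage"}

def perturb_schema(columns):
    """Replace each column name with a synonym when one is available."""
    return [SYNONYMS.get(c, c) for c in columns]

def query_is_executable(referenced, schema):
    """A predicted SQL query only executes if every column it references
    still exists in the (possibly perturbed) schema."""
    return set(referenced) <= set(schema)

schema = ["name", "salary", "dept"]
referenced = ["name", "salary"]  # columns a parser's predicted SQL mentions

assert query_is_executable(referenced, schema)                      # original table
assert not query_is_executable(referenced, perturb_schema(schema))  # after ATP
```

A robust parser would need to re-link "salary" in the question to the perturbed column "wage", which is exactly the contextualization ability the adversarial training framework targets.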
Related papers
- Adversarial Attacks on Tables with Entity Swap [22.32213001202441]
Adversarial attacks on text have been shown to greatly affect the performance of large language models (LLMs).
In this paper, we propose an evasive entity-swap attack for the column type annotation (CTA) task.
Our CTA attack is the first black-box attack on tables, where we employ a similarity-based sampling strategy to generate adversarial examples.
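Similarity-based sampling, as used in that attack, can be sketched in a few lines (the entity names and embedding vectors below are hypothetical, and this is not the paper's actual sampler): the replacement entity is chosen as the nearest different entity in an embedding space, so the perturbed cell still looks plausible.

```python
import math

# Toy similarity-based sampling for an entity-swap attack. Entities and
# their 2-d embedding vectors are made up for illustration.
EMBEDDINGS = {
    "Paris":  [0.9, 0.1],
    "London": [0.8, 0.2],
    "42":     [0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def swap_entity(entity):
    """Pick the most similar *different* entity as the adversarial swap."""
    candidates = [e for e in EMBEDDINGS if e != entity]
    return max(candidates, key=lambda e: cosine(EMBEDDINGS[entity], EMBEDDINGS[e]))

assert swap_entity("Paris") == "London"  # nearest plausible replacement
```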
arXiv Detail & Related papers (2023-09-15T15:03:33Z) - On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model,
Data, and Training [109.9218185711916]
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind social media texts or reviews.
We propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
arXiv Detail & Related papers (2023-04-19T11:07:43Z) - Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL
Robustness [115.66421993459663]
Recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations.
We propose a comprehensive robustness benchmark based on Spider to diagnose model robustness.
We conduct a diagnostic study of the state-of-the-art models on the set.
arXiv Detail & Related papers (2023-01-21T03:57:18Z) - In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
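Label smoothing itself is a standard objective; a minimal pure-Python sketch (the probabilities and smoothing factor below are illustrative) shows how spreading a small mass `eps` over non-gold classes penalizes over-confident predictions relative to plain cross-entropy:

```python
import math

def smoothed_cross_entropy(probs, target, eps=0.1):
    """Cross-entropy against a label-smoothed target distribution: the
    gold class gets (1 - eps) + eps/K of the mass, every other class eps/K."""
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1 - eps) + eps / k if i == target else eps / k
        loss -= q * math.log(p)
    return loss

# An over-confident prediction incurs a higher loss under smoothing
# (eps > 0) than under plain cross-entropy (eps = 0).
confident = [0.98, 0.01, 0.01]
plain = smoothed_cross_entropy(confident, target=0, eps=0.0)
smoothed = smoothed_cross_entropy(confident, target=0, eps=0.1)
assert smoothed > plain
```

With `eps = 0` the formula reduces to ordinary cross-entropy, which is why label smoothing is a drop-in change to the training objective.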
arXiv Detail & Related papers (2022-12-20T14:06:50Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural-network-based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z) - Towards Robustness of Text-to-SQL Models against Synonym Substitution [15.047104267689052]
We introduce Spider-Syn, a dataset based on the Spider benchmark for text-to-SQL translation.
We observe that accuracy drops dramatically when the explicit correspondence between NL questions and table schemas is eliminated.
We present two categories of approaches to improve the model robustness.
arXiv Detail & Related papers (2021-06-02T10:36:23Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
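A synchronous grammar generates the NL side and the SQL side in lockstep, so every synthetic question comes with an exactly aligned program. A toy sketch (templates, column names, and the table name are made up, not GraPPa's actual grammar):

```python
import random

# Toy synchronous production: one rule expands simultaneously to a
# question template and a SQL template over the same column and aggregate.
COLUMNS = ["salary", "age"]
AGGS = [("maximum", "MAX"), ("average", "AVG")]
TABLE = "employees"

def generate_pair(rng):
    """Sample one aligned (question, SQL) pair from the toy grammar."""
    col = rng.choice(COLUMNS)
    agg_word, agg_sql = rng.choice(AGGS)
    question = f"What is the {agg_word} {col} of all {TABLE}?"
    sql = f"SELECT {agg_sql}({col}) FROM {TABLE}"
    return question, sql

rng = random.Random(0)
question, sql = generate_pair(rng)
# Because both sides expand from the same rule, the sampled column
# appears in the question if and only if it appears in the SQL.
assert ("salary" in question) == ("salary" in sql)
```

Pairs like these can be used for pre-training on the synthetic side, while the masked-language-modeling objective mentioned above keeps the encoder grounded in real-world tables.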
This list is automatically generated from the titles and abstracts of the papers in this site.