Are Synonym Substitution Attacks Really Synonym Substitution Attacks?
- URL: http://arxiv.org/abs/2210.02844v3
- Date: Mon, 8 May 2023 03:02:44 GMT
- Title: Are Synonym Substitution Attacks Really Synonym Substitution Attacks?
- Authors: Cheng-Han Chiang and Hung-yi Lee
- Abstract summary: We show that four widely used word substitution methods generate a large fraction of invalid substitution words that are ungrammatical or do not preserve the original sentence's semantics.
Next, we show that the semantic and grammatical constraints used in SSAs for detecting invalid word replacements are highly insufficient in detecting invalid adversarial samples.
- Score: 80.81532239566992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we explore the following question: Are synonym substitution
attacks really synonym substitution attacks (SSAs)? We approach this question
by examining how SSAs replace words in the original sentence and show that
there are still unresolved obstacles that make current SSAs generate invalid
adversarial samples. We reveal that four widely used word substitution methods
generate a large fraction of invalid substitution words that are ungrammatical
or do not preserve the original sentence's semantics. Next, we show that the
semantic and grammatical constraints used in SSAs for detecting invalid word
replacements are highly insufficient in detecting invalid adversarial samples.
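The abstract's second claim is easier to picture with a toy example. The sketch below is not the authors' code and does not reproduce their experiments. It assumes a similarity-threshold constraint of the kind often applied to word embeddings, and it uses hypothetical, hand-written 3-dimensional vectors (`TOY_VECTORS`) chosen so that an antonym sits close to the original word. The helper names `filter_candidates` and `substitute` are also made up for illustration.

```python
import math

# Hypothetical, hand-written vectors for illustration only; they are NOT
# taken from any real embedding model or from the paper.
TOY_VECTORS = {
    "good": (0.9, 0.1, 0.3),
    "great": (0.85, 0.15, 0.35),
    "bad": (0.8, 0.2, -0.3),
    "table": (-0.1, 0.9, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def filter_candidates(word, candidates, threshold=0.5):
    """Keep candidates whose similarity to `word` meets the threshold."""
    return [
        c for c in candidates
        if cosine(TOY_VECTORS[word], TOY_VECTORS[c]) >= threshold
    ]


def substitute(tokens, index, replacement):
    """Return a copy of `tokens` with the word at `index` replaced."""
    return tokens[:index] + [replacement] + tokens[index + 1:]


tokens = "the movie was good".split()
kept = filter_candidates("good", ["great", "bad", "table"])
perturbed = [" ".join(substitute(tokens, 3, c)) for c in kept]
print(kept)
print(perturbed)
```

With these made-up vectors the unrelated word "table" is rejected, but the antonym "bad" passes the threshold alongside "great", so the filter accepts a sentence whose meaning is reversed. This is only a contrived illustration of how a threshold-based constraint can miss an invalid replacement.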
Related papers
- SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [72.10931780019297]
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.
We propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH).
Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also better at preserving the quality of generation.
arXiv Detail & Related papers (2023-10-06T03:33:42Z)
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks [39.51297217854375]
We propose Text-CRS, a certified robustness framework for natural language processing (NLP) based on randomized smoothing.
We show that Text-CRS can address all four different word-level adversarial operations and achieve a significant accuracy improvement.
We also provide the first benchmark on certified accuracy and radius of four word-level operations, besides outperforming the state-of-the-art certification against synonym substitution attacks.
arXiv Detail & Related papers (2023-07-31T13:08:16Z)
- BOS at LSCDiscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection [0.48733623015338234]
We propose a solution for the LSCDiscovery shared task on Lexical Semantic Change Detection in Spanish.
Our approach is based on generating lexical substitutes that describe old and new senses of a given word.
arXiv Detail & Related papers (2022-06-07T11:40:29Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success rates and semantics rates by changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- Towards Robustness Against Natural Language Word Substitutions [87.56898475512703]
Robustness against word substitutions has a well-defined and widely acceptable form, using semantically similar words as substitutions.
Previous defense methods capture word substitutions in vector space by using either an $l_2$-ball or a hyper-rectangle.
arXiv Detail & Related papers (2021-07-28T17:55:08Z)
- Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality [126.55416118361495]
We release a new benchmark for lexical substitution, the task of finding appropriate substitutes for a target word in a context.
We use a context-free thesaurus to produce candidates and rely on human judgement to determine contextual appropriateness.
Compared to the previous largest benchmark, our Swords benchmark has 4.1x more substitutes per target word for the same level of quality, and its substitutes are 1.5x more appropriate (based on human judgement) for the same number of substitutes.
arXiv Detail & Related papers (2021-06-08T04:58:29Z)
- Certified Robustness to Text Adversarial Attacks by Randomized [MASK] [39.07743913719665]
We propose a certifiably robust defense method by randomly masking a certain proportion of the words in an input text.
The proposed method can defend against not only word substitution-based attacks, but also character-level perturbations.
We can certify the classifications of over 50% of texts to be robust to any perturbation of 5 words on AGNEWS and of 2 words on the SST2 dataset.
arXiv Detail & Related papers (2021-05-08T16:59:10Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
arXiv Detail & Related papers (2020-10-02T08:38:40Z)
- Reevaluating Adversarial Examples in Natural Language [20.14869834829091]
We analyze the outputs of two state-of-the-art synonym substitution attacks.
We find that their perturbations often do not preserve semantics, and 38% introduce grammatical errors.
With constraints adjusted to better preserve semantics and grammaticality, the attack success rate drops by over 70 percentage points.
arXiv Detail & Related papers (2020-04-25T03:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.