Rewriting Meaningful Sentences via Conditional BERT Sampling and an application on fooling text classifiers
- URL: http://arxiv.org/abs/2010.11869v1
- Date: Thu, 22 Oct 2020 17:03:13 GMT
- Title: Rewriting Meaningful Sentences via Conditional BERT Sampling and an application on fooling text classifiers
- Authors: Lei Xu, Ivan Ramirez, Kalyan Veeramachaneni
- Abstract summary: Most adversarial attack methods designed to deceive a text classifier change the classifier's prediction by modifying a few words or characters.
Few try to attack classifiers by rewriting a whole sentence, due to the difficulties inherent in sentence-level rephrasing as well as the problem of setting criteria for legitimate rewriting.
In this paper, we explore the problem of creating adversarial examples with sentence-level rewriting.
We propose a new criterion for modification, called a sentence-level threat model. This criterion allows for both word- and sentence-level changes, and can be adjusted independently in two dimensions: semantic similarity and grammatical quality.
- Score: 11.49508308643065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most adversarial attack methods that are designed to deceive a text
classifier change the text classifier's prediction by modifying a few words or
characters. Few try to attack classifiers by rewriting a whole sentence, due to
the difficulties inherent in sentence-level rephrasing as well as the problem
of setting the criteria for legitimate rewriting.
In this paper, we explore the problem of creating adversarial examples with
sentence-level rewriting. We design a new sampling method, named
ParaphraseSampler, to efficiently rewrite the original sentence in multiple
ways. Then we propose a new criterion for modification, called a sentence-level
threat model. This criterion allows for both word- and sentence-level changes,
and can be adjusted independently in two dimensions: semantic similarity and
grammatical quality. Experimental results show that many of these rewritten
sentences are misclassified by the classifier. On all 6 datasets, our
ParaphraseSampler achieves a better attack success rate than our baseline.
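The criterion described above can be pictured as a filter over rewritten candidates: a rewrite is admissible only if it clears independently adjustable thresholds on semantic similarity and grammatical quality. The sketch below is a minimal illustration of that idea, not the paper's implementation; both scoring functions are stand-in assumptions (the paper would use learned models for each dimension).

```python
# Hypothetical sketch of a sentence-level threat model: a rewritten
# candidate counts as a legitimate adversarial example only if it stays
# above thresholds on two independent dimensions. The scorers below are
# deliberately simple stand-ins, not the paper's actual models.

def semantic_similarity(original: str, rewrite: str) -> float:
    # Stand-in: Jaccard word overlap. A real system would use a learned
    # sentence encoder and cosine similarity instead.
    a, b = set(original.lower().split()), set(rewrite.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def grammatical_quality(rewrite: str) -> float:
    # Stand-in: always accept. A real system might use a language-model
    # acceptability or fluency score normalized to [0, 1].
    return 1.0

def passes_threat_model(original: str, rewrite: str,
                        sim_min: float = 0.5,
                        gram_min: float = 0.8) -> bool:
    """A rewrite is admissible iff both dimensions clear their
    (independently adjustable) thresholds."""
    return (semantic_similarity(original, rewrite) >= sim_min
            and grammatical_quality(rewrite) >= gram_min)

original = "the movie was wonderful"
candidates = ["the movie was absolutely wonderful",
              "completely unrelated text here"]
admissible = [c for c in candidates if passes_threat_model(original, c)]
```

Tightening `sim_min` restricts the attack to near-paraphrases, while loosening it admits more aggressive sentence-level rewrites; that trade-off is exactly what the two-dimensional criterion makes tunable.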
Related papers
- Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models [0.0]
We investigate the challenge of generating adversarial examples to test the robustness of text classification algorithms.
We focus on simulation of content moderation by setting realistic limits on the number of queries an attacker is allowed to attempt.
arXiv Detail & Related papers (2024-10-28T11:46:30Z)
- Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers [12.167426402230229]
A significant portion of adversarial examples generated by existing methods change only one word.
This single-word perturbation vulnerability represents a significant weakness in classifiers.
We present the SP-Attack, designed to exploit the single-word perturbation vulnerability, achieving a higher attack success rate.
We also propose SP-Defense, which aims to improve the robustness metric ρ by applying data augmentation during learning.
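The single-word perturbation vulnerability described in this entry can be illustrated with a brute-force sketch: try substituting one word at a time and keep the first candidate that flips the classifier. This is an illustrative toy, not the SP-Attack algorithm; the classifier and substitution table are hypothetical placeholders.

```python
# Illustrative sketch of a single-word perturbation attack (not the
# SP-Attack method itself): replace each word in turn with candidate
# substitutes and return the first rewrite that changes the prediction.

def single_word_attack(sentence, classify, substitutes):
    """Return a one-word rewrite that flips `classify`, or None."""
    words = sentence.split()
    label = classify(sentence)
    for i, word in enumerate(words):
        for sub in substitutes.get(word, []):
            candidate = " ".join(words[:i] + [sub] + words[i + 1:])
            if classify(candidate) != label:
                return candidate  # prediction flipped with one change
    return None  # classifier is robust to these single-word edits

# Toy classifier: "pos" iff the sentence contains the word "good".
toy = lambda s: "pos" if "good" in s else "neg"
subs = {"good": ["decent"], "film": ["movie"]}
print(single_word_attack("a good film", toy, subs))  # prints: a decent film
```

A defense in the spirit of SP-Defense would augment training data with such one-word variants so that no single substitution can flip the label.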
arXiv Detail & Related papers (2024-01-30T17:30:44Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes in the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use a graph GRU to encode the coherence relationship graph and obtain a coherence-aware representation for each sentence.
Our model can generate document paraphrases with greater diversity and semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success and semantic-preservation rates while changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- Improving Paraphrase Detection with the Adversarial Paraphrasing Task [0.0]
Paraphrasing datasets currently rely on a sense of paraphrase based on word overlap and syntax.
We introduce a new adversarial method of dataset creation for paraphrase identification: the Adversarial Paraphrasing Task (APT).
APT asks participants to generate semantically equivalent (in the sense of mutually implicative) but lexically and syntactically disparate paraphrases.
arXiv Detail & Related papers (2021-06-14T18:15:20Z) - Attacking Text Classifiers via Sentence Rewriting Sampler [12.25764838264699]
We propose a general sentence rewriting sampler (SRS) framework that can conditionally generate meaningful sentences.
Our method can effectively rewrite the original sentence in multiple ways while maintaining high semantic similarity and good sentence quality.
Our method achieves a better attack success rate on 4 out of 7 datasets, as well as significantly better sentence quality on all 7 datasets.
arXiv Detail & Related papers (2021-04-17T05:21:35Z) - ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification
Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
- Fact-aware Sentence Split and Rephrase with Permutation Invariant Training [93.66323661321113]
Sentence Split and Rephrase aims to break down a complex sentence into several simple sentences with its meaning preserved.
Previous studies tend to address the issue by seq2seq learning from parallel sentence pairs.
We introduce Permutation Training to verify the effects of order variance in seq2seq learning for this task.
arXiv Detail & Related papers (2020-01-16T07:30:19Z)
- Revisiting Paraphrase Question Generator using Pairwise Discriminator [25.449902612898594]
We propose a novel method for obtaining sentence-level embeddings.
The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis task.
arXiv Detail & Related papers (2019-12-31T02:46:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.