A Relaxed Optimization Approach for Adversarial Attacks against Neural Machine Translation Models
- URL: http://arxiv.org/abs/2306.08492v1
- Date: Wed, 14 Jun 2023 13:13:34 GMT
- Title: A Relaxed Optimization Approach for Adversarial Attacks against Neural Machine Translation Models
- Authors: Sahar Sadrizadeh, Clément Barbier, Ljiljana Dolamic, Pascal Frossard
- Abstract summary: We propose an optimization-based adversarial attack against Neural Machine Translation (NMT) models.
Experimental results show that our attack significantly degrades the translation quality of multiple NMT models.
Our attack outperforms the baselines in terms of success rate, similarity preservation, effect on translation quality, and token error rate.
- Score: 44.04452616807661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an optimization-based adversarial attack against
Neural Machine Translation (NMT) models. First, we propose an optimization
problem to generate adversarial examples that are semantically similar to the
original sentences but destroy the translation generated by the target NMT
model. This optimization problem is discrete, and we propose a continuous
relaxation to solve it. With this relaxation, we find a probability
distribution for each token in the adversarial example, and then we can
generate multiple adversarial examples by sampling from these distributions.
Experimental results show that our attack significantly degrades the
translation quality of multiple NMT models while maintaining the semantic
similarity between the original and adversarial sentences. Furthermore, our
attack outperforms the baselines in terms of success rate, similarity
preservation, effect on translation quality, and token error rate. Finally, we
propose a black-box extension of our attack by sampling from an optimized
probability distribution for a reference model whose gradients are accessible.
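The core loop of the relaxation can be sketched in a few lines. The following is a minimal toy illustration, not the authors' implementation: the embedding table, the dimensions, and `neg_translation_loss` are stand-ins for the target NMT model, and a plain embedding distance replaces the semantic-similarity term an actual attack would use.

```python
# Toy sketch of the relaxed attack: optimize a per-position distribution
# over the vocabulary, then sample discrete adversarial sentences from it.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, seq_len, emb_dim = 100, 8, 16
embedding = torch.randn(vocab_size, emb_dim)         # frozen toy embedding table
orig_ids = torch.randint(0, vocab_size, (seq_len,))  # toy "original sentence"
orig_emb = embedding[orig_ids]
w = torch.randn(emb_dim)                             # defines the toy loss below

def neg_translation_loss(soft_emb):
    # Stand-in for *minus* the target model's translation loss; minimizing
    # it maximizes the (toy) translation loss, i.e. degrades the output.
    return -(soft_emb @ w).mean()

logits = torch.zeros(seq_len, vocab_size, requires_grad=True)
with torch.no_grad():
    logits[torch.arange(seq_len), orig_ids] = 5.0    # start at the original sentence

opt = torch.optim.Adam([logits], lr=0.1)
for step in range(200):
    probs = F.softmax(logits, dim=-1)                  # continuous relaxation
    soft_emb = probs @ embedding                       # expected input embeddings
    sim_penalty = (soft_emb - orig_emb).pow(2).mean()  # stay close to the original
    loss = neg_translation_loss(soft_emb) + 1.0 * sim_penalty
    opt.zero_grad()
    loss.backward()
    opt.step()

# As in the abstract: sample several discrete adversarial candidates from
# the optimized per-token distributions.
dist = torch.distributions.Categorical(logits=logits.detach())
candidates = [dist.sample() for _ in range(5)]
```

The same sampling step is what makes the black-box extension possible: the distributions are optimized against a white-box reference model, and the sampled sentences are then used against the black-box target.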
Related papers
- A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation [66.58025084857556]
We introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier.
In our attack, the adversary aims to craft meaning-preserving adversarial examples whose translations belong to a different class than the original translations.
To evaluate the robustness of NMT models to our attack, we propose enhancements to existing black-box word-replacement-based attacks.
arXiv Detail & Related papers (2023-08-29T12:12:53Z)
- Boosting Adversarial Transferability by Achieving Flat Local Maxima [23.91315978193527]
Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives.
In this work, we assume and empirically validate that adversarial examples at a flat local region tend to have good transferability.
We propose an approximation optimization method to simplify the gradient update of the objective function.
arXiv Detail & Related papers (2023-06-08T14:21:02Z)
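As a rough illustration of the flat-region intuition in the entry above (not the paper's exact algorithm), one simple way to bias an attack toward flat maxima is to ascend a gradient averaged over a sampled neighborhood of the current perturbation; `toy_loss` is a stand-in for the attacked model's loss.

```python
# Toy sketch: averaging gradients over random neighbors favors updates
# that remain adversarial under small displacements (flat local regions).
import torch

torch.manual_seed(0)
x = torch.randn(16)                      # toy input to perturb
w = torch.randn(16)

def toy_loss(x_adv):
    return torch.sin(x_adv @ w)          # stand-in for the attacked model's loss

delta = torch.zeros_like(x, requires_grad=True)
lr, eps, radius, n_samples = 0.05, 0.3, 0.1, 8
for step in range(50):
    grad = torch.zeros_like(delta)
    for _ in range(n_samples):
        noise = radius * torch.randn_like(x)            # sample a neighbor
        grad += torch.autograd.grad(toy_loss(x + delta + noise), delta)[0]
    with torch.no_grad():
        delta += lr * (grad / n_samples).sign()         # averaged-gradient ascent
        delta.clamp_(-eps, eps)                         # stay in the L-inf ball
```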
- Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples [89.85593878754571]
The transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate to attack a Bayesian model for achieving desirable transferability.
Our method outperforms recent state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z)
- TransFool: An Adversarial Attack against Neural Machine Translation Models [49.50163349643615]
We investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool.
We generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples.
Based on automatic and human evaluations, TransFool improves on existing attacks in terms of success rate, semantic similarity, and fluency.
arXiv Detail & Related papers (2023-02-02T08:35:34Z)
- Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning [24.10329164911317]
We propose an approach named Multiple Asymptotically Normal Distribution Attacks (MultiANDA).
We approximate the posterior distribution over the perturbations by exploiting the asymptotic normality of stochastic gradient ascent (SGA).
Our proposed method outperforms ten state-of-the-art black-box attacks on deep learning models with or without defenses.
arXiv Detail & Related papers (2022-09-24T08:57:10Z)
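A minimal sketch of the idea summarized above, not the authors' code: treat the iterates of stochastic gradient ascent on the perturbation as approximately normal, fit their mean and a diagonal covariance, and sample an ensemble of perturbations; `toy_attack_loss` stands in for the attacked model's loss.

```python
# Toy sketch of MultiANDA-style ensembling from SGA iterates.
import torch

torch.manual_seed(0)
x = torch.randn(32)                      # toy input
w = torch.randn(32)

def toy_attack_loss(x_adv):
    return torch.tanh(x_adv @ w)         # stand-in for the loss the attacker ascends

delta = torch.zeros_like(x, requires_grad=True)
iterates = []
lr, eps = 0.05, 0.3
for step in range(100):
    loss = toy_attack_loss(x + delta + 0.1 * torch.randn_like(x))  # stochastic
    grad, = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        delta += lr * grad.sign()        # gradient *ascent* step
        delta.clamp_(-eps, eps)          # stay in the L-inf ball
    iterates.append(delta.detach().clone())

stack = torch.stack(iterates[50:])       # discard burn-in iterates
mean, std = stack.mean(0), stack.std(0)  # Gaussian fit (diagonal covariance)
ensemble = [(mean + std * torch.randn_like(std)).clamp(-eps, eps)
            for _ in range(10)]          # sampled perturbation ensemble
```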
- Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation [64.16077929617119]
We propose a new criterion for NMT adversarial examples based on Doubly Round-Trip Translation (DRTT).
To enhance the robustness of the NMT model, we introduce masked language models to construct bilingual adversarial pairs.
arXiv Detail & Related papers (2022-04-19T06:15:27Z)
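A toy sketch of a round-trip criterion in the spirit of DRTT as summarized above, not the authors' code: `translate` is a placeholder for real source-to-target and target-to-source NMT models, and token overlap is a placeholder for a proper similarity metric such as BLEU.

```python
# Toy sketch: a perturbation counts as adversarial when the original
# sentence survives the round trip but the perturbed sentence does not,
# i.e. the model (not the perturbation) destroyed the meaning.
def translate(sentence: str, direction: str) -> str:
    # Placeholder: a real implementation would call NMT models here.
    return sentence

def overlap(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def round_trip(sentence: str) -> str:
    return translate(translate(sentence, "src->tgt"), "tgt->src")

def is_adversarial(src: str, src_adv: str, thresh: float = 0.7) -> bool:
    return (overlap(src, round_trip(src)) >= thresh
            and overlap(src_adv, round_trip(src_adv)) < thresh)
```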
- Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation [8.822338727711715]
We generate adversarial augmentation samples that attack the model and preserve the source-side semantic meaning.
The results from our experiments show that these adversarial samples improve the model robustness.
arXiv Detail & Related papers (2021-10-12T02:23:00Z)
- BOSS: Bidirectional One-Shot Synthesis of Adversarial Examples [8.359029046999233]
A one-shot synthesis of adversarial examples is proposed in this paper.
The inputs are synthesized from scratch to induce arbitrary soft predictions at the output of pre-trained models.
We demonstrate the generality and versatility of the proposed framework through applications to the design of targeted adversarial attacks.
arXiv Detail & Related papers (2021-08-05T17:43:36Z)
- Gradient-based Adversarial Attacks against Text Transformers [96.73493433809419]
We propose the first general-purpose gradient-based attack against transformer models.
We empirically demonstrate that our white-box attack attains state-of-the-art attack performance on a variety of natural language tasks.
arXiv Detail & Related papers (2021-04-15T17:43:43Z)
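As a generic illustration of how such white-box gradients are typically used (not necessarily this paper's exact procedure), candidate token substitutions can be scored with a first-order approximation: the gradient of the loss with respect to the input embeddings, dotted with each candidate embedding; `toy_model_loss` is a stand-in for a transformer's task loss.

```python
# Toy sketch: first-order scoring of token substitutions from embedding gradients.
import torch

torch.manual_seed(0)
vocab_size, seq_len, emb_dim = 50, 6, 8
embedding = torch.nn.Embedding(vocab_size, emb_dim)

def toy_model_loss(emb):
    return emb.sum(dim=0).pow(2).sum()   # stand-in for the model's task loss

ids = torch.randint(0, vocab_size, (seq_len,))
emb = embedding(ids).detach().requires_grad_(True)
toy_model_loss(emb).backward()

# Estimated loss change of swapping token t for token v:
#   (e_v - e_t) . grad_t   (higher = larger expected loss increase)
with torch.no_grad():
    scores = emb.grad @ embedding.weight.T             # grad_t . e_v, [seq_len, vocab]
    scores -= (emb.grad * emb).sum(-1, keepdim=True)   # minus grad_t . e_t
    best_swaps = scores.argmax(dim=-1)                 # best substitute per position
```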