Bad Characters: Imperceptible NLP Attacks
- URL: http://arxiv.org/abs/2106.09898v1
- Date: Fri, 18 Jun 2021 03:42:56 GMT
- Title: Bad Characters: Imperceptible NLP Attacks
- Authors: Nicholas Boucher, Ilia Shumailov, Ross Anderson, Nicolas Papernot
- Abstract summary: A class of adversarial examples can be used to attack text-based models in a black-box setting.
We find that with a single imperceptible encoding injection an attacker can significantly reduce the performance of vulnerable models.
Our attacks work against currently-deployed commercial systems, including those produced by Microsoft and Google.
- Score: 16.357959724298745
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Several years of research have shown that machine-learning systems are
vulnerable to adversarial examples, both in theory and in practice. Until now,
such attacks have primarily targeted visual models, exploiting the gap between
human and machine perception. Although text-based models have also been
attacked with adversarial examples, such attacks struggled to preserve semantic
meaning and indistinguishability. In this paper, we explore a large class of
adversarial examples that can be used to attack text-based models in a
black-box setting without making any human-perceptible visual modification to
inputs. We use encoding-specific perturbations that are imperceptible to the
human eye to manipulate the outputs of a wide range of Natural Language
Processing (NLP) systems from neural machine-translation pipelines to web
search engines. We find that with a single imperceptible encoding injection --
representing one invisible character, homoglyph, reordering, or deletion -- an
attacker can significantly reduce the performance of vulnerable models, and
with three injections most models can be functionally broken. Our attacks work
against currently-deployed commercial systems, including those produced by
Microsoft and Google, in addition to open source models published by Facebook
and IBM. This novel series of attacks presents a significant threat to many
language processing systems: an attacker can affect systems in a targeted
manner without any assumptions about the underlying model. We conclude that
text-based NLP systems require careful input sanitization, just like
conventional applications, and that given such systems are now being deployed
rapidly at scale, the urgent attention of architects and operators is required.
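To make the attack vector concrete, here is a minimal sketch (ours, not the authors' released tooling) of two of the perturbation classes named in the abstract: invisible-character injection and homoglyph substitution. The example text and character choices are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's released tooling) of two of the
# perturbation classes named in the abstract: invisible-character injection and
# homoglyph substitution. Character choices and example text are assumptions.

ZERO_WIDTH_SPACE = "\u200b"  # renders as nothing in most fonts
CYRILLIC_HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440"}

def inject_invisible(text: str, position: int) -> str:
    """Insert an invisible code point at the given index."""
    return text[:position] + ZERO_WIDTH_SPACE + text[position:]

def swap_homoglyph(text: str) -> str:
    """Replace the first Latin character that has a Cyrillic look-alike."""
    for i, ch in enumerate(text):
        if ch in CYRILLIC_HOMOGLYPHS:
            return text[:i] + CYRILLIC_HOMOGLYPHS[ch] + text[i + 1:]
    return text

if __name__ == "__main__":
    original = "send the payment to alice"
    perturbed = swap_homoglyph(inject_invisible(original, 5))
    print(original == perturbed)          # False: the byte sequences differ
    print(len(original), len(perturbed))  # the invisible character adds one code point
    # Both strings render (near-)identically, but a tokenizer that has not
    # normalized its input may split them very differently.
```

Defensively, the input sanitization the paper calls for would amount to normalizing or stripping such code points (for example, applying Unicode normalization and removing zero-width and bidirectional control characters) before the text ever reaches the model.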
Related papers
- Undermining Image and Text Classification Algorithms Using Adversarial Attacks [0.0]
Our study addresses the gap by training various machine learning models and using GANs and SMOTE to generate additional data points aimed at attacking text classification models.
Our experiments reveal a significant vulnerability in classification models. Specifically, we observe a 20% decrease in accuracy for the top-performing text classification models post-attack, along with a 30% decrease in facial recognition accuracy.
arXiv Detail & Related papers (2024-11-03T18:44:28Z)
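The entry above mentions SMOTE (alongside GANs) for generating additional data points used in the attack. Purely as a hedged sketch, assuming dense feature vectors (e.g., TF-IDF) and the imbalanced-learn package, the SMOTE step might look like the following; the GAN branch, feature extraction, and attack evaluation are omitted.

```python
# Hedged sketch of the SMOTE augmentation step mentioned above; the feature
# matrix is a random stand-in for extracted text features, and the GAN-based
# generation and downstream attack evaluation are not shown.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))           # stand-in for TF-IDF-style features
y = np.array([0] * 150 + [1] * 50)       # imbalanced binary labels

X_aug, y_aug = SMOTE(random_state=0).fit_resample(X, y)
print(X_aug.shape, np.bincount(y_aug))   # minority class upsampled to parity
```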
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
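As a rough, hypothetical illustration of the masking mechanism described in the MASKDROID entry above (not the authors' implementation), the sketch below zeroes out a fraction of node features and trains a toy encoder-decoder, with mean-neighbour aggregation standing in for the GNN, to reconstruct the hidden features; dimensions and masking rate are assumptions.

```python
# Hypothetical masked-graph-reconstruction sketch in plain PyTorch; a single
# mean-aggregation layer stands in for the paper's GNN backbone.
import torch
import torch.nn as nn

class MaskedGraphAE(nn.Module):
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.encode = nn.Linear(in_dim, hid_dim)
        self.decode = nn.Linear(hid_dim, in_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(self.encode(adj @ x / deg))  # mean-aggregate neighbours
        return self.decode(h)

n_nodes, in_dim = 32, 16
x = torch.randn(n_nodes, in_dim)                    # node features
adj = (torch.rand(n_nodes, n_nodes) < 0.1).float()  # random adjacency (stand-in)
mask = torch.rand(n_nodes) < 0.3                    # hide ~30% of the nodes

model = MaskedGraphAE(in_dim, 8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x_masked = x.clone()
x_masked[mask] = 0.0                                # masked nodes see zero features
recon = model(x_masked, adj)
loss = ((recon - x) ** 2)[mask].mean()              # reconstruct what was hidden
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```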
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models [9.93052896330371]
We develop an evolution-based algorithm, SparseEvo, for the problem and evaluate it against both convolutional deep neural networks and vision transformers.
SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks.
Importantly, the query efficient SparseEvo, along with decision-based attacks, in general raise new questions regarding the safety of deployed systems.
arXiv Detail & Related papers (2022-01-31T21:10:47Z)
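The SparseEvo entry above concerns decision-based sparse attacks, where only the predicted label is observed. The toy sketch below conveys that objective with a simple greedy accept/reject search (a stand-in for the paper's evolutionary algorithm): it shrinks the set of perturbed coordinates while keeping the prediction flipped. The classifier and the starting adversarial input are synthetic stand-ins.

```python
# Toy decision-based sparse attack: shrink a binary mask of perturbed
# coordinates while the black-box prediction stays flipped. Greedy stand-in
# for an evolutionary search; classifier and inputs are synthetic.
import numpy as np

rng = np.random.default_rng(0)

def predict(x: np.ndarray) -> int:
    """Stand-in black-box classifier: only the label is observable."""
    return int(x.sum() > 0)

x_src = -np.abs(rng.normal(size=100))    # source input, classified as 0
x_start = np.abs(rng.normal(size=100))   # known adversarial input, classified as 1

mask = np.ones(100, dtype=bool)          # start by copying every coordinate
for _ in range(2000):
    cand = mask.copy()
    cand[rng.integers(0, 100)] = False   # try reverting one coordinate to the source
    if predict(np.where(cand, x_start, x_src)) == 1:
        mask = cand                      # still adversarial: keep the sparser mask

x_adv = np.where(mask, x_start, x_src)
print("perturbed coordinates:", int(mask.sum()), "label:", predict(x_adv))
```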
- Real-World Adversarial Examples involving Makeup Application [58.731070632586594]
We propose a physical adversarial attack with the use of full-face makeup.
Our attack can effectively overcome manual errors in makeup application, such as color and position-related errors.
arXiv Detail & Related papers (2021-09-04T05:29:28Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can avoid this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can access neither the model information nor the training set, and cannot query the model.
We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective.
Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Exploring the role of Input and Output Layers of a Deep Neural Network in Adversarial Defense [0.0]
It has been shown that certain inputs exist which would not normally fool a human but can completely mislead the model.
Such adversarial inputs pose a serious security threat when these models are used in real-world applications.
We have analyzed the resistance of three different classes of fully connected dense networks against rarely tested non-gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-06-02T06:15:46Z)
- Adversarial Machine Learning in Network Intrusion Detection Systems [6.18778092044887]
We study the nature of the adversarial problem in Network Intrusion Detection Systems.
We use evolutionary computation (particle swarm optimization and genetic algorithm) and deep learning (generative adversarial networks) as tools for adversarial example generation.
Our work highlights the vulnerability of machine learning based NIDS in the face of adversarial perturbation.
arXiv Detail & Related papers (2020-04-23T19:47:43Z)
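As a loose illustration of the particle-swarm side of the NIDS entry above, the sketch below searches for a small feature perturbation that lowers a stand-in detector's malicious score; the detector, feature ranges, and PSO constants are all assumptions rather than anything taken from the paper.

```python
# Hypothetical particle-swarm search for a small perturbation of flow features
# that lowers a stand-in detector's "malicious" score. All constants assumed.
import numpy as np

rng = np.random.default_rng(1)
dim, n_particles = 20, 30
x0 = rng.uniform(size=dim)               # original flow features (detected)

def malicious_score(x: np.ndarray) -> float:
    """Stand-in detector score: higher means 'more likely malicious'."""
    return float(x.sum())

def evaluate(perturbations: np.ndarray) -> np.ndarray:
    return np.array([malicious_score(np.clip(x0 + p, 0.0, 1.0)) for p in perturbations])

pos = rng.normal(scale=0.02, size=(n_particles, dim))   # candidate perturbations
vel = np.zeros_like(pos)
best_pos, best_val = pos.copy(), evaluate(pos)
g_best = best_pos[best_val.argmin()].copy()

for _ in range(100):
    r1, r2 = rng.uniform(size=pos.shape), rng.uniform(size=pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_best - pos)
    pos = np.clip(pos + vel, -0.1, 0.1)                 # keep the perturbation small
    vals = evaluate(pos)
    improved = vals < best_val
    best_pos[improved], best_val[improved] = pos[improved], vals[improved]
    g_best = best_pos[best_val.argmin()].copy()

print("score before:", round(malicious_score(x0), 3), "after:", round(float(best_val.min()), 3))
```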
- Firearm Detection and Segmentation Using an Ensemble of Semantic Neural Networks [62.997667081978825]
We present a weapon detection system based on an ensemble of semantic Convolutional Neural Networks.
A set of simpler neural networks dedicated to specific tasks requires less computational resources and can be trained in parallel.
The overall output of the system, given by aggregating the outputs of the individual networks, can be tuned by the user to trade off false positives against false negatives, as sketched after this entry.
arXiv Detail & Related papers (2020-02-11T13:58:16Z)
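The aggregation the entry above refers to could be as simple as thresholding the mean of the per-network confidences; the sketch below is a hypothetical illustration of that tunable trade-off, with made-up scores.

```python
# Hypothetical sketch of a tunable ensemble aggregation: average per-network
# confidences and compare against a user-chosen threshold that trades false
# positives against false negatives. Scores below are made up.
from statistics import mean

def ensemble_detect(scores: list[float], threshold: float = 0.5) -> bool:
    """Flag a weapon when the mean confidence exceeds the threshold."""
    return mean(scores) >= threshold

per_network_scores = [0.35, 0.62, 0.48]  # outputs of the individual networks
print(ensemble_detect(per_network_scores, threshold=0.4))  # lower threshold: more sensitive -> True
print(ensemble_detect(per_network_scores, threshold=0.6))  # higher threshold: more conservative -> False
```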