Bad Characters: Imperceptible NLP Attacks
- URL: http://arxiv.org/abs/2106.09898v1
- Date: Fri, 18 Jun 2021 03:42:56 GMT
- Title: Bad Characters: Imperceptible NLP Attacks
- Authors: Nicholas Boucher, Ilia Shumailov, Ross Anderson, Nicolas Papernot
- Abstract summary: A class of adversarial examples can be used to attack text-based models in a black-box setting.
We find that with a single imperceptible encoding injection an attacker can significantly reduce the performance of vulnerable models.
Our attacks work against currently-deployed commercial systems, including those produced by Microsoft and Google.
- Score: 16.357959724298745
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Several years of research have shown that machine-learning systems are
vulnerable to adversarial examples, both in theory and in practice. Until now,
such attacks have primarily targeted visual models, exploiting the gap between
human and machine perception. Although text-based models have also been
attacked with adversarial examples, such attacks struggled to preserve semantic
meaning and indistinguishability. In this paper, we explore a large class of
adversarial examples that can be used to attack text-based models in a
black-box setting without making any human-perceptible visual modification to
inputs. We use encoding-specific perturbations that are imperceptible to the
human eye to manipulate the outputs of a wide range of Natural Language
Processing (NLP) systems from neural machine-translation pipelines to web
search engines. We find that with a single imperceptible encoding injection --
representing one invisible character, homoglyph, reordering, or deletion -- an
attacker can significantly reduce the performance of vulnerable models, and
with three injections most models can be functionally broken. Our attacks work
against currently-deployed commercial systems, including those produced by
Microsoft and Google, in addition to open source models published by Facebook
and IBM. This novel series of attacks presents a significant threat to many
language processing systems: an attacker can affect systems in a targeted
manner without any assumptions about the underlying model. We conclude that
text-based NLP systems require careful input sanitization, just like
conventional applications, and that given such systems are now being deployed
rapidly at scale, the urgent attention of architects and operators is required.
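To make the attack vector concrete, here is a minimal sketch (ours, not the authors' released tooling) of two of the perturbation classes named in the abstract: invisible-character injection and homoglyph substitution. The example text and character choices are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's released tooling) of two of the
# perturbation classes named in the abstract: invisible-character injection and
# homoglyph substitution. Character choices and example text are assumptions.

ZERO_WIDTH_SPACE = "\u200b"  # renders as nothing in most fonts
CYRILLIC_HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440"}

def inject_invisible(text: str, position: int) -> str:
    """Insert an invisible code point at the given index."""
    return text[:position] + ZERO_WIDTH_SPACE + text[position:]

def swap_homoglyph(text: str) -> str:
    """Replace the first Latin character that has a Cyrillic look-alike."""
    for i, ch in enumerate(text):
        if ch in CYRILLIC_HOMOGLYPHS:
            return text[:i] + CYRILLIC_HOMOGLYPHS[ch] + text[i + 1:]
    return text

if __name__ == "__main__":
    original = "send the payment to alice"
    perturbed = swap_homoglyph(inject_invisible(original, 5))
    print(original == perturbed)          # False: the byte sequences differ
    print(len(original), len(perturbed))  # the invisible character adds one code point
    # Both strings render (near-)identically, but a tokenizer that has not
    # normalized its input may split them very differently.
```

Defensively, the input sanitization the paper calls for would amount to normalizing or stripping such code points (for example, applying Unicode normalization and removing zero-width and bidirectional control characters) before the text ever reaches the model.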
Related papers
- Undermining Image and Text Classification Algorithms Using Adversarial Attacks [0.0]
Our study addresses the gap by training various machine learning models and using GANs and SMOTE to generate additional data points aimed at attacking text classification models.
Our experiments reveal a significant vulnerability in classification models. Specifically, we observe a 20% decrease in accuracy for the top-performing text classification models post-attack, along with a 30% decrease in facial recognition accuracy.
arXiv Detail & Related papers (2024-11-03T18:44:28Z)
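The entry above mentions SMOTE (alongside GANs) for generating additional data points used in the attack. Purely as a hedged sketch, assuming dense feature vectors (e.g., TF-IDF) and the imbalanced-learn package, the SMOTE step might look like the following; the GAN branch, feature extraction, and attack evaluation are omitted.

```python
# Hedged sketch of the SMOTE augmentation step mentioned above; the feature
# matrix is a random stand-in for extracted text features, and the GAN-based
# generation and downstream attack evaluation are not shown.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))           # stand-in for TF-IDF-style features
y = np.array([0] * 150 + [1] * 50)       # imbalanced binary labels

X_aug, y_aug = SMOTE(random_state=0).fit_resample(X, y)
print(X_aug.shape, np.bincount(y_aug))   # minority class upsampled to parity
```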
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
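As a rough, hypothetical illustration of the masking mechanism described in the MASKDROID entry above (not the authors' implementation), the sketch below zeroes out a fraction of node features and trains a toy encoder-decoder, with mean-neighbour aggregation standing in for the GNN, to reconstruct the hidden features; dimensions and masking rate are assumptions.

```python
# Hypothetical masked-graph-reconstruction sketch in plain PyTorch; a single
# mean-aggregation layer stands in for the paper's GNN backbone.
import torch
import torch.nn as nn

class MaskedGraphAE(nn.Module):
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.encode = nn.Linear(in_dim, hid_dim)
        self.decode = nn.Linear(hid_dim, in_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(self.encode(adj @ x / deg))  # mean-aggregate neighbours
        return self.decode(h)

n_nodes, in_dim = 32, 16
x = torch.randn(n_nodes, in_dim)                    # node features
adj = (torch.rand(n_nodes, n_nodes) < 0.1).float()  # random adjacency (stand-in)
mask = torch.rand(n_nodes) < 0.3                    # hide ~30% of the nodes

model = MaskedGraphAE(in_dim, 8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x_masked = x.clone()
x_masked[mask] = 0.0                                # masked nodes see zero features
recon = model(x_masked, adj)
loss = ((recon - x) ** 2)[mask].mean()              # reconstruct what was hidden
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```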
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models [9.93052896330371]
We develop an evolution-based algorithm, SparseEvo, for the problem and evaluate it against both convolutional deep neural networks and vision transformers.
SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks.
Importantly, the query efficient SparseEvo, along with decision-based attacks, in general raise new questions regarding the safety of deployed systems.
arXiv Detail & Related papers (2022-01-31T21:10:47Z)
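The SparseEvo entry above concerns decision-based sparse attacks, where only the predicted label is observed. The toy sketch below conveys that objective with a simple greedy accept/reject search (a stand-in for the paper's evolutionary algorithm): it shrinks the set of perturbed coordinates while keeping the prediction flipped. The classifier and the starting adversarial input are synthetic stand-ins.

```python
# Toy decision-based sparse attack: shrink a binary mask of perturbed
# coordinates while the black-box prediction stays flipped. Greedy stand-in
# for an evolutionary search; classifier and inputs are synthetic.
import numpy as np

rng = np.random.default_rng(0)

def predict(x: np.ndarray) -> int:
    """Stand-in black-box classifier: only the label is observable."""
    return int(x.sum() > 0)

x_src = -np.abs(rng.normal(size=100))    # source input, classified as 0
x_start = np.abs(rng.normal(size=100))   # known adversarial input, classified as 1

mask = np.ones(100, dtype=bool)          # start by copying every coordinate
for _ in range(2000):
    cand = mask.copy()
    cand[rng.integers(0, 100)] = False   # try reverting one coordinate to the source
    if predict(np.where(cand, x_start, x_src)) == 1:
        mask = cand                      # still adversarial: keep the sparser mask

x_adv = np.where(mask, x_start, x_src)
print("perturbed coordinates:", int(mask.sum()), "label:", predict(x_adv))
```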
- Real-World Adversarial Examples involving Makeup Application [58.731070632586594]
We propose a physical adversarial attack with the use of full-face makeup.
Our attack can effectively overcome manual errors in makeup application, such as color and position-related errors.
arXiv Detail & Related papers (2021-09-04T05:29:28Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can avoid this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can access neither the model information nor the training set, and cannot query the model.
We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective.
Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Exploring the role of Input and Output Layers of a Deep Neural Network in Adversarial Defense [0.0]
It has been shown that certain inputs exist which would not normally fool a human but can completely mislead the model.
Such adversarial inputs pose a serious security threat when these models are used in real-world applications.
We have analyzed the resistance of three different classes of fully connected dense networks against rarely tested non-gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-06-02T06:15:46Z)
- Adversarial Machine Learning in Network Intrusion Detection Systems [6.18778092044887]
We study the nature of the adversarial problem in Network Intrusion Detection Systems.
We use evolutionary computation (particle swarm optimization and genetic algorithm) and deep learning (generative adversarial networks) as tools for adversarial example generation.
Our work highlights the vulnerability of machine learning based NIDS in the face of adversarial perturbation.
arXiv Detail & Related papers (2020-04-23T19:47:43Z)
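As a loose illustration of the particle-swarm side of the NIDS entry above, the sketch below searches for a small feature perturbation that lowers a stand-in detector's malicious score; the detector, feature ranges, and PSO constants are all assumptions rather than anything taken from the paper.

```python
# Hypothetical particle-swarm search for a small perturbation of flow features
# that lowers a stand-in detector's "malicious" score. All constants assumed.
import numpy as np

rng = np.random.default_rng(1)
dim, n_particles = 20, 30
x0 = rng.uniform(size=dim)               # original flow features (detected)

def malicious_score(x: np.ndarray) -> float:
    """Stand-in detector score: higher means 'more likely malicious'."""
    return float(x.sum())

def evaluate(perturbations: np.ndarray) -> np.ndarray:
    return np.array([malicious_score(np.clip(x0 + p, 0.0, 1.0)) for p in perturbations])

pos = rng.normal(scale=0.02, size=(n_particles, dim))   # candidate perturbations
vel = np.zeros_like(pos)
best_pos, best_val = pos.copy(), evaluate(pos)
g_best = best_pos[best_val.argmin()].copy()

for _ in range(100):
    r1, r2 = rng.uniform(size=pos.shape), rng.uniform(size=pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_best - pos)
    pos = np.clip(pos + vel, -0.1, 0.1)                 # keep the perturbation small
    vals = evaluate(pos)
    improved = vals < best_val
    best_pos[improved], best_val[improved] = pos[improved], vals[improved]
    g_best = best_pos[best_val.argmin()].copy()

print("score before:", round(malicious_score(x0), 3), "after:", round(float(best_val.min()), 3))
```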
- Firearm Detection and Segmentation Using an Ensemble of Semantic Neural Networks [62.997667081978825]
We present a weapon detection system based on an ensemble of semantic Convolutional Neural Networks.
A set of simpler neural networks dedicated to specific tasks requires less computational resources and can be trained in parallel.
The overall output of the system, given by aggregating the outputs of the individual networks, can be tuned by the user to trade off false positives against false negatives, as sketched after this entry.
arXiv Detail & Related papers (2020-02-11T13:58:16Z)
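The aggregation the entry above refers to could be as simple as thresholding the mean of the per-network confidences; the sketch below is a hypothetical illustration of that tunable trade-off, with made-up scores.

```python
# Hypothetical sketch of a tunable ensemble aggregation: average per-network
# confidences and compare against a user-chosen threshold that trades false
# positives against false negatives. Scores below are made up.
from statistics import mean

def ensemble_detect(scores: list[float], threshold: float = 0.5) -> bool:
    """Flag a weapon when the mean confidence exceeds the threshold."""
    return mean(scores) >= threshold

per_network_scores = [0.35, 0.62, 0.48]  # outputs of the individual networks
print(ensemble_detect(per_network_scores, threshold=0.4))  # lower threshold: more sensitive -> True
print(ensemble_detect(per_network_scores, threshold=0.6))  # higher threshold: more conservative -> False
```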