Defending against substitute model black box adversarial attacks with the 01 loss
- URL: http://arxiv.org/abs/2009.09803v1
- Date: Tue, 1 Sep 2020 22:32:51 GMT
- Title: Defending against substitute model black box adversarial attacks with the 01 loss
- Authors: Yunzhe Xue, Meiyan Xie, Usman Roshan
- Abstract summary: We present 01 loss linear and 01 loss dual layer neural network models as a defense against substitute model black box attacks.
Our work shows that 01 loss models offer a powerful defense against substitute model black box attacks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Substitute model black box attacks can create adversarial examples for a
target model just by accessing its output labels. This poses a major challenge
to machine learning models in practice, particularly in security sensitive
applications. The 01 loss model is known to be more robust to outliers and
noise than convex models that are typically used in practice. Motivated by
these properties we present 01 loss linear and 01 loss dual layer neural
network models as a defense against transfer based substitute model black box
attacks. We compare the accuracy of adversarial examples from substitute model
black box attacks targeting our 01 loss models and their convex counterparts
for binary classification on popular image benchmarks. Our 01 loss dual layer
neural network has an adversarial accuracy of 66.2%, 58%, 60.5%, and 57% on
MNIST, CIFAR10, STL10, and ImageNet respectively whereas the sigmoid activated
logistic loss counterpart has accuracies of 63.5%, 19.3%, 14.9%, and 27.6%.
Except for MNIST the convex counterparts have substantially lower adversarial
accuracies. We show practical applications of our models to deter traffic sign
and facial recognition adversarial attacks. On GTSRB street sign and CelebA
facial detection our 01 loss network has 34.6% and 37.1% adversarial accuracy
respectively, whereas the convex logistic counterpart has accuracies of 24% and 1.9%.
Finally we show that our 01 loss network can attain robustness on par with
simple convolutional neural networks and much higher than its convex
counterpart even when attacked with a convolutional network substitute model.
Our work shows that 01 loss models offer a powerful defense against substitute
model black box attacks.
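To make the defended models concrete, here is a minimal NumPy sketch of how the 01 loss of a linear model and of a dual layer sign activation network could be evaluated. The shapes and variable names are illustrative assumptions, not the authors' released code; in the paper's threat model an attacker sees only the sign outputs of such a model, trains a differentiable substitute on those labels, and transfers gradient-based adversarial examples back to the target.

```python
import numpy as np

def zero_one_loss(y_true, y_pred):
    """Fraction of misclassified examples: the (non-convex) 01 loss."""
    return np.mean(y_true != y_pred)

def linear_01_predict(X, w, b):
    """Linear 01 loss model: predict the sign of a linear projection."""
    return np.sign(X @ w + b)

def dual_layer_01_predict(X, W1, b1, w2, b2):
    """Dual layer sign activation network: a hidden layer of sign units
    followed by a sign output, for binary labels in {-1, +1}."""
    hidden = np.sign(X @ W1 + b1)
    return np.sign(hidden @ w2 + b2)

# Toy usage with random data (shapes are illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = np.sign(rng.normal(size=100))
W1, b1 = rng.normal(size=(20, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=8), rng.normal()
print("01 loss:", zero_one_loss(y, dual_layer_01_predict(X, W1, b1, w2, b2)))
```

The adversarial accuracies reported above are simply one minus this 01 loss measured on adversarial examples transferred from the substitute model.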
Related papers
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models [1.2499537119440245]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT)
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z)
- Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation [94.30136898739448]
We show the existence of a training-free adversarial perturbation under the no-box threat model.
Motivated by our observation that the high-frequency component (HFC) dominates in low-level features, we attack an image mainly by manipulating its frequency components.
Our method is even competitive with mainstream transfer-based black-box attacks.
arXiv Detail & Related papers (2022-03-09T09:51:00Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- Generating Unrestricted Adversarial Examples via Three Parameters [11.325135016306165]
A proposed adversarial attack generates an unrestricted adversarial example with a limited number of parameters.
It obtains an average success rate of 93.5% in terms of human evaluation on the MNIST and SVHN datasets.
It also reduces the model accuracy by an average of 73% on six datasets.
arXiv Detail & Related papers (2021-03-13T07:20:14Z)
- Defence against adversarial attacks using classical and quantum-enhanced Boltzmann machines [64.62510681492994]
Generative models attempt to learn the distribution underlying a dataset, making them inherently more robust to small perturbations.
We find improvements ranging from 5% to 72% against attacks with Boltzmann machines on the MNIST dataset.
arXiv Detail & Related papers (2020-12-21T19:00:03Z)
- Towards adversarial robustness with 01 loss neural networks [0.0]
We propose a hidden layer 01 loss neural network trained with stochastic coordinate descent as a defense against adversarial attacks in machine learning.
We compare the minimum distortion of the 01 loss network to the binarized neural network and the standard sigmoid activation network with cross-entropy loss.
Our work shows that the 01 loss network has the potential to defend against black box adversarial attacks better than convex loss and binarized networks.
arXiv Detail & Related papers (2020-08-20T18:18:49Z)
- Perceptual Adversarial Robustness: Defense Against Unseen Threat Models [58.47179090632039]
A key challenge in adversarial robustness is the lack of a precise mathematical characterization of human perception.
Under the neural perceptual threat model, we develop novel perceptual adversarial attacks and defenses.
Because the NPTM is very broad, we find that Perceptual Adversarial Training (PAT) against a perceptual attack gives robustness against many other types of adversarial attacks.
arXiv Detail & Related papers (2020-06-22T22:40:46Z)
- On the transferability of adversarial examples between convex and 01 loss models [0.0]
We study transferability of adversarial examples between linear 01 loss and convex (hinge) loss models.
We show how the discontinuity of the 01 loss makes adversarial examples non-transferable in a dual layer neural network.
We show that our dual layer sign activation network with 01 loss can attain robustness on par with simple convolutional networks.
arXiv Detail & Related papers (2020-06-14T04:51:45Z)
- DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
arXiv Detail & Related papers (2020-03-28T04:28:13Z)
- Robust binary classification with the 01 loss [0.0]
We develop a coordinate descent algorithm for a linear 01 loss and a single hidden layer 01 loss neural network.
We show our algorithms to be fast and comparable in accuracy to the linear support vector machine and logistic loss single hidden layer network for binary classification.
arXiv Detail & Related papers (2020-02-09T20:41:12Z)
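The last entry above describes a coordinate descent algorithm for the linear 01 loss, the same family of models used in the paper summarized on this page. Since the 01 loss is piecewise constant and has no useful gradient, a minimal sketch of such a search-based training loop, with assumed details such as random coordinate selection and a fixed grid of candidate step sizes (not the authors' algorithm), could look like this:

```python
import numpy as np

def zero_one_loss(X, y, w, b):
    """01 loss of a linear classifier: fraction of points with sign(Xw+b) != y."""
    return np.mean(np.sign(X @ w + b) != y)

def coordinate_descent_01(X, y, iters=2000, seed=0):
    """Randomized coordinate descent on the 01 loss of a linear model.
    One coordinate of (w, b) is perturbed at a time and a step is kept
    only if it does not increase the piecewise constant 01 loss."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    w, b = rng.normal(size=d), 0.0
    best = zero_one_loss(X, y, w, b)
    steps = np.array([-1.0, -0.1, 0.1, 1.0])  # assumed candidate step sizes
    for _ in range(iters):
        j = rng.integers(d + 1)               # pick a weight coordinate or the bias
        for s in steps:
            w_new, b_new = w.copy(), b
            if j < d:
                w_new[j] += s
            else:
                b_new += s
            loss = zero_one_loss(X, y, w_new, b_new)
            if loss <= best:                  # accept non-worsening moves
                w, b, best = w_new, b_new, loss
    return w, b, best

# Toy usage on linearly separable data (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = np.sign(X @ rng.normal(size=10))
w, b, err = coordinate_descent_01(X, y)
print("training 01 loss:", err)
```

Because training proceeds by discrete search rather than gradient steps, a smooth substitute need not recover the same decision boundary, which is consistent with the non-transferability argument made in the entries above.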
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.