Towards adversarial robustness with 01 loss neural networks
- URL: http://arxiv.org/abs/2008.09148v1
- Date: Thu, 20 Aug 2020 18:18:49 GMT
- Title: Towards adversarial robustness with 01 loss neural networks
- Authors: Yunzhe Xue, Meiyan Xie, Usman Roshan
- Abstract summary: We propose a single hidden layer 01 loss neural network trained with stochastic coordinate descent as a defense against adversarial attacks in machine learning.
We compare the minimum distortion of the 01 loss network to the binarized neural network and the standard sigmoid activation network with cross-entropy loss.
Our work shows that the 01 loss network has the potential to defend against black box adversarial attacks better than convex loss and binarized networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by the general robustness properties of the 01 loss, we propose a
single hidden layer 01 loss neural network trained with stochastic coordinate
descent as a defense against adversarial attacks in machine learning. One
measure of a model's robustness is the minimum distortion required to make the
input adversarial. This can be approximated with the Boundary Attack (Brendel
et al. 2018) and HopSkipJump (Chen et al. 2019) methods. We compare the
minimum distortion of the 01 loss network to the binarized neural network and
the standard sigmoid activation network with cross-entropy loss, all trained
with and without Gaussian noise, on the CIFAR10 benchmark binary classification
task between classes 0 and 1. Both with and without noise training, we find our 01
loss network to have the largest adversarial distortion of the three models by
non-trivial margins. To further validate these results, we subject all models to
substitute model black box attacks under different distortion thresholds and
find that the 01 loss network is the hardest to attack across all distortions.
At a distortion of 0.125 both sigmoid activated cross-entropy loss and
binarized networks have almost 0% accuracy on adversarial examples whereas the
01 loss network is at 40%. Even though both the 01 loss and binarized networks
use sign activations, their training algorithms are different, which in turn gives
different solutions for robustness. Finally, we compare our network to simple
convolutional models under substitute model black box attacks and find their
accuracies to be comparable. Our work shows that the 01 loss network has the
potential to defend against black box adversarial attacks better than convex
loss and binarized networks.
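To make the setup above concrete, the following is a minimal sketch, assuming a numpy implementation, of a single hidden layer sign activation network whose training objective is the 01 loss, optimized by a simple stochastic coordinate descent loop. The layer sizes, acceptance rule, and hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of a single hidden layer sign activation network trained
# on the 01 loss with stochastic coordinate descent (not the paper's code).
import numpy as np

def sign(z):
    # sign activation used by both the hidden and output units
    return np.where(z >= 0, 1, -1)

def predict(X, W, w_out):
    # X: (n, d) inputs; W: (h, d+1) hidden weights incl. bias; w_out: (h+1,) output weights incl. bias
    H = sign(np.hstack([X, np.ones((X.shape[0], 1))]) @ W.T)        # (n, h) hidden sign outputs
    return sign(np.hstack([H, np.ones((H.shape[0], 1))]) @ w_out)   # (n,) labels in {-1, +1}

def zero_one_loss(X, y, W, w_out):
    # fraction of misclassified points: the non-convex, non-differentiable 01 loss
    return np.mean(predict(X, W, w_out) != y)

def stochastic_coordinate_descent(X, y, hidden=20, iters=2000, step=0.1, seed=0):
    # Repeatedly perturb one randomly chosen weight coordinate and keep the move
    # only if it does not increase the 01 loss (a simplified acceptance rule).
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((hidden, d + 1))
    w_out = rng.standard_normal(hidden + 1)
    best = zero_one_loss(X, y, W, w_out)
    for _ in range(iters):
        if rng.random() < 0.5:                       # perturb a hidden-layer coordinate
            i, j = rng.integers(hidden), rng.integers(d + 1)
            old = W[i, j]
            W[i, j] += step * rng.standard_normal()
            loss = zero_one_loss(X, y, W, w_out)
            if loss <= best:
                best = loss
            else:
                W[i, j] = old                        # revert the move
        else:                                        # perturb an output-layer coordinate
            j = rng.integers(hidden + 1)
            old = w_out[j]
            w_out[j] += step * rng.standard_normal()
            loss = zero_one_loss(X, y, W, w_out)
            if loss <= best:
                best = loss
            else:
                w_out[j] = old
    return W, w_out, best
```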
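The minimum distortion measure discussed in the abstract can be illustrated with a simplified decision-based line search: starting from any input the model already misclassifies, move it as close to the clean input as possible while it stays adversarial. Boundary Attack and HopSkipJump refine this idea with random walks and gradient estimation; the sketch below only yields an upper bound on the true minimum distortion, and all names are assumptions.

```python
# Hedged illustration of estimating minimum adversarial distortion for a
# decision-based (label-only) model via binary search along a line segment.
import numpy as np

def min_distortion_line_search(predict_fn, x_clean, y_true, x_adv_start, steps=40):
    """predict_fn maps a batch of inputs to labels; x_adv_start must already be misclassified."""
    lo, hi = 0.0, 1.0          # interpolation coefficient toward the clean input
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        x_mid = (1 - mid) * x_adv_start + mid * x_clean
        if predict_fn(x_mid[None])[0] != y_true:
            lo = mid           # still adversarial: move closer to the clean input
        else:
            hi = mid           # crossed the decision boundary: back off
    x_boundary = (1 - lo) * x_adv_start + lo * x_clean
    return np.linalg.norm(x_boundary - x_clean)   # L2 distortion estimate
```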
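The substitute model black box attack used for evaluation can likewise be sketched: query the target model for labels, fit a differentiable surrogate on those labels, craft one-step FGSM perturbations on the surrogate at a distortion budget such as 0.125, and measure how often they fool the target. The surrogate below is plain logistic regression rather than the neural substitutes used in the paper, and all function names and hyperparameters are assumptions.

```python
# Simplified substitute-model black box attack: logistic-regression surrogate,
# FGSM on the surrogate, transfer evaluation on the black-box target.
import numpy as np

def train_substitute(X, y_target, lr=0.1, epochs=200):
    # y_target in {-1, +1} are labels obtained by querying the black-box model
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y_target * (X @ w + b)
        grad_coef = -y_target / (1.0 + np.exp(margins))  # per-example logistic loss gradient scale
        w -= lr * (X.T @ grad_coef) / len(X)
        b -= lr * grad_coef.mean()
    return w, b

def fgsm_on_substitute(X, y_target, w, b, eps=0.125):
    # One-step L_inf attack on the surrogate's loss, clipped to the valid pixel range
    margins = y_target * (X @ w + b)
    grad_coef = -y_target / (1.0 + np.exp(margins))
    grad_X = grad_coef[:, None] * w[None, :]             # gradient of the loss w.r.t. inputs
    return np.clip(X + eps * np.sign(grad_X), 0.0, 1.0)

def transfer_accuracy(target_predict_fn, X_adv, y_true):
    # Accuracy of the black-box target on adversarial examples crafted on the surrogate
    return np.mean(target_predict_fn(X_adv) == y_true)
```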
Related papers
- Accuracy of TextFooler black box adversarial attacks on 01 loss sign activation neural network ensemble [5.439020425819001]
Recent work has shown that 01 loss sign activation neural networks can defend against image classification adversarial attacks.
We ask the following question in this study: are 01 loss sign activation neural networks hard to deceive with a popular black box text adversarial attack program called TextFooler?
We find that our 01 loss sign activation network is much harder to attack with TextFooler compared to sigmoid activation cross entropy and binary neural networks.
arXiv Detail & Related papers (2024-02-12T00:36:34Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against ℓ∞ norm-bounded perturbations of size ε = 8/255.
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Robust Binary Models by Pruning Randomly-initialized Networks [57.03100916030444]
We propose ways to obtain robust models against adversarial attacks from randomly-initialized binary networks.
We learn the structure of the robust model by pruning a randomly-initialized binary network.
Our method confirms the strong lottery ticket hypothesis in the presence of adversarial attacks.
arXiv Detail & Related papers (2022-02-03T00:05:08Z)
- Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach [60.67748036747221]
Implicit neural networks offer competitive performance and reduced memory consumption.
However, they can remain brittle with respect to adversarial input perturbations.
This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks.
arXiv Detail & Related papers (2021-12-10T03:08:55Z)
- Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Defending against substitute model black box adversarial attacks with the 01 loss [0.0]
We present 01 loss linear and 01 loss dual layer neural network models as a defense against substitute model black box attacks.
Our work shows that 01 loss models offer a powerful defense against substitute model black box attacks.
arXiv Detail & Related papers (2020-09-01T22:32:51Z)
- On the transferability of adversarial examples between convex and 01 loss models [0.0]
We study transferability of adversarial examples between linear 01 loss and convex (hinge) loss models.
We show how the non-continuity of 01 loss makes adversaries non-transferable in a dual layer neural network.
We show that our dual layer sign activation network with 01 loss can attain robustness on par with simple convolutional networks.
arXiv Detail & Related papers (2020-06-14T04:51:45Z)
- Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle that we call Feature Purification: one of the causes of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network.
We present experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z)
- Robust binary classification with the 01 loss [0.0]
We develop a coordinate descent algorithm for a linear 01 loss and a single hidden layer 01 loss neural network.
We show our algorithms to be fast and comparable in accuracy to the linear support vector machine and logistic loss single hidden layer network for binary classification.
arXiv Detail & Related papers (2020-02-09T20:41:12Z)