Efficient and Robust Classification for Sparse Attacks
- URL: http://arxiv.org/abs/2201.09369v1
- Date: Sun, 23 Jan 2022 21:18:17 GMT
- Title: Efficient and Robust Classification for Sparse Attacks
- Authors: Mark Beliaev, Payam Delgosha, Hamed Hassani, Ramtin Pedarsani
- Abstract summary: We consider perturbations bounded by the $\ell_0$--norm, which have been shown as effective attacks in the domains of image-recognition, natural language processing, and malware-detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
- Score: 34.48667992227529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past two decades we have seen the popularity of neural networks
increase in conjunction with their classification accuracy. Parallel to this,
we have also witnessed how fragile the very same prediction models are: tiny
perturbations to the inputs can cause misclassification errors throughout
entire datasets. In this paper, we consider perturbations bounded by the
$\ell_0$--norm, which have been shown as effective attacks in the domains of
image-recognition, natural language processing, and malware-detection. To this
end, we propose a novel defense method that consists of "truncation" and
"adversarial training". We then theoretically study the Gaussian mixture
setting and prove the asymptotic optimality of our proposed classifier.
Motivated by the insights we obtain, we extend these components to neural
network classifiers. We conduct numerical experiments in the domain of computer
vision using the MNIST and CIFAR datasets, demonstrating significant
improvement for the robust classification error of neural networks.
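To make the abstract's "truncation" component concrete, the sketch below illustrates one plausible reading of the idea in the Gaussian mixture setting the paper studies: an $\ell_0$-bounded adversary can only corrupt a small number of coordinates, so the classifier discards the most extreme per-coordinate contributions before aggregating them into a decision score. The specific truncation rule (drop the $2k$ largest-magnitude contributions), the greedy sparse attack, and all parameter values are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of an l0-robust "truncated" linear classifier for a symmetric
# Gaussian mixture x ~ N(y * mu, sigma^2 I), y in {-1, +1}.
# NOTE: the truncation rule and the attack below are illustrative assumptions,
# not the exact procedure from the paper.
import numpy as np


def truncated_score(x, mu, k):
    """Sum per-coordinate contributions mu_i * x_i after dropping the 2k largest in magnitude."""
    contrib = mu * x
    order = np.argsort(np.abs(contrib))           # ascending by magnitude
    kept = order[: max(contrib.size - 2 * k, 0)]  # discard the 2k most extreme coordinates
    return contrib[kept].sum()


def truncated_classify(x, mu, k):
    return 1 if truncated_score(x, mu, k) >= 0 else -1


def greedy_sparse_attack(x, y, mu, k, strength=10.0):
    """Hypothetical l0 attack: overwrite the k coordinates that most support the true label y."""
    support = y * mu * x
    targets = np.argsort(support)[-k:]            # strongest evidence for the true class
    x_adv = x.copy()
    x_adv[targets] = -np.sign(mu[targets]) * y * strength
    return x_adv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, sigma = 1000, 20, 0.5
    mu = rng.standard_normal(d)
    mu /= np.linalg.norm(mu)                      # unit-norm mean direction
    y = 1
    x = y * mu + sigma * rng.standard_normal(d)
    x_adv = greedy_sparse_attack(x, y, mu, k)
    plain = 1 if (mu * x_adv).sum() >= 0 else -1  # undefended linear classifier
    robust = truncated_classify(x_adv, mu, k)     # truncated classifier
    print(f"true label: {y}, undefended: {plain}, truncated: {robust}")
```

The adversarial-training component would then train a network containing such a truncation operation on adversarially perturbed inputs; the sketch only covers the linear, Gaussian mixture case analyzed in the theory.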
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - Sparse and Transferable Universal Singular Vectors Attack [5.498495800909073]
We propose a novel sparse universal white-box adversarial attack.
Our approach is based on truncated power iteration, which provides sparsity to the $(p,q)$-singular vectors of the Jacobian matrices of the hidden layers.
Our findings demonstrate the vulnerability of state-of-the-art models to sparse attacks and highlight the importance of developing robust machine learning systems.
arXiv Detail & Related papers (2024-01-25T09:21:29Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial Detection [22.99930028876662]
Convolutional neural networks (CNNs) define the state-of-the-art solution for many perceptual tasks.
Current CNN approaches remain largely vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system.
We propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks.
arXiv Detail & Related papers (2022-12-13T17:51:32Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - An Orthogonal Classifier for Improving the Adversarial Robustness of Neural Networks [21.13588742648554]
Recent efforts have shown that imposing certain modifications on the classification layer can improve the robustness of neural networks.
We explicitly construct a dense orthogonal weight matrix whose entries have the same magnitude, leading to a novel robust classifier; a minimal sketch of such a matrix appears after this list.
Our method is efficient and competitive with many state-of-the-art defensive approaches.
arXiv Detail & Related papers (2021-05-19T13:12:14Z) - BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study the robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z) - Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z) - Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle that we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small, dense mixtures in the hidden weights during the training process of a neural network.
We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z) - Depth-2 Neural Networks Under a Data-Poisoning Attack [2.105564340986074]
We study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup.
In this work, we focus on doing supervised learning for a class of depth-2 finite-width neural networks.
arXiv Detail & Related papers (2020-05-04T17:56:15Z)
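As a concrete illustration of the orthogonal-classifier entry above (the one constructing "a dense orthogonal weight matrix whose entries have the same magnitude"), the sketch below builds such a matrix via the Sylvester (Hadamard) construction. This is an assumed, generic construction for illustration, not necessarily the one used in that paper.

```python
# Minimal sketch: a dense orthogonal matrix with equal-magnitude entries via the
# Sylvester (Hadamard) construction. Assumed for illustration only; the referenced
# paper's construction may differ.
import numpy as np


def sylvester_hadamard(n):
    """Return an n x n Hadamard matrix (entries +-1); n must be a power of two."""
    assert n > 0 and (n & (n - 1)) == 0, "n must be a power of two"
    H = np.ones((1, 1))
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H


n = 16
W = sylvester_hadamard(n) / np.sqrt(n)  # rows could serve as fixed class weight vectors
print(np.allclose(W @ W.T, np.eye(n)))  # orthogonal: True
print(np.unique(np.abs(W)))             # every entry has magnitude 1/sqrt(n) -> [0.25]
```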
This list is automatically generated from the titles and abstracts of the papers on this site.