Linking average- and worst-case perturbation robustness via class
selectivity and dimensionality
- URL: http://arxiv.org/abs/2010.07693v2
- Date: Mon, 29 Mar 2021 22:49:58 GMT
- Title: Linking average- and worst-case perturbation robustness via class
selectivity and dimensionality
- Authors: Matthew L. Leavitt, Ari Morcos
- Abstract summary: We investigate whether class selectivity confers robustness (or vulnerability) to perturbations of input data.
We found that networks regularized to have lower levels of class selectivity were more robust to average-case perturbations.
In contrast, class selectivity increases robustness to multiple types of worst-case perturbations.
- Score: 7.360807642941714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representational sparsity is known to affect robustness to input
perturbations in deep neural networks (DNNs), but less is known about how the
semantic content of representations affects robustness. Class selectivity, the
variability of a unit's responses across data classes or dimensions, is one way
of quantifying the sparsity of semantic representations. Given recent evidence
that class selectivity may not be necessary for, and in some cases can impair
generalization, we investigate whether it also confers robustness (or
vulnerability) to perturbations of input data. We found that networks
regularized to have lower levels of class selectivity were more robust to
average-case (naturalistic) perturbations, while networks with higher class
selectivity were more vulnerable. In contrast, class selectivity increases
robustness to multiple types of worst-case (i.e. white box adversarial)
perturbations, suggesting that while decreasing class selectivity is helpful
for average-case perturbations, it is harmful for worst-case perturbations. To
explain this difference, we studied the dimensionality of the networks'
representations: we found that the dimensionality of early-layer
representations is inversely proportional to a network's class selectivity, and
that adversarial samples cause a larger increase in early-layer dimensionality
than corrupted samples. Furthermore, the input-unit gradient is more variable
across samples and units in high-selectivity networks compared to
low-selectivity networks. These results lead to the conclusion that units
participate more consistently in low-selectivity regimes compared to
high-selectivity regimes, effectively creating a larger attack surface and
hence vulnerability to worst-case perturbations.
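Because the abstract leans on two quantities, per-unit class selectivity and representational dimensionality, a minimal NumPy sketch of plausible versions of both may help. The selectivity index follows the (mu_max - mu_-max)/(mu_max + mu_-max) form used in the authors' related papers listed below, and the dimensionality estimate is a generic participation ratio of the covariance eigenspectrum; both are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def class_selectivity_index(activations, labels, eps=1e-7):
    """Per-unit class selectivity: (mu_max - mu_-max) / (mu_max + mu_-max).

    activations: (n_samples, n_units) non-negative (e.g. post-ReLU) activations.
    labels:      (n_samples,) integer class labels.
    """
    classes = np.unique(labels)
    # Class-conditional mean activation of every unit: (n_classes, n_units)
    class_means = np.stack([activations[labels == c].mean(axis=0) for c in classes])
    mu_max = class_means.max(axis=0)                                  # preferred-class mean
    mu_rest = (class_means.sum(axis=0) - mu_max) / (len(classes) - 1)  # mean of the other classes
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)

def participation_ratio(activations):
    """Effective dimensionality: (sum of covariance eigenvalues)^2 / sum of squared eigenvalues."""
    cov = np.cov(activations, rowvar=False)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return eig.sum() ** 2 / (np.square(eig).sum() + 1e-12)

# Toy usage on random post-ReLU-like activations: 1000 samples, 64 units, 10 classes.
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(size=(1000, 64)))
labels = rng.integers(0, 10, size=1000)
print(class_selectivity_index(acts, labels).mean(), participation_ratio(acts))
```

On a real network these would be computed from layer activations over a held-out set and compared across layers and regularization strengths, as the abstract describes.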
Related papers
- Evaluating the Robustness of Deep-Learning Algorithm-Selection Models by Evolving Adversarial Instances [0.16874375111244325]
Deep neural networks (DNNs) are increasingly being used to perform algorithm selection in combinatorial optimisation domains.
We use an evolutionary algorithm (EA) to find perturbations of instances from two existing benchmarks for online bin packing that cause trained DRNs to misclassify.
Adversarial samples are successfully generated from up to 56% of the original instances, depending on the dataset.
arXiv Detail & Related papers (2024-06-24T12:48:44Z)
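The entry above mentions an evolutionary search for misclassifying perturbations. As a hedged illustration of the general idea only (not the paper's actual EA, instance encoding, or benchmarks), here is a minimal (1+1) evolution strategy that perturbs a numeric instance vector to maximize an arbitrary misclassification score; the function names and the toy score are assumptions.

```python
import numpy as np

def evolve_adversarial_instance(instance, misclassification_score, step=0.05,
                                n_generations=500, seed=0):
    """Minimal (1+1)-EA: mutate the instance with Gaussian noise and keep the
    mutant whenever it does not decrease the misclassification score. The score
    function stands in for 'how wrong the trained selector is on this instance'."""
    rng = np.random.default_rng(seed)
    best = np.array(instance, dtype=float)
    best_score = misclassification_score(best)
    for _ in range(n_generations):
        candidate = best + rng.normal(scale=step, size=best.shape)
        score = misclassification_score(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy usage: the "misclassification" score is maximized as the instance mean drifts toward 1.0.
toy_score = lambda x: -abs(x.mean() - 1.0)
perturbed, score = evolve_adversarial_instance(np.zeros(20), toy_score)
print(perturbed.mean(), score)
```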
- The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks [64.12052498909105]
We study the implications of the implicit bias of gradient flow on generalization and adversarial robustness in ReLU networks.
In two-layer ReLU networks, gradient flow is biased towards solutions that generalize well but are highly vulnerable to adversarial examples.
arXiv Detail & Related papers (2023-03-02T18:14:35Z)
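For readers unfamiliar with the worst-case perturbations the entry above and the main abstract refer to, here is a minimal PyTorch FGSM sketch against a two-layer ReLU network. The attack choice, architecture sizes, and data are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# A two-layer ReLU network of the kind the theory concerns (sizes chosen arbitrarily here).
net = nn.Sequential(nn.Linear(100, 500), nn.ReLU(), nn.Linear(500, 2))
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.1):
    """One-step FGSM: move each input by eps in the direction of sign(grad_x loss)."""
    x = x.clone().requires_grad_(True)
    loss_fn(net(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

x, y = torch.randn(8, 100), torch.randint(0, 2, (8,))
x_adv = fgsm(x, y)
clean_acc = (net(x).argmax(1) == y).float().mean()
adv_acc = (net(x_adv).argmax(1) == y).float().mean()
print(clean_acc.item(), adv_acc.item())
```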
- Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise between robustness and standard accuracy can be obtained.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
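The summary above does not say which criterion the data-selection strategy uses. One common and plausible choice, sketched here purely as an assumption (function name, criterion, and fraction are all illustrative), is to rank each mini-batch by per-sample loss and adversarially perturb only the hardest fraction.

```python
import torch
import torch.nn.functional as F

def select_hardest(model, x, y, frac=0.5):
    """Return indices of the `frac` highest-loss samples in a mini-batch; a
    selection rule of this kind could decide which samples get adversarially
    perturbed (the paper's actual criterion may differ)."""
    with torch.no_grad():
        per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
    k = max(1, int(frac * len(x)))
    return per_sample_loss.topk(k).indices

# Toy usage with a linear classifier on random data.
model = torch.nn.Linear(10, 3)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
print(select_hardest(model, x, y))
```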
- Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z)
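The summary gives no detail on what "truncation" means in that defense. As a hedged illustration only, the sketch below implements the generic keep-top-k-magnitudes operation that naturally pairs with $\ell_0$-bounded (sparse) perturbations; it is an assumption about the flavor of the defense, not the paper's definition.

```python
import torch

def truncate_top_k(z, k):
    """Zero out all but the k largest-magnitude entries in each row of z."""
    keep = z.abs().topk(k, dim=-1).indices
    mask = torch.zeros_like(z).scatter_(-1, keep, 1.0)
    return z * mask

# Toy usage: keep the 3 largest-magnitude coordinates per row.
z = torch.randn(2, 10)
print(truncate_top_k(z, 3))
```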
- SLA$^2$P: Self-supervised Anomaly Detection with Adversarial Perturbation [77.71161225100927]
Anomaly detection is a fundamental yet challenging problem in machine learning.
We propose a novel and powerful framework, dubbed SLA$^2$P, for unsupervised anomaly detection.
arXiv Detail & Related papers (2021-11-25T03:53:43Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
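One way to "randomly eliminate certain class information in each training iteration", sketched here as an assumption rather than the paper's exact scheme, is to drop a random subset of classes from the segmentation loss at each step.

```python
import torch
import torch.nn.functional as F

def masked_segmentation_loss(logits, target, drop_prob=0.2, ignore_index=255):
    """Cross-entropy over a segmentation map with a random subset of classes
    excluded from this iteration's loss (one plausible reading of 'eliminating
    class information'; the paper's mechanism may differ)."""
    n_classes = logits.shape[1]
    drop = torch.rand(n_classes) < drop_prob            # classes dropped this iteration
    target = target.clone()
    for c in torch.nonzero(drop).flatten().tolist():
        target[target == c] = ignore_index              # exclude dropped classes from the loss
    return F.cross_entropy(logits, target, ignore_index=ignore_index)

# Toy usage: batch of 2, 21 classes, 64x64 maps.
logits = torch.randn(2, 21, 64, 64)
target = torch.randint(0, 21, (2, 64, 64))
print(masked_segmentation_loss(logits, target))
```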
- Analyzing Overfitting under Class Imbalance in Neural Networks for Image Segmentation [19.259574003403998]
In image segmentation, neural networks may overfit to the foreground samples from small structures.
In this study, we provide new insights on the problem of overfitting under class imbalance by inspecting the network behavior.
arXiv Detail & Related papers (2021-02-20T14:57:58Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
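As a hedged sketch of the uncertainty half of such an active-learning strategy (the adversarial and diversity components of the paper's method are not shown), selecting the unlabeled samples with the highest predictive entropy looks like this:

```python
import torch
import torch.nn.functional as F

def select_most_uncertain(model, unlabeled_x, budget):
    """Pick the `budget` unlabeled samples whose predictive distribution has the
    highest entropy; these are the candidates sent to the human annotator."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices

# Toy usage with a linear classifier and a random unlabeled pool.
model = torch.nn.Linear(10, 3)
pool = torch.randn(100, 10)
print(select_most_uncertain(model, pool, budget=5))
```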
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- On the relationship between class selectivity, dimensionality, and robustness [25.48362370177062]
We investigate whether class selectivity confers robustness (or vulnerability) to perturbations of input data.
We found that mean class selectivity predicts vulnerability to naturalistic corruptions.
We found that class selectivity increases robustness to multiple types of gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-07-08T21:24:45Z)
- Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs [7.360807642941714]
We investigate the causal impact of class selectivity on network function by directly regularizing for or against class selectivity.
Using this regularizer to reduce class selectivity across units in convolutional neural networks increased test accuracy by over 2% for ResNet18 trained on Tiny ImageNet.
For ResNet20 trained on CIFAR10, we could reduce class selectivity by a factor of 2.5 with no impact on test accuracy, and reduce it nearly to zero with only a small ($\sim$2%) drop in test accuracy.
arXiv Detail & Related papers (2020-03-03T00:22:37Z)
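A hedged sketch of how such a class-selectivity regularizer might be attached to a standard loss, reusing the selectivity-index form from the sketch after the main abstract: the function names, the weighting, and the sign convention (here, negative alpha discourages selectivity and positive alpha promotes it) are assumptions to be checked against the paper.

```python
import torch
import torch.nn.functional as F

def mean_selectivity(activations, labels, n_classes, eps=1e-7):
    """Differentiable batch estimate of mean class selectivity over units,
    computed from post-ReLU activations of shape (batch, units)."""
    onehot = F.one_hot(labels, n_classes).float()                  # (batch, classes)
    counts = onehot.sum(dim=0).clamp_min(1.0)                      # samples per class
    class_means = onehot.t() @ activations / counts.unsqueeze(1)   # (classes, units)
    mu_max, _ = class_means.max(dim=0)
    mu_rest = (class_means.sum(dim=0) - mu_max) / max(n_classes - 1, 1)
    return ((mu_max - mu_rest) / (mu_max + mu_rest + eps)).mean()

def regularized_loss(logits, labels, activations, n_classes, alpha=-1.0):
    """Cross-entropy minus alpha * mean selectivity (sign convention assumed)."""
    return F.cross_entropy(logits, labels) - alpha * mean_selectivity(activations, labels, n_classes)

# Toy usage: 32 samples, 64 hidden units, 10 classes.
feats = torch.relu(torch.randn(32, 64))
logits = torch.randn(32, 10, requires_grad=True)
labels = torch.randint(0, 10, (32,))
print(regularized_loss(logits, labels, feats, n_classes=10))
```

In practice the activations would come from the regularized layers of the network itself, so the selectivity term shapes the learned representations rather than a fixed feature batch.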
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.