Linking average- and worst-case perturbation robustness via class
selectivity and dimensionality
- URL: http://arxiv.org/abs/2010.07693v2
- Date: Mon, 29 Mar 2021 22:49:58 GMT
- Title: Linking average- and worst-case perturbation robustness via class
selectivity and dimensionality
- Authors: Matthew L. Leavitt, Ari Morcos
- Abstract summary: We investigate whether class selectivity confers robustness (or vulnerability) to perturbations of input data.
We found that networks regularized to have lower levels of class selectivity were more robust to average-case perturbations.
In contrast, class selectivity increases robustness to multiple types of worst-case perturbations.
- Score: 7.360807642941714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representational sparsity is known to affect robustness to input
perturbations in deep neural networks (DNNs), but less is known about how the
semantic content of representations affects robustness. Class selectivity, the
variability of a unit's responses across data classes or dimensions, is one way
of quantifying the sparsity of semantic representations. Given recent evidence
that class selectivity may not be necessary for, and in some cases can impair
generalization, we investigate whether it also confers robustness (or
vulnerability) to perturbations of input data. We found that networks
regularized to have lower levels of class selectivity were more robust to
average-case (naturalistic) perturbations, while networks with higher class
selectivity were more vulnerable. In contrast, class selectivity increases
robustness to multiple types of worst-case (i.e. white box adversarial)
perturbations, suggesting that while decreasing class selectivity is helpful
for average-case perturbations, it is harmful for worst-case perturbations. To
explain this difference, we studied the dimensionality of the networks'
representations: we found that the dimensionality of early-layer
representations is inversely proportional to a network's class selectivity, and
that adversarial samples cause a larger increase in early-layer dimensionality
than corrupted samples. Furthermore, the input-unit gradient is more variable
across samples and units in high-selectivity networks compared to
low-selectivity networks. These results lead to the conclusion that units
participate more consistently in low-selectivity regimes compared to
high-selectivity regimes, effectively creating a larger attack surface and
hence vulnerability to worst-case perturbations.
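Because the abstract leans on two quantities, per-unit class selectivity and representational dimensionality, a minimal NumPy sketch of plausible versions of both may help. The selectivity index follows the (mu_max - mu_-max)/(mu_max + mu_-max) form used in the authors' related papers listed below, and the dimensionality estimate is a generic participation ratio of the covariance eigenspectrum; both are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def class_selectivity_index(activations, labels, eps=1e-7):
    """Per-unit class selectivity: (mu_max - mu_-max) / (mu_max + mu_-max).

    activations: (n_samples, n_units) non-negative (e.g. post-ReLU) activations.
    labels:      (n_samples,) integer class labels.
    """
    classes = np.unique(labels)
    # Class-conditional mean activation of every unit: (n_classes, n_units)
    class_means = np.stack([activations[labels == c].mean(axis=0) for c in classes])
    mu_max = class_means.max(axis=0)                                  # preferred-class mean
    mu_rest = (class_means.sum(axis=0) - mu_max) / (len(classes) - 1)  # mean of the other classes
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)

def participation_ratio(activations):
    """Effective dimensionality: (sum of covariance eigenvalues)^2 / sum of squared eigenvalues."""
    cov = np.cov(activations, rowvar=False)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return eig.sum() ** 2 / (np.square(eig).sum() + 1e-12)

# Toy usage on random post-ReLU-like activations: 1000 samples, 64 units, 10 classes.
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(size=(1000, 64)))
labels = rng.integers(0, 10, size=1000)
print(class_selectivity_index(acts, labels).mean(), participation_ratio(acts))
```

On a real network these would be computed from layer activations over a held-out set and compared across layers and regularization strengths, as the abstract describes.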
Related papers
- Evaluating the Robustness of Deep-Learning Algorithm-Selection Models by Evolving Adversarial Instances [0.16874375111244325]
Deep neural networks (DNNs) are increasingly being used to perform algorithm selection in combinatorial optimisation domains.
We use an evolutionary algorithm (EA) to find perturbations of instances from two existing benchmarks for online bin packing that cause trained DRNs to misclassify.
Adversarial samples are successfully generated from up to 56% of the original instances, depending on the dataset.
arXiv Detail & Related papers (2024-06-24T12:48:44Z)
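The entry above mentions an evolutionary search for misclassifying perturbations. As a hedged illustration of the general idea only (not the paper's actual EA, instance encoding, or benchmarks), here is a minimal (1+1) evolution strategy that perturbs a numeric instance vector to maximize an arbitrary misclassification score; the function names and the toy score are assumptions.

```python
import numpy as np

def evolve_adversarial_instance(instance, misclassification_score, step=0.05,
                                n_generations=500, seed=0):
    """Minimal (1+1)-EA: mutate the instance with Gaussian noise and keep the
    mutant whenever it does not decrease the misclassification score. The score
    function stands in for 'how wrong the trained selector is on this instance'."""
    rng = np.random.default_rng(seed)
    best = np.array(instance, dtype=float)
    best_score = misclassification_score(best)
    for _ in range(n_generations):
        candidate = best + rng.normal(scale=step, size=best.shape)
        score = misclassification_score(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy usage: the "misclassification" score is maximized as the instance mean drifts toward 1.0.
toy_score = lambda x: -abs(x.mean() - 1.0)
perturbed, score = evolve_adversarial_instance(np.zeros(20), toy_score)
print(perturbed.mean(), score)
```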
- The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks [64.12052498909105]
We study the implications of the implicit bias of gradient flow on generalization and adversarial robustness in ReLU networks.
In two-layer ReLU networks, gradient flow is biased towards solutions that generalize well but are highly vulnerable to adversarial examples.
arXiv Detail & Related papers (2023-03-02T18:14:35Z)
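For readers unfamiliar with the worst-case perturbations the entry above and the main abstract refer to, here is a minimal PyTorch FGSM sketch against a two-layer ReLU network. The attack choice, architecture sizes, and data are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# A two-layer ReLU network of the kind the theory concerns (sizes chosen arbitrarily here).
net = nn.Sequential(nn.Linear(100, 500), nn.ReLU(), nn.Linear(500, 2))
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.1):
    """One-step FGSM: move each input by eps in the direction of sign(grad_x loss)."""
    x = x.clone().requires_grad_(True)
    loss_fn(net(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

x, y = torch.randn(8, 100), torch.randint(0, 2, (8,))
x_adv = fgsm(x, y)
clean_acc = (net(x).argmax(1) == y).float().mean()
adv_acc = (net(x_adv).argmax(1) == y).float().mean()
print(clean_acc.item(), adv_acc.item())
```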
- Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise between robustness and standard accuracy can be obtained.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
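The summary above does not say which criterion the data-selection strategy uses. One common and plausible choice, sketched here purely as an assumption (function name, criterion, and fraction are all illustrative), is to rank each mini-batch by per-sample loss and adversarially perturb only the hardest fraction.

```python
import torch
import torch.nn.functional as F

def select_hardest(model, x, y, frac=0.5):
    """Return indices of the `frac` highest-loss samples in a mini-batch; a
    selection rule of this kind could decide which samples get adversarially
    perturbed (the paper's actual criterion may differ)."""
    with torch.no_grad():
        per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
    k = max(1, int(frac * len(x)))
    return per_sample_loss.topk(k).indices

# Toy usage with a linear classifier on random data.
model = torch.nn.Linear(10, 3)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
print(select_hardest(model, x, y))
```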
- Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z)
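The summary gives no detail on what "truncation" means in that defense. As a hedged illustration only, the sketch below implements the generic keep-top-k-magnitudes operation that naturally pairs with $\ell_0$-bounded (sparse) perturbations; it is an assumption about the flavor of the defense, not the paper's definition.

```python
import torch

def truncate_top_k(z, k):
    """Zero out all but the k largest-magnitude entries in each row of z."""
    keep = z.abs().topk(k, dim=-1).indices
    mask = torch.zeros_like(z).scatter_(-1, keep, 1.0)
    return z * mask

# Toy usage: keep the 3 largest-magnitude coordinates per row.
z = torch.randn(2, 10)
print(truncate_top_k(z, 3))
```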
- SLA$^2$P: Self-supervised Anomaly Detection with Adversarial Perturbation [77.71161225100927]
Anomaly detection is a fundamental yet challenging problem in machine learning.
We propose a novel and powerful framework, dubbed SLA$^2$P, for unsupervised anomaly detection.
arXiv Detail & Related papers (2021-11-25T03:53:43Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
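One way to "randomly eliminate certain class information in each training iteration", sketched here as an assumption rather than the paper's exact scheme, is to drop a random subset of classes from the segmentation loss at each step.

```python
import torch
import torch.nn.functional as F

def masked_segmentation_loss(logits, target, drop_prob=0.2, ignore_index=255):
    """Cross-entropy over a segmentation map with a random subset of classes
    excluded from this iteration's loss (one plausible reading of 'eliminating
    class information'; the paper's mechanism may differ)."""
    n_classes = logits.shape[1]
    drop = torch.rand(n_classes) < drop_prob            # classes dropped this iteration
    target = target.clone()
    for c in torch.nonzero(drop).flatten().tolist():
        target[target == c] = ignore_index              # exclude dropped classes from the loss
    return F.cross_entropy(logits, target, ignore_index=ignore_index)

# Toy usage: batch of 2, 21 classes, 64x64 maps.
logits = torch.randn(2, 21, 64, 64)
target = torch.randint(0, 21, (2, 64, 64))
print(masked_segmentation_loss(logits, target))
```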
- Analyzing Overfitting under Class Imbalance in Neural Networks for Image Segmentation [19.259574003403998]
In image segmentation, neural networks may overfit to the foreground samples from small structures.
In this study, we provide new insights on the problem of overfitting under class imbalance by inspecting the network behavior.
arXiv Detail & Related papers (2021-02-20T14:57:58Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
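As a hedged sketch of the uncertainty half of such an active-learning strategy (the adversarial and diversity components of the paper's method are not shown), selecting the unlabeled samples with the highest predictive entropy looks like this:

```python
import torch
import torch.nn.functional as F

def select_most_uncertain(model, unlabeled_x, budget):
    """Pick the `budget` unlabeled samples whose predictive distribution has the
    highest entropy; these are the candidates sent to the human annotator."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices

# Toy usage with a linear classifier and a random unlabeled pool.
model = torch.nn.Linear(10, 3)
pool = torch.randn(100, 10)
print(select_most_uncertain(model, pool, budget=5))
```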
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- On the relationship between class selectivity, dimensionality, and robustness [25.48362370177062]
We investigate whether class selectivity confers robustness (or vulnerability) to perturbations of input data.
We found that mean class selectivity predicts vulnerability to naturalistic corruptions.
We found that class selectivity increases robustness to multiple types of gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-07-08T21:24:45Z)
- Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs [7.360807642941714]
We investigate the causal impact of class selectivity on network function by directly regularizing for or against class selectivity.
Using this regularizer to reduce class selectivity across units in convolutional neural networks increased test accuracy by over 2% for ResNet18 trained on Tiny ImageNet.
For ResNet20 trained on CIFAR10, we could reduce class selectivity by a factor of 2.5 with no impact on test accuracy, and reduce it nearly to zero with only a small ($\sim$2%) drop in test accuracy.
arXiv Detail & Related papers (2020-03-03T00:22:37Z)
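A hedged sketch of how such a class-selectivity regularizer might be attached to a standard loss, reusing the selectivity-index form from the sketch after the main abstract: the function names, the weighting, and the sign convention (here, negative alpha discourages selectivity and positive alpha promotes it) are assumptions to be checked against the paper.

```python
import torch
import torch.nn.functional as F

def mean_selectivity(activations, labels, n_classes, eps=1e-7):
    """Differentiable batch estimate of mean class selectivity over units,
    computed from post-ReLU activations of shape (batch, units)."""
    onehot = F.one_hot(labels, n_classes).float()                  # (batch, classes)
    counts = onehot.sum(dim=0).clamp_min(1.0)                      # samples per class
    class_means = onehot.t() @ activations / counts.unsqueeze(1)   # (classes, units)
    mu_max, _ = class_means.max(dim=0)
    mu_rest = (class_means.sum(dim=0) - mu_max) / max(n_classes - 1, 1)
    return ((mu_max - mu_rest) / (mu_max + mu_rest + eps)).mean()

def regularized_loss(logits, labels, activations, n_classes, alpha=-1.0):
    """Cross-entropy minus alpha * mean selectivity (sign convention assumed)."""
    return F.cross_entropy(logits, labels) - alpha * mean_selectivity(activations, labels, n_classes)

# Toy usage: 32 samples, 64 hidden units, 10 classes.
feats = torch.relu(torch.randn(32, 64))
logits = torch.randn(32, 10, requires_grad=True)
labels = torch.randint(0, 10, (32,))
print(regularized_loss(logits, labels, feats, n_classes=10))
```

In practice the activations would come from the regularized layers of the network itself, so the selectivity term shapes the learned representations rather than a fixed feature batch.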
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.