An Analysis of Robustness of Non-Lipschitz Networks
- URL: http://arxiv.org/abs/2010.06154v4
- Date: Tue, 18 Apr 2023 18:16:11 GMT
- Title: An Analysis of Robustness of Non-Lipschitz Networks
- Authors: Maria-Florina Balcan and Avrim Blum and Dravyansh Sharma and Hongyang Zhang
- Abstract summary: Small input perturbations can often produce large movements in the network's final-layer feature space.
In our model, the adversary may move data an arbitrary distance in feature space but only in random low-dimensional subspaces.
We provide theoretical guarantees for setting algorithm parameters to optimize over accuracy-abstention trade-offs using data-driven methods.
- Score: 35.64511156980701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant advances, deep networks remain highly susceptible to
adversarial attack. One fundamental challenge is that small input perturbations
can often produce large movements in the network's final-layer feature space.
In this paper, we define an attack model that abstracts this challenge, to help
understand its intrinsic properties. In our model, the adversary may move data
an arbitrary distance in feature space but only in random low-dimensional
subspaces. We prove such adversaries can be quite powerful: defeating any
algorithm that must classify any input it is given. However, by allowing the
algorithm to abstain on unusual inputs, we show such adversaries can be
overcome when classes are reasonably well-separated in feature space. We
further provide strong theoretical guarantees for setting algorithm parameters
to optimize over accuracy-abstention trade-offs using data-driven methods. Our
results provide new robustness guarantees for nearest-neighbor style
algorithms, and also have application to contrastive learning, where we
empirically demonstrate the ability of such algorithms to obtain high robust
accuracy with low abstention rates. Our model is also motivated by strategic
classification, where entities being classified aim to manipulate their
observable features to produce a preferred classification, and we provide new
insights into that area as well.
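To make the abstention mechanism concrete, here is a minimal sketch of a nearest-neighbor-style classifier in feature space that abstains on unusual inputs. It illustrates the general idea rather than the paper's exact algorithm; the abstention threshold `tau` and the Euclidean metric are assumptions of this example.

```python
import numpy as np

def nn_classify_with_abstention(train_feats, train_labels, test_feats, tau):
    """1-nearest-neighbor in feature space, abstaining on unusual inputs.

    Each test point is labeled by its nearest training example, but if that
    example is farther away than the threshold `tau`, the classifier abstains
    (returns None) rather than risk a confident mistake on a point the
    adversary may have moved far along a random low-dimensional subspace.
    """
    preds = []
    for x in test_feats:
        dists = np.linalg.norm(train_feats - x, axis=1)
        i = np.argmin(dists)
        preds.append(train_labels[i] if dists[i] <= tau else None)
    return preds

# Toy usage: two well-separated classes in a 2-D feature space.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)
test = np.array([[0.05, 0.0],    # near class 0 -> predicted 0
                 [10.0, 10.0]])  # far from all data -> abstain
print(nn_classify_with_abstention(train, labels, test, tau=1.0))
```

Raising `tau` lowers the abstention rate but admits more adversarially displaced points; tuning that trade-off is what the paper's data-driven parameter-setting guarantees address.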
Related papers
- Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled, and only a small portion is annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z)
- Sparse and Transferable Universal Singular Vectors Attack [5.498495800909073]
We propose a novel sparse universal white-box adversarial attack.
Our approach is based on truncated power iteration, which provides sparsity to the $(p,q)$-singular vectors of the Jacobian matrices of the hidden layers (a hedged sketch follows this entry).
Our findings demonstrate the vulnerability of state-of-the-art models to sparse attacks and highlight the importance of developing robust machine learning systems.
arXiv Detail & Related papers (2024-01-25T09:21:29Z)
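As an illustration of the technique named above, here is a minimal sketch of truncated power iteration for a sparse leading singular vector in the Euclidean ($p = q = 2$) case; the top-$k$ hard-thresholding rule and the random matrix standing in for a hidden layer's Jacobian are assumptions, not the paper's exact procedure.

```python
import numpy as np

def truncated_power_iteration(J, k, n_iter=100, seed=0):
    """Approximate a k-sparse leading right singular vector of J.

    Standard power iteration on J^T J, with a hard-thresholding step that
    keeps only the k largest-magnitude entries, so the resulting attack
    direction perturbs only k input coordinates (the p = q = 2 case;
    other norms would require different normalization).
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=J.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = J.T @ (J @ v)                     # power step on J^T J
        v[np.argsort(np.abs(v))[:-k]] = 0.0   # keep the k largest entries
        v /= np.linalg.norm(v)
    return v

# Toy usage: a random 64x128 "Jacobian" and a 10-sparse attack direction.
J = np.random.default_rng(1).normal(size=(64, 128))
v = truncated_power_iteration(J, k=10)
print(np.count_nonzero(v), np.linalg.norm(J @ v))  # sparsity and gain
```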
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust, instance-reweighted adversarial training framework.
Our importance weights are obtained by optimizing a KL-divergence-regularized loss function (a hedged sketch of this reweighting follows this entry).
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
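For intuition about the reweighting mentioned above, a KL-divergence-regularized weighting of per-example losses admits a well-known closed form: the weights are a softmax of the losses at a temperature set by the regularization strength. The sketch below shows only that closed form; the temperature value is an assumption, and the paper's doubly-robust construction goes beyond this single step.

```python
import numpy as np

def kl_regularized_weights(losses, tau=1.0):
    """Instance weights maximizing sum(w * loss) - tau * KL(w || uniform).

    The maximizer over the probability simplex has the closed form
    w_i proportional to exp(loss_i / tau): harder examples are upweighted,
    and tau controls how far the weights move from uniform.
    """
    z = (losses - losses.max()) / tau  # subtract the max for stability
    w = np.exp(z)
    return w / w.sum()

losses = np.array([0.1, 0.5, 2.0, 0.3])
print(kl_regularized_weights(losses, tau=0.5))  # weight piles onto the 2.0 loss
```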
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We recast the problem using techniques from Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor-tagging algorithms by applying adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
- Second Order Optimization for Adversarial Robustness and Interpretability [6.700873164609009]
We propose a novel regularizer which incorporates first and second order information via a quadratic approximation to the adversarial loss.
It is shown that using only a single iteration in our regularizer achieves stronger robustness than prior gradient and curvature regularization schemes.
It retains the interesting facet of adversarial training (AT) that networks learn features well-aligned with human perception (a hedged sketch of the quadratic approximation follows this entry).
arXiv Detail & Related papers (2020-09-10T15:05:14Z)
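To make the quadratic approximation named above concrete, here is a minimal sketch of such a regularizer using a single gradient step and a finite-difference Hessian-vector product; the one-step rule, the finite-difference HVP, and the toy logistic model are all assumptions for illustration, not the paper's exact regularizer.

```python
import numpy as np

def quadratic_adv_regularizer(grad_fn, x, eps=0.1, h=1e-3):
    """One-step quadratic estimate of the adversarial loss increase.

    Uses the second-order Taylor expansion
        loss(x + d) - loss(x) ~= g.d + 0.5 * d.H.d
    with a single gradient-direction step d = eps * g / ||g|| and a
    finite-difference Hessian-vector product for H.d.
    """
    g = grad_fn(x)
    d = eps * g / (np.linalg.norm(g) + 1e-12)   # single ascent step
    Hd = (grad_fn(x + h * d) - grad_fn(x)) / h  # finite-difference HVP
    return g @ d + 0.5 * d @ Hd                 # penalty added to the loss

# Toy usage: gradient of the logistic loss of a fixed linear model w
# with respect to the input x, for label y = 1.
w, y = np.array([1.0, -2.0]), 1.0
def grad_fn(x):
    s = 1.0 / (1.0 + np.exp(-w @ x))  # sigmoid(w . x)
    return (s - y) * w                # d loss / d x
print(quadratic_adv_regularizer(grad_fn, x=np.array([0.5, 0.5])))
```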
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
- Opportunities and Challenges in Deep Learning Adversarial Robustness: A Survey [1.8782750537161614]
This paper surveys strategies for adversarially robust training, toward guaranteeing safety in machine learning algorithms.
We provide a taxonomy to classify adversarial attacks and defenses, formulate the Robust Optimization problem in a min-max setting, and divide it into three subcategories: Adversarial (re)Training, Regularization Approaches, and Certified Defenses.
arXiv Detail & Related papers (2020-07-01T21:00:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.