Adversarial Examples Might be Avoidable: The Role of Data Concentration in Adversarial Robustness
- URL: http://arxiv.org/abs/2309.16096v2
- Date: Sat, 25 May 2024 17:21:15 GMT
- Title: Adversarial Examples Might be Avoidable: The Role of Data Concentration in Adversarial Robustness
- Authors: Ambar Pal, Jeremias Sulam, René Vidal
- Abstract summary: We show that concentration on small-volume subsets of the input space determines whether a robust classifier exists.
We further demonstrate that, for a data distribution concentrated on a union of low-dimensional linear subspaces, utilizing structure in data naturally leads to classifiers that enjoy data-dependent polyhedral guarantees.
- Score: 39.883465335244594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The susceptibility of modern machine learning classifiers to adversarial examples has motivated theoretical results suggesting that these might be unavoidable. However, these results can be too general to be applicable to natural data distributions. Indeed, humans are quite robust for tasks involving vision. This apparent conflict motivates a deeper dive into the question: Are adversarial examples truly unavoidable? In this work, we theoretically demonstrate that a key property of the data distribution -- concentration on small-volume subsets of the input space -- determines whether a robust classifier exists. We further demonstrate that, for a data distribution concentrated on a union of low-dimensional linear subspaces, utilizing structure in data naturally leads to classifiers that enjoy data-dependent polyhedral robustness guarantees, improving upon methods for provable certification in certain regimes.
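As a concrete illustration of the second claim, the hedged sketch below builds a nearest-subspace classifier for data assumed to lie on a union of low-dimensional linear subspaces and attaches a simple distance-based certified l2 radius: half the gap between the two smallest subspace residuals, which is valid because each residual is 1-Lipschitz. This is a minimal stand-in under toy assumptions, not the paper's construction; the dimensions, class structure, and helper names are illustrative.

```python
# Hedged sketch (not the paper's construction): nearest-subspace classification
# for data assumed to lie on a union of low-dimensional linear subspaces, with
# a simple certified l2 radius. Each residual r_c(x) = dist(x, subspace_c) is
# 1-Lipschitz, so the prediction cannot change under any l2 perturbation
# smaller than half the gap between the two smallest residuals.
import numpy as np

def fit_subspace(X, dim):
    """Orthonormal basis (d x dim) spanning the top principal directions of X (n x d)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:dim].T

def residuals(x, bases):
    """Distance from x to each class subspace."""
    return np.array([np.linalg.norm(x - U @ (U.T @ x)) for U in bases])

def predict_with_certificate(x, bases):
    r = residuals(x, bases)
    order = np.argsort(r)
    certified_radius = 0.5 * (r[order[1]] - r[order[0]])   # valid by 1-Lipschitzness
    return order[0], certified_radius

# Toy data: two classes concentrated exactly on different 2-D subspaces of R^20.
rng = np.random.default_rng(0)
d, k = 20, 2
A0, A1 = rng.standard_normal((d, k)), rng.standard_normal((d, k))
X0 = rng.standard_normal((200, k)) @ A0.T
X1 = rng.standard_normal((200, k)) @ A1.T
bases = [fit_subspace(X0, k), fit_subspace(X1, k)]

test_point = X0[0] + 0.01 * rng.standard_normal(d)          # slightly perturbed class-0 point
label, radius = predict_with_certificate(test_point, bases)
print(f"predicted class {label}, certified l2 radius {radius:.3f}")
```

The certificate here is only a plain Lipschitz argument; the paper itself derives data-dependent polyhedral robustness guarantees for this kind of concentrated distribution.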
Related papers
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high-dimensional input data (a toy numerical sketch follows this entry).
We introduce a simple, generic, and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
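The toy computation below illustrates the high-dimensional effect referenced in the entry above: for a linear score w·x, an l_inf perturbation of per-coordinate size eps can shift the score by eps·||w||_1, which grows with the dimension, so changes that are negligible coordinate-wise can flip a confident prediction. It is a generic illustration, not the paper's framework; the dimension, eps, and random weights are assumptions.

```python
# Generic high-dimensional illustration (not the paper's framework): a worst-case
# l_inf perturbation of per-coordinate size eps shifts a linear score w.x by
# eps * ||w||_1, which grows with the dimension d for typical weight vectors.
import numpy as np

rng = np.random.default_rng(0)
d, eps = 100_000, 0.01                     # illustrative dimension and budget
w = rng.standard_normal(d) / np.sqrt(d)    # linear classifier with ||w||_2 close to 1
x = rng.standard_normal(d)                 # a "typical" input

score = w @ x                              # clean score; its sign is the prediction
delta = -np.sign(score) * eps * np.sign(w) # worst-case l_inf attack on a linear model
adv_score = w @ (x + delta)                # shifted by eps * ||w||_1 toward the boundary

print(f"clean score {score:+.3f}, attacked score {adv_score:+.3f}")
print(f"score shift {eps * np.linalg.norm(w, 1):.3f} from changes of at most {eps} per coordinate")
print("prediction flipped:", np.sign(score) != np.sign(adv_score))
```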
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
- Malign Overfitting: Interpolation Can Provably Preclude Invariance [30.776243638012314]
We show that "benign overfitting" in which models generalize well despite interpolating might not favorably extend to settings in which robustness or fairness are desirable.
We propose and analyze an algorithm that successfully learns a non-interpolating classifier that is provably invariant.
arXiv Detail & Related papers (2022-11-28T19:17:31Z)
- Uncertainty in Contrastive Learning: On the Predictability of Downstream Performance [7.411571833582691]
We study whether the uncertainty of such a representation can be quantified for a single datapoint in a meaningful way.
We show that this goal can be achieved by directly estimating the distribution of the training data in the embedding space (an illustrative sketch follows this entry).
arXiv Detail & Related papers (2022-07-19T15:44:59Z)
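One hedged way to realize the idea in the entry above is to fit a simple density estimate to the training embeddings and read off the uncertainty of a single point as its negative log-density. The sketch below uses an isotropic-Gaussian kernel density estimate as an illustrative stand-in, not the paper's estimator; the embedding dimension, bandwidth, and helper names are assumptions.

```python
# Illustrative stand-in (not the paper's estimator): score the uncertainty of a
# single embedding as the negative log of an isotropic-Gaussian kernel density
# estimate fit to the training embeddings. Low estimated density around a point
# means it sits far from the training distribution in embedding space.
import numpy as np

def log_kde(z, train_embeddings, bandwidth=0.5):
    """Log-density of a Gaussian KDE over training embeddings, evaluated at z."""
    n, d = train_embeddings.shape
    sq_dists = np.sum((train_embeddings - z) ** 2, axis=1)
    log_kernels = -sq_dists / (2.0 * bandwidth ** 2)
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2) - np.log(n)
    return log_norm + np.logaddexp.reduce(log_kernels)

def uncertainty(z, train_embeddings, bandwidth=0.5):
    """Higher value = lower estimated training density around z."""
    return -log_kde(z, train_embeddings, bandwidth)

# Toy usage with random stand-ins for learned contrastive embeddings (16-d assumed).
rng = np.random.default_rng(0)
train_z = rng.standard_normal((1000, 16))
in_distribution = rng.standard_normal(16)
far_away = 10.0 + rng.standard_normal(16)
print("in-distribution:", uncertainty(in_distribution, train_z))
print("far away:       ", uncertainty(far_away, train_z))
```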
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples (a generic sketch of the linear setting follows this entry).
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
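For the linear setting named in the title above, adversarial training with an l_inf budget has a closed-form inner maximization: the worst-case margin is y(w·x) − eps·||w||_1. The sketch below trains on the corresponding robust logistic loss; it is a generic textbook-style illustration of that setting (it does not reproduce the paper's benign-overfitting analysis), and the data, eps, and learning rate are assumptions.

```python
# Generic sketch of adversarial training for a linear classifier under an l_inf
# budget eps (not the paper's analysis). For f(x) = w.x the inner maximization
# is closed-form: the worst-case margin over ||delta||_inf <= eps is
# y * (w.x) - eps * ||w||_1, so we can run plain gradient descent on the
# resulting robust logistic loss.
import numpy as np

def robust_logistic_train(X, y, eps=0.1, lr=0.1, steps=500):
    """X: (n, d) features, y: (n,) labels in {-1, +1}; returns the weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margin = y * (X @ w) - eps * np.abs(w).sum()          # worst-case margins
        s = 0.5 * (1.0 - np.tanh(margin / 2.0))               # = sigmoid(-margin), stable
        # (Sub)gradient of mean log(1 + exp(-margin)) with respect to w.
        grad = -(s[:, None] * (y[:, None] * X - eps * np.sign(w))).mean(axis=0)
        w -= lr * grad
    return w

# Toy usage: two well-separated Gaussian blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, (200, 5)), rng.normal(-1.0, 1.0, (200, 5))])
y = np.concatenate([np.ones(200), -np.ones(200)])
w = robust_logistic_train(X, y, eps=0.1)
robust_acc = np.mean(y * (X @ w) - 0.1 * np.abs(w).sum() > 0)  # accuracy under worst-case attack
print(f"robust training accuracy at eps=0.1: {robust_acc:.2f}")
```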
- Nonlinear Invariant Risk Minimization: A Causal Approach [5.63479133344366]
We propose a learning paradigm that enables out-of-distribution generalization in the nonlinear setting.
We show identifiability of the data representation up to very simple transformations.
Extensive experiments on both synthetic and real-world datasets show that our approach significantly outperforms a variety of baseline methods.
arXiv Detail & Related papers (2021-02-24T15:38:41Z)
- Not All Datasets Are Born Equal: On Heterogeneous Data and Adversarial Examples [46.625818815798254]
We argue that machine learning models trained on heterogeneous data are as susceptible to adversarial manipulations as those trained on homogeneous data.
We introduce a generic optimization framework for identifying adversarial perturbations in heterogeneous input spaces (a hedged sketch follows this entry).
Our results demonstrate that despite the constraints imposed on input validity in heterogeneous datasets, machine learning models trained using such data are still equally susceptible to adversarial examples.
arXiv Detail & Related papers (2020-10-07T05:24:23Z)
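A minimal picture of the kind of validity-constrained attack described in the entry above is projected gradient ascent that perturbs only the continuous features and clips them to their valid ranges, leaving categorical columns untouched. The sketch below is a hedged stand-in, not the authors' optimization framework; the model, loss, feature split, and bounds are assumptions.

```python
# Hedged stand-in (not the authors' framework): projected gradient ascent on the
# loss of a linear model that only perturbs the continuous features, keeps each
# change within an l_inf budget, and clips to the feature's valid range, leaving
# categorical (one-hot) columns untouched.
import numpy as np

def grad_logistic_wrt_x(w, x, y):
    """Gradient of log(1 + exp(-y * w.x)) with respect to the input x."""
    margin = y * (w @ x)
    return -y * w * 0.5 * (1.0 - np.tanh(margin / 2.0))      # = -y * w * sigmoid(-margin)

def constrained_attack(w, x, y, cont_idx, lower, upper, eps=0.3, lr=0.05, steps=100):
    """Maximize the loss by moving only x[cont_idx], within the budget and valid range."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_logistic_wrt_x(w, x_adv, y)
        x_adv[cont_idx] += lr * np.sign(g[cont_idx])                        # ascent step
        x_adv[cont_idx] = np.clip(x_adv[cont_idx],                          # l_inf budget...
                                  x[cont_idx] - eps, x[cont_idx] + eps)
        x_adv[cont_idx] = np.clip(x_adv[cont_idx], lower, upper)            # ...and validity
    return x_adv

# Toy usage: 3 continuous features in [-1, 1] followed by 2 one-hot columns.
rng = np.random.default_rng(0)
w = rng.standard_normal(5)
x = np.array([0.2, -0.4, 0.7, 1.0, 0.0])
x_adv = constrained_attack(w, x, y=+1, cont_idx=np.arange(3), lower=-1.0, upper=1.0)
print("clean score:", round(w @ x, 3), " attacked score:", round(w @ x_adv, 3))
print("categorical columns unchanged:", np.array_equal(x[3:], x_adv[3:]))
```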
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.