Adversarially Robust Models may not Transfer Better: Sufficient
Conditions for Domain Transferability from the View of Regularization
- URL: http://arxiv.org/abs/2202.01832v1
- Date: Thu, 3 Feb 2022 20:26:27 GMT
- Title: Adversarially Robust Models may not Transfer Better: Sufficient
Conditions for Domain Transferability from the View of Regularization
- Authors: Xiaojun Xu, Jacky Yibo Zhang, Evelyn Ma, Danny Son, Oluwasanmi Koyejo,
Bo Li
- Abstract summary: Machine learning robustness and domain generalization are fundamentally correlated.
Recent studies show that more robust (adversarially trained) models are more generalizable.
There is a lack of theoretical understanding of their fundamental connections.
- Score: 17.825841580342715
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) robustness and domain generalization are fundamentally
correlated: they essentially concern data distribution shifts under adversarial
and natural settings, respectively. On one hand, recent studies show that more
robust (adversarially trained) models are more generalizable. On the other
hand, there is a lack of theoretical understanding of their fundamental
connections. In this paper, we explore the relationship between regularization
and domain transferability considering different factors such as norm
regularization and data augmentations (DA). We propose a general theoretical
framework proving that factors involving the model function class
regularization are sufficient conditions for relative domain transferability.
Our analysis implies that "robustness" is neither necessary nor sufficient for
transferability; rather, robustness induced by adversarial training is a
by-product of such function class regularization. We then discuss popular DA
protocols and show when they can be viewed as the function class regularization
under certain conditions and therefore improve generalization. We conduct
extensive experiments to verify our theoretical findings and show several
counterexamples where robustness and generalization are negatively correlated
on different datasets.
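The abstract's point that a data augmentation protocol can, under certain conditions, act as function class regularization can be illustrated with a standard textbook case. This is not the paper's own construction or experiment. The sketch assumes linear least squares and isotropic Gaussian input noise with standard deviation `sigma`: in expectation, the noise-augmented squared loss equals the clean loss plus `n * sigma**2 * ||w||^2`, which is ridge (L2 norm) regularization. The helper names `ridge_fit` and `noise_augmented_fit` and the synthetic data are hypothetical and used only for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def noise_augmented_fit(X, y, sigma, copies, rng):
    """Ordinary least squares on `copies` Gaussian-perturbed replicas of X."""
    X_aug = np.concatenate(
        [X + sigma * rng.standard_normal(X.shape) for _ in range(copies)]
    )
    y_aug = np.tile(y, copies)  # labels are unchanged by the augmentation
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w


# Synthetic data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
n, d, sigma = 200, 5, 0.5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_ols = ridge_fit(X, y, lam=0.0)               # no regularization
w_ridge = ridge_fit(X, y, lam=n * sigma**2)    # explicit norm regularization
w_aug = noise_augmented_fit(X, y, sigma, copies=2000, rng=rng)

# The augmented solution should sit close to the ridge solution,
# and noticeably farther from the unregularized one.
gap_aug_ridge = np.linalg.norm(w_aug - w_ridge)
gap_ols_ridge = np.linalg.norm(w_ols - w_ridge)
print(f"||w_aug - w_ridge|| = {gap_aug_ridge:.4f}")
print(f"||w_ols - w_ridge|| = {gap_ols_ridge:.4f}")
```

The equivalence holds only in expectation and only for this linear, squared-loss setting; for other models and augmentations the connection to norm regularization depends on conditions of the kind the paper analyzes.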
Related papers
- Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing [55.791818510796645]
We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
arXiv Detail & Related papers (2024-10-08T12:26:48Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Domain Adaptation with Cauchy-Schwarz Divergence [39.36943882475589]
We introduce Cauchy-Schwarz divergence to the problem of unsupervised domain adaptation (UDA).
The CS divergence offers a theoretically tighter generalization error bound than the popular Kullback-Leibler divergence.
We show how the CS divergence can be conveniently used in both distance metric- or adversarial training-based UDA frameworks.
arXiv Detail & Related papers (2024-05-30T12:01:12Z)
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- Trade-off Between Dependence and Complexity for Nonparametric Learning -- an Empirical Process Approach [10.27974860479791]
In many applications where the data exhibit temporal dependencies, the corresponding empirical processes are much less understood.
We present a general bound on the expected supremum of empirical processes under standard $\beta/\rho$-mixing assumptions.
We show that even under long-range dependence, it is possible to attain the same rates as in the i.i.d. setting.
arXiv Detail & Related papers (2024-01-17T05:08:37Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Robustness Implies Generalization via Data-Dependent Generalization Bounds [24.413499775513145]
This paper proves that robustness implies generalization via data-dependent generalization bounds.
We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable.
arXiv Detail & Related papers (2022-06-27T17:58:06Z)
- Causality Inspired Representation Learning for Domain Generalization [47.574964496891404]
We introduce a general structural causal model to formalize the domain generalization problem.
Our goal is to extract the causal factors from inputs and then reconstruct the invariant causal mechanisms.
We highlight that ideal causal factors should meet three basic properties: separated from the non-causal ones, jointly independent, and causally sufficient for the classification.
arXiv Detail & Related papers (2022-03-27T08:08:33Z)
- Measuring Generalization with Optimal Transport [111.29415509046886]
We develop margin-based generalization bounds, where the margins are normalized with optimal transport costs.
Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets.
arXiv Detail & Related papers (2021-06-07T03:04:59Z)
- Generalised Lipschitz Regularisation Equals Distributional Robustness [47.44261811369141]
We give a very general equality result regarding the relationship between distributional robustness and regularisation.
We show a new result explicating the connection between adversarial learning and distributional robustness.
arXiv Detail & Related papers (2020-02-11T04:19:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.