Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
- URL: http://arxiv.org/abs/2005.00060v2
- Date: Fri, 3 Jul 2020 03:49:28 GMT
- Title: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
- Authors: Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue
Lin
- Abstract summary: We use mode connectivity to study the adversarial robustness of deep neural networks.
Our experiments cover various types of adversarial attacks applied to different network architectures and datasets.
Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
- Score: 97.67477497115163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mode connectivity provides novel geometric insights on analyzing loss
landscapes and enables building high-accuracy pathways between well-trained
neural networks. In this work, we propose to employ mode connectivity in loss
landscapes to study the adversarial robustness of deep neural networks, and
provide novel methods for improving this robustness. Our experiments cover
various types of adversarial attacks applied to different network architectures
and datasets. When network models are tampered with via backdoor or error-injection
attacks, our results demonstrate that a path connection learned using a limited
amount of bona fide data can effectively mitigate adversarial effects while
maintaining the original accuracy on clean data. Therefore, mode connectivity
provides users with the power to repair backdoored or error-injected models. We
also use mode connectivity to investigate the loss landscapes of regular and
robust models against evasion attacks. Experiments show that there exists a
barrier in adversarial robustness loss on the path connecting regular and
adversarially-trained models. A high correlation is observed between the
adversarial robustness loss and the largest eigenvalue of the input Hessian
matrix, for which theoretical justifications are provided. Our results suggest
that mode connectivity offers a holistic tool and practical means for
evaluating and improving adversarial robustness.
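The abstract describes two computable objects: a learned path (e.g. a curve) connecting the weights of two trained networks, and the largest eigenvalue of the input Hessian used as a robustness indicator. As a minimal sketch (not the authors' implementation), the path can be modeled as a quadratic Bezier curve whose control point is the trainable "bend", and the top eigenvalue can be estimated by power iteration; the function names and the finite-dimensional NumPy setting here are illustrative assumptions:

```python
import numpy as np

def bezier_path(theta_a, theta_b, theta_c, t):
    """Quadratic Bezier curve in parameter space (illustrative sketch).

    theta_a, theta_b: flattened weights of two well-trained endpoint models.
    theta_c: control point; in mode-connectivity methods this is the part
             trained (here, hypothetically on bona fide data) so that models
             along the path keep high accuracy.
    t in [0, 1] indexes a model on the path; t=0 and t=1 recover the endpoints.
    """
    return (1 - t) ** 2 * theta_a + 2 * t * (1 - t) * theta_c + t ** 2 * theta_b

def largest_eigenvalue(H, iters=200, seed=0):
    """Power iteration on a symmetric matrix H (stand-in for the input
    Hessian of the loss). Returns the dominant eigenvalue, the quantity the
    abstract correlates with adversarial robustness loss."""
    v = np.random.default_rng(seed).standard_normal(H.shape[0])
    for _ in range(iters):
        v = H @ v
        v /= np.linalg.norm(v)  # renormalize to avoid overflow/underflow
    return v @ H @ v  # Rayleigh quotient at the converged direction
```

In practice the input Hessian is never materialized; libraries compute Hessian-vector products via automatic differentiation and feed them to the same power-iteration loop. The dense-matrix version above only illustrates the estimate itself.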
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z) - A Geometrical Approach to Evaluate the Adversarial Robustness of Deep
Neural Networks [52.09243852066406]
The Adversarial Converging Time Score (ACTS) measures convergence time as an adversarial robustness metric.
We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z) - Masking Adversarial Damage: Finding Adversarial Saliency for Robust and
Sparse Network [33.18197518590706]
Adversarial examples provoke weak reliability and potential security issues in deep neural networks.
We propose a novel adversarial pruning method, Masking Adversarial Damage (MAD) that employs second-order information of adversarial loss.
We show that MAD effectively prunes adversarially trained networks without losing adversarial robustness and shows better performance than previous adversarial pruning methods.
arXiv Detail & Related papers (2022-04-06T11:28:06Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Pruning in the Face of Adversaries [0.0]
We evaluate the impact of neural network pruning on the adversarial robustness against L-0, L-2 and L-infinity attacks.
Our results confirm that neural network pruning and adversarial robustness are not mutually exclusive.
We extend our analysis to situations that incorporate additional assumptions on the adversarial scenario and show that depending on the situation, different strategies are optimal.
arXiv Detail & Related papers (2021-08-19T09:06:16Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Explainable Adversarial Attacks in Deep Neural Networks Using Activation
Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z) - Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z) - Improving Adversarial Robustness by Enforcing Local and Global
Compactness [19.8818435601131]
Adversarial training is the most successful method that consistently resists a wide range of attacks.
We propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption.
The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network.
arXiv Detail & Related papers (2020-07-10T00:43:06Z) - Architectural Resilience to Foreground-and-Background Adversarial Noise [0.0]
Adversarial attacks in the form of imperceptible perturbations of normal images have been extensively studied.
We propose distinct model-agnostic benchmark perturbations of images to investigate resilience and robustness of different network architectures.
arXiv Detail & Related papers (2020-03-23T01:38:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.