Game-theoretic Understanding of Adversarially Learned Features
- URL: http://arxiv.org/abs/2103.07364v1
- Date: Fri, 12 Mar 2021 15:56:28 GMT
- Title: Game-theoretic Understanding of Adversarially Learned Features
- Authors: Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Xu Cheng, Xin
Wang, Yiting Chen, Jie Shi, Quanshi Zhang
- Abstract summary: This paper aims to understand adversarial attacks and defense from a new perspective, i.e., the signal-processing behavior of DNNs.
We define a novel multi-order interaction in game theory, which satisfies six properties.
- Score: 38.19291233245746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to understand adversarial attacks and defense from a new
perspective, i.e., the signal-processing behavior of DNNs. We define a novel
multi-order interaction in game theory, which satisfies six properties.
With the multi-order interaction, we discover that adversarial attacks mainly
affect high-order interactions to fool the DNN. Furthermore, we find that the
robustness of adversarially trained DNNs comes from category-specific low-order
interactions. Our findings provide new insights into, and revise, the previous
understanding of the shape bias of adversarially learned features.
Besides, the multi-order interaction can also explain the recoverability of
adversarial examples.
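The multi-order interaction described in the abstract can be sketched numerically. The sketch below is a minimal Monte Carlo estimator, assuming the paper's definition I^(m)(i,j) = E over contexts S of size m of Δf(i,j,S), where Δf(i,j,S) = f(S∪{i,j}) − f(S∪{i}) − f(S∪{j}) + f(S) and f is any set function (e.g., a DNN's scalar output on a masked input). The function names and the toy set function here are illustrative, not from the paper's code.

```python
import random

def delta_f(f, i, j, S):
    """Δf(i, j, S) = f(S∪{i,j}) - f(S∪{i}) - f(S∪{j}) + f(S).

    Measures the marginal interaction between variables i and j
    in the presence of the context set S.
    """
    S = frozenset(S)
    return f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)

def multi_order_interaction(f, variables, i, j, m, n_samples=1000, seed=0):
    """Monte Carlo estimate of the order-m interaction I^(m)(i, j):
    the expectation of Δf(i, j, S) over contexts S drawn uniformly
    from the other variables with |S| = m.
    """
    rng = random.Random(seed)
    others = [v for v in variables if v not in (i, j)]
    total = 0.0
    for _ in range(n_samples):
        S = rng.sample(others, m)  # random context of exactly m variables
        total += delta_f(f, i, j, S)
    return total / n_samples
```

For a toy set function where i and j interact with constant strength (f returns 2.0 iff both 0 and 1 are present), the estimate recovers 2.0 at every order, matching the intuition that the interaction is context-independent for this f. Low m corresponds to the low-order (simple, local) interactions the abstract associates with robustness, and high m to the high-order ones targeted by attacks.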
Related papers
- Joint Universal Adversarial Perturbations with Interpretations [19.140429650679593]
In this paper, we propose a novel attacking framework to generate joint universal adversarial perturbations (JUAP)
To the best of our knowledge, this is the first effort to study UAP for jointly attacking both DNNs and interpretations.
arXiv Detail & Related papers (2024-08-03T08:58:04Z) - Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z) - A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the adversarial transferability of adversarial examples.
arXiv Detail & Related papers (2023-10-26T17:45:26Z) - Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z) - Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection [57.13665112065285]
Human-Object Interaction (HOI) detection is a challenging computer vision task.
We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z) - Discovering and Explaining the Representation Bottleneck of DNNs [21.121270460158712]
This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode both overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity.
arXiv Detail & Related papers (2021-11-11T14:35:20Z) - A Unified Game-Theoretic Interpretation of Adversarial Robustness [39.64586231421121]
This paper provides a unified view to explain different adversarial attacks and defense methods.
Our findings provide a potential method to unify adversarial perturbations and robustness, which can explain the existing defense methods in a principled way.
arXiv Detail & Related papers (2021-11-05T14:57:49Z) - Interpreting Attributions and Interactions of Adversarial Attacks [19.50612458496236]
This paper aims to explain adversarial attacks in terms of how adversarial perturbations contribute to the attacking task.
We define and quantify interactions among adversarial perturbation pixels, and decompose the entire perturbation map into relatively independent perturbation components.
arXiv Detail & Related papers (2021-08-16T04:59:39Z) - Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks [15.217367754000913]
It is increasingly important to obtain models with high robustness that are resistant to adversarial examples.
We give preliminary definitions on what adversarial attacks and robustness are.
We study frequently-used benchmarks and mention theoretically-proved bounds for adversarial robustness.
arXiv Detail & Related papers (2020-11-03T07:42:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.