Game-theoretic Understanding of Adversarially Learned Features
- URL: http://arxiv.org/abs/2103.07364v1
- Date: Fri, 12 Mar 2021 15:56:28 GMT
- Title: Game-theoretic Understanding of Adversarially Learned Features
- Authors: Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Xu Cheng, Xin
Wang, Yiting Chen, Jie Shi, Quanshi Zhang
- Abstract summary: This paper aims to understand adversarial attacks and defense from a new perspective, i.e., the signal-processing behavior of DNNs.
We define a novel multi-order interaction in game theory, which satisfies six properties.
- Score: 38.19291233245746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to understand adversarial attacks and defense from a new
perspective, i.e., the signal-processing behavior of DNNs. We define a novel
multi-order interaction in game theory, which satisfies six properties.
With the multi-order interaction, we discover that adversarial attacks mainly
affect high-order interactions to fool the DNN. Furthermore, we find that the
robustness of adversarially trained DNNs comes from category-specific low-order
interactions. Our findings provide new insights into, and revise, the previous
understanding of the shape bias of adversarially learned features.
Besides, the multi-order interaction can also explain the recoverability of
adversarial examples.
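The multi-order interaction described in the abstract can be sketched numerically. The sketch below is a minimal Monte Carlo estimator, assuming the paper's definition I^(m)(i,j) = E over contexts S of size m of Δf(i,j,S), where Δf(i,j,S) = f(S∪{i,j}) − f(S∪{i}) − f(S∪{j}) + f(S) and f is any set function (e.g., a DNN's scalar output on a masked input). The function names and the toy set function here are illustrative, not from the paper's code.

```python
import random

def delta_f(f, i, j, S):
    """Δf(i, j, S) = f(S∪{i,j}) - f(S∪{i}) - f(S∪{j}) + f(S).

    Measures the marginal interaction between variables i and j
    in the presence of the context set S.
    """
    S = frozenset(S)
    return f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)

def multi_order_interaction(f, variables, i, j, m, n_samples=1000, seed=0):
    """Monte Carlo estimate of the order-m interaction I^(m)(i, j):
    the expectation of Δf(i, j, S) over contexts S drawn uniformly
    from the other variables with |S| = m.
    """
    rng = random.Random(seed)
    others = [v for v in variables if v not in (i, j)]
    total = 0.0
    for _ in range(n_samples):
        S = rng.sample(others, m)  # random context of exactly m variables
        total += delta_f(f, i, j, S)
    return total / n_samples
```

For a toy set function where i and j interact with constant strength (f returns 2.0 iff both 0 and 1 are present), the estimate recovers 2.0 at every order, matching the intuition that the interaction is context-independent for this f. Low m corresponds to the low-order (simple, local) interactions the abstract associates with robustness, and high m to the high-order ones targeted by attacks.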
Related papers
- Joint Universal Adversarial Perturbations with Interpretations [19.140429650679593]
In this paper, we propose a novel attacking framework to generate joint universal adversarial perturbations (JUAP)
To the best of our knowledge, this is the first effort to study UAP for jointly attacking both DNNs and interpretations.
arXiv Detail & Related papers (2024-08-03T08:58:04Z) - Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z) - A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the adversarial transferability of adversarial examples.
arXiv Detail & Related papers (2023-10-26T17:45:26Z) - Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z) - Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection [57.13665112065285]
Human-Object Interaction (HOI) detection is a challenging computer vision task.
We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z) - Discovering and Explaining the Representation Bottleneck of DNNs [21.121270460158712]
This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode both overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity.
arXiv Detail & Related papers (2021-11-11T14:35:20Z) - A Unified Game-Theoretic Interpretation of Adversarial Robustness [39.64586231421121]
This paper provides a unified view to explain different adversarial attacks and defense methods.
Our findings provide a potential method to unify adversarial perturbations and robustness, which can explain the existing defense methods in a principled way.
arXiv Detail & Related papers (2021-11-05T14:57:49Z) - Interpreting Attributions and Interactions of Adversarial Attacks [19.50612458496236]
This paper aims to explain adversarial attacks in terms of how adversarial perturbations contribute to the attacking task.
We define and quantify interactions among adversarial perturbation pixels, and decompose the entire perturbation map into relatively independent perturbation components.
arXiv Detail & Related papers (2021-08-16T04:59:39Z) - Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks [15.217367754000913]
It is increasingly important to obtain models with high robustness that are resistant to adversarial examples.
We give preliminary definitions on what adversarial attacks and robustness are.
We study frequently-used benchmarks and mention theoretically-proved bounds for adversarial robustness.
arXiv Detail & Related papers (2020-11-03T07:42:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.