Based-CE white-box adversarial attack will not work using super-fitting
- URL: http://arxiv.org/abs/2205.02741v1
- Date: Wed, 4 May 2022 09:23:00 GMT
- Title: Based-CE white-box adversarial attack will not work using super-fitting
- Authors: Youhuan Yang, Lei Sun, Leyu Dai, Song Guo, Xiuqing Mao, Xiaoqin Wang and Bayi Xu
- Abstract summary: Deep Neural Networks (DNNs) are widely used in various fields due to their powerful performance.
Recent studies have shown that deep learning models are vulnerable to adversarial attacks.
This paper proposes a new defense method by using the model super-fitting status.
- Score: 10.34121642283309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are widely used in various fields due to their
powerful performance, but recent studies have shown that deep learning models
are vulnerable to adversarial attacks: by adding a slight perturbation to the
input, the model will produce wrong results. This is especially dangerous for
systems with high security requirements, so this paper proposes a new defense
method that uses the model's super-fitting status. The model's adversarial
robustness (i.e., its accuracy under adversarial attack) is greatly improved in
this status. This paper mathematically proves the effectiveness of
super-fitting and proposes a method to make the model reach this status
quickly: minimizing unrelated categories' scores (MUCS). Theoretically,
super-fitting can resist any existing (and even future) white-box adversarial
attack based on cross-entropy (CE). In addition, this paper uses a variety of
powerful attack algorithms to evaluate the adversarial robustness of
super-fitting and nearly 50 other defense models from recent conferences. The
experimental results show that the super-fitting method in this paper enables
the trained model to achieve the highest adversarial robustness.
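For context, a minimal sketch of the attack family the paper targets: a CE-based white-box attack perturbs the input along the gradient of the cross-entropy loss, as in FGSM. The model, labels, and budget `eps` below are illustrative placeholders, not the paper's experimental setup.

```python
import torch.nn.functional as F

def fgsm_ce_attack(model, x, y, eps=8 / 255):
    """One-step CE-based white-box attack (FGSM-style sketch).

    Ascends the cross-entropy loss within an L-infinity ball of
    radius eps; this is the attack family that super-fitting is
    claimed to resist. eps=8/255 is an illustrative budget.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each pixel one signed-gradient step, then keep the
    # result in the valid image range.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```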
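The abstract does not spell out how MUCS is implemented, but one plausible reading is an auxiliary loss term that directly pushes down the logits of every category other than the ground-truth one. The squared penalty and the weight `lam` below are assumptions for illustration, not the paper's definition:

```python
import torch.nn.functional as F

def mucs_style_loss(logits, y, lam=1.0):
    """Speculative MUCS-style training loss.

    Standard cross-entropy plus a penalty on "unrelated" (non-true)
    category scores. The squared-logit penalty and weight lam are
    illustrative assumptions, not the paper's formulation.
    """
    ce = F.cross_entropy(logits, y)
    # Zero out the true-class logit, penalize the magnitude of the rest.
    true_mask = F.one_hot(y, num_classes=logits.size(1)).bool()
    unrelated = logits.masked_fill(true_mask, 0.0)
    return ce + lam * unrelated.pow(2).mean()
```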
Related papers
- Robust Models are less Over-Confident [10.42820615166362]
Adversarial training (AT) aims to achieve robustness against such attacks.
We empirically analyze a variety of adversarially trained models that achieve high robust accuracies.
AT has an interesting side-effect: it leads to models that are significantly less overconfident with their decisions.
arXiv Detail & Related papers (2022-10-12T06:14:55Z) - Alleviating Robust Overfitting of Adversarial Training With Consistency
Regularization [9.686724616328874]
Adversarial training (AT) has proven to be one of the most effective ways to defend Deep Neural Networks (DNNs) against adversarial attacks.
Robust overfitting, a phenomenon in which robustness drops sharply at a certain stage, always exists during AT.
Consistency regularization, a popular technique in semi-supervised learning, has a similar goal to AT and can be used to alleviate robust overfitting, as the sketch below illustrates.
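As a rough illustration of that idea (not the paper's exact formulation), the consistency term can be rendered as a KL divergence that ties the model's predictions on an adversarial example to its predictions on the clean input; the weight `beta` and the KL direction are assumptions:

```python
import torch.nn.functional as F

def at_consistency_loss(model, x, x_adv, y, beta=1.0):
    """Adversarial training loss with an illustrative consistency term.

    Cross-entropy on the adversarial example plus a KL divergence
    keeping adversarial predictions close to clean ones. beta and
    the KL direction are assumptions, not the paper's recipe.
    """
    logits_adv = model(x_adv)
    logits_clean = model(x).detach()  # no gradient through the clean branch
    ce = F.cross_entropy(logits_adv, y)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1),
                  reduction="batchmean")
    return ce + beta * kl
```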
arXiv Detail & Related papers (2022-05-24T03:18:43Z) - Practical No-box Adversarial Attacks with Training-free Hybrid Image
Transformation [123.33816363589506]
We show the existence of a training-free adversarial perturbation under the no-box threat model.
Motivated by our observation that the high-frequency component (HFC) dominates in low-level features, we attack an image mainly by manipulating its frequency components.
Our method is even competitive with mainstream transfer-based black-box attacks.
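One way to isolate the high-frequency component that this line of attack manipulates is a frequency-domain mask; the FFT-based band cutoff below is an illustrative assumption, not the paper's hybrid image transformation:

```python
import torch

def keep_high_freq(x, cutoff=0.25):
    """Zero out the centered low-frequency band of an image batch.

    x: (B, C, H, W) tensor. cutoff is the fraction of each spatial
    dimension treated as "low frequency"; both the mask shape and
    the cutoff value are illustrative assumptions.
    """
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    H, W = x.shape[-2:]
    rh, rw = int(H * cutoff / 2), int(W * cutoff / 2)
    # Remove the low-frequency square around the spectrum center.
    spec[..., H // 2 - rh:H // 2 + rh, W // 2 - rw:W // 2 + rw] = 0
    spec = torch.fft.ifftshift(spec, dim=(-2, -1))
    return torch.fft.ifft2(spec).real
```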
arXiv Detail & Related papers (2022-03-09T09:51:00Z) - Mutual Adversarial Training: Learning together is better than going
alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
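A minimal sketch of the mutual-training idea: each peer is trained on adversarial examples with cross-entropy while distilling from the other peer's predictions via a KL term. The loss weighting and the choice of distillation inputs are assumptions, not MAT's exact recipe:

```python
import torch
import torch.nn.functional as F

def mat_style_loss(model_a, model_b, x_adv_a, x_adv_b, y, alpha=1.0):
    """Illustrative mutual adversarial training loss for model_a.

    CE on model_a's own adversarial examples, plus KL distillation
    from peer model_b on model_b's adversarial examples. alpha and
    the pairing of inputs are assumptions, not the paper's recipe.
    """
    ce = F.cross_entropy(model_a(x_adv_a), y)
    with torch.no_grad():  # the peer acts as a fixed teacher this step
        peer_probs = F.softmax(model_b(x_adv_b), dim=1)
    kl = F.kl_div(F.log_softmax(model_a(x_adv_b), dim=1),
                  peer_probs, reduction="batchmean")
    return ce + alpha * kl
```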
arXiv Detail & Related papers (2021-12-09T15:59:42Z) - Adversarial Attacks on ML Defense Models Competition [82.37504118766452]
The TSAIL group at Tsinghua University and the Alibaba Security group organized this competition.
The purpose of this competition is to motivate novel attack algorithms to evaluate adversarial robustness.
arXiv Detail & Related papers (2021-10-15T12:12:41Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - "What's in the box?!": Deflecting Adversarial Attacks by Randomly
Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z) - Adversarial Learning with Cost-Sensitive Classes [7.6596177815175475]
In adversarial learning, it is necessary to improve the performance of certain special classes or to give them particular protection from attacks.
This paper proposes a framework combining cost-sensitive classification and adversarial learning together to train a model that can distinguish between protected and unprotected classes.
arXiv Detail & Related papers (2021-01-29T03:15:40Z) - Detection Defense Against Adversarial Attacks with Saliency Map [7.736844355705379]
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible on human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method that combines additional noise with an inconsistency strategy to detect adversarial examples.
arXiv Detail & Related papers (2020-09-06T13:57:17Z) - Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised
Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.