Learning Black-Box Attackers with Transferable Priors and Query Feedback
- URL: http://arxiv.org/abs/2010.11742v1
- Date: Wed, 21 Oct 2020 05:43:11 GMT
- Title: Learning Black-Box Attackers with Transferable Priors and Query Feedback
- Authors: Jiancheng Yang, Yangzhou Jiang, Xiaoyang Huang, Bingbing Ni, Chenglong Zhao
- Abstract summary: This paper addresses the challenging black-box adversarial attack problem, where only the classification confidence of a victim model is available.
Inspired by the consistency of visual saliency across different vision models, a surrogate model is expected to improve the attack performance via transferability.
We propose a surprisingly simple baseline approach (named SimBA++) using the surrogate model, which significantly outperforms several state-of-the-art methods.
- Score: 40.41083684665537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the challenging black-box adversarial attack
problem, where only the classification confidence of a victim model is
available. Inspired by the consistency of visual saliency across different
vision models, a surrogate model is expected to improve the attack performance
via transferability. By combining transferability-based and query-based
black-box attacks, we propose a surprisingly simple baseline approach (named
SimBA++) using the surrogate model, which significantly outperforms several
state-of-the-art methods. Moreover, to utilize the query feedback efficiently,
we update the surrogate model with a novel learning scheme, named High-Order
Gradient Approximation (HOGA). By constructing a high-order gradient
computation graph, we update the surrogate model to approximate the victim
model in both the forward and backward passes. Together, SimBA++ and HOGA
yield the Learnable Black-Box Attack (LeBA), which surpasses the previous
state of the art by considerable margins: LeBA significantly reduces the
number of queries while keeping attack success rates close to 100% in
extensive ImageNet experiments, including attacks on vision benchmarks and
defended models. Code is open source at https://github.com/TrustworthyDL/LeBA.
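
As a rough illustration of the two components named above, here is a hedged PyTorch sketch (our reading of the abstract, not the authors' code): it alternates SimBA-style single-coordinate queries with surrogate-gradient transfer steps, and fits the surrogate's confidence to the victim's query feedback. The real HOGA additionally matches the victim's gradients through a high-order computation graph; this sketch covers only the forward-pass fit, and all names (`leba_attack_sketch`, `victim`, `surrogate`) are hypothetical.

```python
# Hedged sketch of the SimBA++/LeBA loop: alternate surrogate-gradient
# transfer steps with SimBA-style single-coordinate queries, and fit the
# surrogate to the victim's query feedback.
import torch
import torch.nn.functional as F

def leba_attack_sketch(victim, surrogate, x, label, eps=0.05, lr=0.01,
                       max_queries=1000, transfer_every=10):
    opt = torch.optim.SGD(surrogate.parameters(), lr=1e-4)
    x_adv = x.clone()
    best = victim(x_adv)[0, label].item()              # one query
    for q in range(1, max_queries):
        if q % transfer_every == 0:
            # Transfer step: white-box gradient of the surrogate.
            x_in = x_adv.clone().requires_grad_(True)
            F.cross_entropy(surrogate(x_in), torch.tensor([label])).backward()
            cand = x_adv + lr * x_in.grad.sign()
        else:
            # Query step: perturb one random coordinate by +/- eps
            # (random sign; the original SimBA tries both signs).
            delta = torch.zeros_like(x_adv).view(-1)
            sign = 2 * torch.randint(2, (1,)).float() - 1
            delta[torch.randint(delta.numel(), (1,))] = eps * sign
            cand = x_adv + delta.view_as(x_adv)
        cand = cand.clamp(x - eps, x + eps).clamp(0, 1)
        conf = victim(cand)[0, label].item()           # query feedback
        if conf < best:                                # keep steps that hurt the victim
            x_adv, best = cand, conf
        # HOGA-flavoured learning step (forward part only): make the
        # surrogate's confidence on the query agree with the victim's answer.
        s_conf = F.softmax(surrogate(cand), dim=1)[0, label]
        fit = (s_conf - conf) ** 2
        opt.zero_grad(); fit.backward(); opt.step()
    return x_adv
```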
Related papers
- Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior [36.101904669291436]
This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model using only the model's output feedback to input queries.
We propose a Prior-guided Bayesian Optimization (P-BO) algorithm that leverages the surrogate model as a global function prior in black-box adversarial attacks.
Our theoretical analysis of the regret bound indicates that the performance of P-BO may be harmed by a bad prior.
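
As a loose illustration of what "surrogate model as a global function prior" can mean, here is a minimal NumPy sketch under our own assumptions (an RBF kernel, an upper-confidence-bound acquisition, and hypothetical names; the paper's actual kernel and acquisition may differ). The surrogate's loss acts as the Gaussian-process prior mean, so the GP only has to model the residual between victim and surrogate:

```python
# Hedged sketch: the surrogate loss is the GP prior mean; the GP models only
# the residual between victim and surrogate at queried points.
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def pbo_next_query(X, y_victim, surrogate_loss, candidates, noise=1e-4):
    """X: queried perturbations (n, d); y_victim: victim losses at X;
    surrogate_loss: callable prior mean; candidates: (m, d) points to score."""
    prior = np.array([surrogate_loss(x) for x in X])
    K = rbf(X, X) + noise * np.eye(len(X))
    K_inv = np.linalg.inv(K)
    Ks = rbf(candidates, X)
    mu = np.array([surrogate_loss(c) for c in candidates]) \
        + Ks @ K_inv @ (y_victim - prior)
    var = 1.0 - np.einsum('ij,jk,ik->i', Ks, K_inv, Ks)
    ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0.0))   # optimistic acquisition
    return candidates[int(np.argmax(ucb))]           # next point to query
```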
arXiv Detail & Related papers (2024-05-29T14:05:16Z)
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
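
A hedged PyTorch sketch of the stated min-max objective (our reading; `target`, `attacker_sim`, and the pointwise losses are placeholders, and the actual GRO operates on ranking lists rather than pointwise scores): the defender fits its own data while maximizing the loss of a simulated attacker that distils the defender's outputs.

```python
# Hedged sketch of the min-max defense: `target` is the protected recommender,
# `attacker_sim` a simulated extraction attacker that distils the target's
# outputs. Only the corresponding optimizer is stepped in each phase.
import torch
import torch.nn.functional as F

def gro_style_step(target, attacker_sim, batch, opt_t, opt_a, lam=0.1):
    users, items, labels = batch
    # Phase 1: the simulated attacker distils the target's current outputs.
    scores = target(users, items)
    distill = F.mse_loss(attacker_sim(users, items), scores.detach())
    opt_a.zero_grad(); distill.backward(); opt_a.step()
    # Phase 2: the defender stays accurate on its own data (min) while
    # making the attacker's distillation as hard as possible (max).
    scores = target(users, items)
    util = F.binary_cross_entropy_with_logits(scores, labels)
    a_fit = F.mse_loss(attacker_sim(users, items), scores)
    loss = util - lam * a_fit    # min target loss, max attacker loss
    opt_t.zero_grad(); loss.backward(); opt_t.step()
    return loss.item()
```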
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- T-SEA: Transfer-based Self-Ensemble Attack on Object Detection [9.794192858806905]
We propose a single-model transfer-based black-box attack on object detection, utilizing only one model to achieve a high-transferability adversarial attack on multiple black-box detectors.
We analogize patch optimization with regular model optimization, proposing a series of self-ensemble approaches on the input data, the attacked model, and the adversarial patch.
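
As a loose sketch of that self-ensemble idea (our assumptions; `detector_loss` and the specific augmentations are placeholders, and the paper's concrete ensembling differs in detail), the patch can be optimized over random input views and random patch dropout so no single view or patch region dominates:

```python
# Hedged sketch of self-ensemble patch optimization: average the attack loss
# over random input views and random patch dropout masks before each update.
import torch
import torch.nn.functional as F

def tsea_style_patch_step(patch, images, detector_loss, opt, n_ens=4):
    opt.zero_grad()
    total = 0.0
    for _ in range(n_ens):
        # Self-ensemble on the input: a randomly rescaled view of the batch.
        scale = 0.8 + 0.4 * torch.rand(1).item()
        x = F.interpolate(images, scale_factor=scale, mode='bilinear',
                          align_corners=False)
        # Self-ensemble on the patch: random pixel dropout (a cutout-style
        # trick) so no single patch region dominates the gradient.
        mask = (torch.rand_like(patch) > 0.1).float()
        total = total + detector_loss(x, patch * mask)
    (total / n_ens).backward()
    opt.step()
    patch.data.clamp_(0, 1)    # keep the patch a valid image
    return total.item() / n_ens
```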
arXiv Detail & Related papers (2022-11-16T10:27:06Z)
- Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations [54.1807206010136]
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and a known label space to fool unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative for training an effective surrogate model from scratch.
Our training approach is based on a min-max objective in which an adversarial term reduces overfitting.
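
A rough PyTorch sketch of such a min-max pretext task, under our own assumptions (names and the corruption scheme are hypothetical): the inner loop finds the restoration-hardest perturbation within an eps-ball, and the outer loop trains the surrogate to restore the clean image from it.

```python
# Hedged sketch: inner maximization finds the perturbation on which the
# surrogate restores worst; outer minimization trains the surrogate to
# restore the clean image from that worst-case input.
import torch
import torch.nn.functional as F

def adv_pixel_restoration_step(model, images, opt, eps=8 / 255, steps=3):
    delta = torch.zeros_like(images, requires_grad=True)
    for _ in range(steps):                      # inner max (gradient ascent)
        loss = F.mse_loss(model(images + delta), images)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + (eps / steps) * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    opt.zero_grad()                             # outer min (training step)
    loss = F.mse_loss(model(images + delta.detach()), images)
    loss.backward()
    opt.step()
    return loss.item()
```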
arXiv Detail & Related papers (2022-07-18T17:59:58Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of a defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
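
The generic building block that makes such black-box defense trainable is zeroth-order (ZO) gradient estimation; ZO-AE-DS additionally routes it through an autoencoder so the random search happens in a low-dimensional latent space. A minimal sketch of the estimator alone (hypothetical names; `f` is any scalar black-box loss):

```python
# Randomized forward-difference gradient estimate of a scalar black-box
# function f at x; more samples tighten the estimate at the cost of queries.
import torch

def zo_gradient(f, x, n_samples=20, mu=1e-3):
    g = torch.zeros_like(x)
    fx = f(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)                  # random probing direction
        g += (f(x + mu * u) - fx) / mu * u       # directional finite difference
    return g / n_samples
```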
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- Black-box Adversarial Attacks in Autonomous Vehicle Technology [4.215251065887861]
Black-box adversarial attacks cause drastic misclassification of critical scene elements, potentially leading the autonomous vehicle to crash into other vehicles or pedestrians.
We propose a novel query-based attack method called Modified Simple black-box attack (M-SimBA) to overcome the reliance on a white-box source model in transfer-based attack methods.
We show that the proposed model outperforms existing methods such as transfer-based projected gradient descent (T-PGD) and SimBA in terms of convergence time, flattening of the confused-class probability distribution, and producing adversarial samples with the least confidence on the true class.
arXiv Detail & Related papers (2021-01-15T13:18:18Z)
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.