Defense Against Model Extraction Attacks on Recommender Systems
- URL: http://arxiv.org/abs/2310.16335v1
- Date: Wed, 25 Oct 2023 03:30:42 GMT
- Title: Defense Against Model Extraction Attacks on Recommender Systems
- Authors: Sixiao Zhang, Hongzhi Yin, Hongxu Chen, Cheng Long
- Abstract summary: We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The robustness of recommender systems has become a prominent topic within the
research community. Numerous adversarial attacks have been proposed, but most
of them rely on extensive prior knowledge; this holds for all white-box attacks
and for most black-box attacks, which assume that certain external knowledge is
available. Among these attacks, the model extraction attack stands out as a
promising and practical method, involving training a surrogate model by
repeatedly querying the target model. However, there is a significant gap in
the existing literature when it comes to defending against model extraction
attacks on recommender systems. In this paper, we introduce Gradient-based
Ranking Optimization (GRO), which is the first defense strategy designed to
counter such attacks. We formalize the defense as an optimization problem,
aiming to minimize the loss of the protected target model while maximizing the
loss of the attacker's surrogate model. Since top-k ranking lists are
non-differentiable, we transform them into swap matrices which are instead
differentiable. These swap matrices serve as input to a student model that
emulates the surrogate model's behavior. By back-propagating the loss of the
student model, we obtain gradients for the swap matrices. These gradients are
used to compute a swap loss, which maximizes the loss of the student model. We
conducted experiments on three benchmark datasets to evaluate the performance
of GRO, and the results demonstrate its superior effectiveness in defending
against model extraction attacks.
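The abstract's mechanism can be made concrete with a small sketch. This is not the paper's implementation: it substitutes a softmax-based soft top-k for GRO's swap matrices, and every name below (soft_topk, Student, gro_style_loss, lam) is our own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_topk(scores, k, tau=1.0):
    """Differentiable stand-in for a top-k ranking list: softmax the item
    scores, then keep the k largest probabilities (gradients still flow)."""
    return torch.topk(F.softmax(scores / tau, dim=-1), k).values

class Student(nn.Module):
    """Tiny proxy for the attacker's surrogate, trained on exposed lists."""
    def __init__(self, k, n_items):
        super().__init__()
        self.net = nn.Linear(k, n_items)

    def forward(self, soft_list):
        return self.net(soft_list)

def gro_style_loss(target_scores, student, labels, k=10, lam=0.1):
    # Utility term: keep the protected target model accurate.
    utility = F.cross_entropy(target_scores, labels)
    # Swap-loss stand-in: back-propagating through the relaxed ranking tells
    # the defender how to perturb the exposed list so the student learns badly.
    student_logits = student(soft_topk(target_scores, k))
    student_loss = F.cross_entropy(student_logits, labels)
    # Minimize the target's loss while maximizing the student's loss.
    return utility - lam * student_loss
```

In GRO proper, the student is trained in an alternating step to imitate the exposed top-k lists, and the perturbation is derived from gradients on swap matrices rather than on a soft top-k; the sketch only shows the min-max shape of the objective.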
Related papers
- Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model by only using output feedback of the model to input queries.
We propose a Prior-guided Bayesian Optimization (P-BO) algorithm that leverages the surrogate model as a global function prior in black-box adversarial attacks.
Our theoretical analysis on the regret bound indicates that the performance of P-BO may be affected by a bad prior.
arXiv Detail & Related papers (2024-05-29T14:05:16Z)
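The P-BO summary is terse, so here is one simplified, assumed reading: the surrogate model's loss serves as the global prior mean of a Gaussian process, and Bayesian optimization maximizes the black-box attack objective through its residual. The names, the random candidate pool, and the UCB coefficient are our choices, not the paper's algorithm.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def pbo_attack(black_box_loss, surrogate_loss, dim, n_init=5, n_iter=20, seed=0):
    """black_box_loss: expensive attack objective queried on the target.
    surrogate_loss: cheap prior from a local surrogate model."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_init, dim))     # initial perturbations
    y = np.array([black_box_loss(x) for x in X])       # true objective values
    gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-4)
    for _ in range(n_iter):
        prior = np.array([surrogate_loss(x) for x in X])
        gp.fit(X, y - prior)                           # model residual w.r.t. the prior
        cand = rng.uniform(-1.0, 1.0, size=(256, dim)) # random candidate pool
        mu, sd = gp.predict(cand, return_std=True)
        prior_c = np.array([surrogate_loss(x) for x in cand])
        ucb = prior_c + mu + 1.0 * sd                  # UCB on prior + residual
        x_next = cand[np.argmax(ucb)]
        X = np.vstack([X, x_next])
        y = np.append(y, black_box_loss(x_next))
    return X[np.argmax(y)]                             # best perturbation found
```

The point of the prior is query efficiency: the GP only has to learn where the surrogate disagrees with the target, which also makes a bad prior a liability, matching the regret analysis mentioned above.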
- Transferable Attack for Semantic Segmentation
We study adversarial attacks on semantic segmentation, and observe that the adversarial examples generated from a source model fail to attack the target models.
We propose an ensemble attack for semantic segmentation to achieve more effective attacks with higher transferability.
arXiv Detail & Related papers (2023-07-31T11:05:55Z)
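A minimal sketch of the ensemble idea from the segmentation entry above, assuming an FGSM-style single step; the paper's actual ensembling strategy is more elaborate.

```python
import torch
import torch.nn.functional as F

def ensemble_fgsm(image, mask, models, eps=8 / 255):
    """One-step ensemble attack: average the segmentation loss over several
    source models so the perturbation transfers better to unseen targets."""
    x = image.clone().detach().requires_grad_(True)
    loss = 0.0
    for m in models:           # each m: image -> per-pixel logits (B, C, H, W)
        loss = loss + F.cross_entropy(m(x), mask)
    (loss / len(models)).backward()
    # Step every pixel in the direction that increases the averaged loss.
    return (image + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```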
- Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks
No-box adversarial attacks are becoming more practical and challenging for AI systems.
This paper recasts adversarial attacks as a downstream task by introducing foundation models as surrogate models.
arXiv Detail & Related papers (2023-07-13T08:10:48Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses the shortcomings of existing black-box attacks by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models
We propose an imitation adversarial attack on black-box neural passage ranking models.
We show that the target passage ranking model can be transparentized and imitated by enumerating critical queries/candidates.
We also propose an innovative gradient-based attack method, empowered by the pairwise objective function, to generate adversarial triggers.
arXiv Detail & Related papers (2022-09-14T09:10:07Z)
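The pairwise objective in the Order-Disorder entry can be illustrated with a hinge-style ranking loss, optimized over a continuous trigger for simplicity; every name here is a placeholder, not the paper's method.

```python
import torch

def pairwise_trigger_loss(score_fn, query, target_doc, anchor_doc, margin=1.0):
    """Hinge loss that is zero once `target_doc` outranks `anchor_doc` by
    `margin`; score_fn(query, doc) is a hypothetical imitation ranker."""
    s_target = score_fn(query, target_doc)
    s_anchor = score_fn(query, anchor_doc)
    return torch.clamp(margin - (s_target - s_anchor), min=0.0)

# Hypothetical usage: append a learnable trigger embedding to the target
# passage and descend on the pairwise loss.
# trigger = torch.zeros(1, emb_dim, requires_grad=True)
# loss = pairwise_trigger_loss(ranker, q, torch.cat([doc, trigger]), rival)
# loss.backward(); trigger.data -= lr * trigger.grad
```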
- The Space of Adversarial Strategies
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade.
We propose a systematic approach to characterize worst-case (i.e., optimal) adversaries.
arXiv Detail & Related papers (2022-09-09T20:53:11Z)
- Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool the unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch.
Our training approach is based on a min-max objective which reduces overfitting via an adversarial objective.
arXiv Detail & Related papers (2022-07-18T17:59:58Z)
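A rough rendering of the min-max training in the Adversarial Pixel Restoration entry: the inner step perturbs the corrupted input to maximize the restoration loss, and the outer step updates the surrogate to restore pixels anyway. Shapes, the MSE objective, and the one-step inner maximization are all assumptions.

```python
import torch
import torch.nn.functional as F

def restoration_minmax_step(model, corrupted, clean, optimizer, eps=4 / 255):
    # Inner max: one-step perturbation that makes restoration hardest.
    delta = torch.zeros_like(corrupted, requires_grad=True)
    F.mse_loss(model(corrupted + delta), clean).backward()
    delta = (eps * delta.grad.sign()).detach()
    # Outer min: train the surrogate to restore pixels despite the perturbation.
    optimizer.zero_grad()
    loss = F.mse_loss(model(corrupted + delta), clean)
    loss.backward()
    optimizer.step()
    return loss.item()
```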
- BODAME: Bilevel Optimization for Defense Against Model Extraction
We consider an adversarial setting to prevent model extraction, under the assumption that the attacker will make a best guess at the service provider's model from its predictions.
We formulate the attacker's surrogate model using the predictions of the true model.
We give a tractable transformation and an algorithm for more complicated models that are learned with gradient-descent-based algorithms.
arXiv Detail & Related papers (2021-03-11T17:08:31Z)
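The BODAME entry compresses a bilevel program into three sentences; written out in our own (assumed) notation, with f the served model, f* the true model, Q the attacker's query set, and g†(f) the attacker's best-response surrogate, it reads roughly:

```latex
% Hedged restatement, not the paper's notation: keep f faithful to f*,
% while the attacker's best response g†(f), fit to f's answers on Q,
% has as large an error as possible.
\begin{aligned}
\min_{f}\quad & \mathcal{L}_{\mathrm{fid}}(f, f^{*})
               \;-\; \lambda\,\mathcal{L}_{\mathrm{err}}\bigl(g^{\dagger}(f)\bigr) \\
\text{s.t.}\quad & g^{\dagger}(f) \in \arg\min_{g}\;
               \mathcal{L}_{\mathrm{sur}}\bigl(g;\, f(\mathcal{Q})\bigr)
\end{aligned}
```

The "tractable transformation" mentioned above makes this inner arg min amenable to optimization, analogous in spirit to how GRO's swap matrices make top-k lists differentiable.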
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacks against a real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.