Defending Substitution-Based Profile Pollution Attacks on Sequential
Recommenders
- URL: http://arxiv.org/abs/2207.11237v1
- Date: Tue, 19 Jul 2022 00:19:13 GMT
- Title: Defending Substitution-Based Profile Pollution Attacks on Sequential
Recommenders
- Authors: Zhenrui Yue, Huimin Zeng, Ziyi Kou, Lanyu Shang, Dong Wang
- Abstract summary: We propose a substitution-based adversarial attack algorithm, which modifies the input sequence by selecting certain vulnerable elements and substituting them with adversarial items.
We also design an efficient adversarial defense method called Dirichlet neighborhood sampling.
In particular, we represent selected items with one-hot encodings and perform gradient ascent on the encodings to search for the worst-case linear combination of item embeddings in training.
- Score: 8.828396559882954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While sequential recommender systems achieve significant improvements on
capturing user dynamics, we argue that sequential recommenders are vulnerable
against substitution-based profile pollution attacks. To demonstrate our
hypothesis, we propose a substitution-based adversarial attack algorithm, which
modifies the input sequence by selecting certain vulnerable elements and
substituting them with adversarial items. In both untargeted and targeted
attack scenarios, we observe significant performance deterioration using the
proposed profile pollution algorithm. Motivated by such observations, we design
an efficient adversarial defense method called Dirichlet neighborhood sampling.
Specifically, we sample item embeddings from a convex hull constructed by
multi-hop neighbors to replace the original items in input sequences. During
sampling, a Dirichlet distribution is used to approximate the probability
distribution in the neighborhood such that the recommender learns to combat
local perturbations. Additionally, we design an adversarial training method
tailored for sequential recommender systems. In particular, we represent
selected items with one-hot encodings and perform gradient ascent on the
encodings to search for the worst-case linear combination of item embeddings in
training. As such, the embedding function learns robust item representations
and the trained recommender is resistant to test-time adversarial examples.
Extensive experiments show the effectiveness of both our attack and defense
methods, which consistently outperform baselines by a significant margin across
model architectures and datasets.
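The substitution attack described above admits a compact first-order implementation. Below is a minimal, illustrative sketch (not the authors' released code): it ranks positions by gradient norm to find vulnerable elements, then scores candidate substitutes by a first-order approximation of the loss change. The hooks `model.item_emb` and `model.forward_from_embeddings` are assumptions about the recommender's interface.

```python
# Hypothetical sketch of a substitution-based profile pollution attack.
# Assumed interface: `model.item_emb` is an nn.Embedding over the item
# vocabulary, and `model.forward_from_embeddings` maps a (1, L, d) embedded
# sequence to next-item logits.
import torch
import torch.nn.functional as F

def pollute_profile(model, seq, target_item=None, n_subs=2):
    emb = model.item_emb(seq).detach().requires_grad_(True)   # (L, d)
    logits = model.forward_from_embeddings(emb.unsqueeze(0)).squeeze(0)
    log_probs = F.log_softmax(logits, dim=-1)
    if target_item is None:
        loss = -log_probs[seq[-1]]        # untargeted: demote the true next item
    else:
        loss = log_probs[target_item]     # targeted: promote the target item
    loss.backward()
    grad = emb.grad                                           # (L, d)
    positions = grad.norm(dim=-1).topk(n_subs).indices        # vulnerable elements
    adv_seq, table = seq.clone(), model.item_emb.weight       # (V, d)
    for p in positions:
        # first-order estimate of the loss increase for every candidate swap
        scores = (table - emb[p].detach()) @ grad[p]
        adv_seq[p] = scores.argmax()
    return adv_seq
```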
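The Dirichlet neighborhood sampling defense also maps directly to code. The sketch below, again illustrative rather than the authors' implementation, mixes each input item's embedding with its precomputed multi-hop neighbors using convex weights drawn from a Dirichlet distribution; the neighbor table `neighbors` and the concentration `alpha` are assumptions.

```python
# Hypothetical sketch of Dirichlet neighborhood sampling. `neighbors[i]`
# holds k multi-hop neighbor ids of item i (an assumed precomputed table).
import torch

def dirichlet_neighborhood_sample(item_emb, seq, neighbors, alpha=1.0):
    nbr_ids = neighbors[seq]                          # (L, k)
    ids = torch.cat([seq.unsqueeze(1), nbr_ids], 1)   # include the item itself
    vecs = item_emb(ids)                              # (L, k+1, d)
    conc = torch.full(ids.shape, alpha)
    w = torch.distributions.Dirichlet(conc).sample()  # convex weights on simplex
    return (w.unsqueeze(-1) * vecs).sum(dim=1)        # (L, d) points in the hull
```

Training on these sampled embeddings in place of the originals is what exposes the recommender to local perturbations.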
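Likewise, the one-hot adversarial training step can be sketched as projected gradient ascent on relaxed one-hot encodings; the sketch below is a plausible reading of the abstract, not the paper's exact procedure, and reuses the assumed `forward_from_embeddings` hook.

```python
# Hypothetical sketch of adversarial training over one-hot item encodings:
# gradient ascent searches for a worst-case convex combination of item
# embeddings, and the model is then trained against that mixture.
import torch
import torch.nn.functional as F

def adv_training_loss(model, seq, target, steps=3, lr=1.0):
    V = model.item_emb.num_embeddings
    one_hot = F.one_hot(seq, V).float().requires_grad_(True)     # (L, V)
    for _ in range(steps):
        emb = one_hot @ model.item_emb.weight                    # (L, d) mixture
        logits = model.forward_from_embeddings(emb.unsqueeze(0)).squeeze(0)
        loss = F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
        (g,) = torch.autograd.grad(loss, one_hot)
        with torch.no_grad():                     # ascent step, then project
            one_hot += lr * g                     # back onto the simplex
            one_hot.clamp_(min=0)
            one_hot /= one_hot.sum(-1, keepdim=True)
    emb = one_hot.detach() @ model.item_emb.weight   # worst case found
    logits = model.forward_from_embeddings(emb.unsqueeze(0)).squeeze(0)
    return F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
```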
Related papers
- Prototypical Contrastive Learning through Alignment and Uniformity for Recommendation [6.790779112538357]
We present Prototypical contrastive learning through Alignment and Uniformity for recommendation.
Specifically, we first propose prototypes as a latent space to ensure consistency across different augmentations from the origin graph.
The absence of explicit negatives means that directly optimizing the consistency loss between instance and prototype could easily result in dimensional collapse issues.
arXiv Detail & Related papers (2024-02-03T08:19:26Z)
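For intuition only, a prototype consistency loss of the kind the entry above gestures at might look like the following sketch; the symmetric stop-gradient form is an assumption, not the paper's exact objective.

```python
# Hypothetical prototype consistency loss: two augmented views of the same
# user/item are pushed toward agreeing soft assignments over shared prototypes.
import torch
import torch.nn.functional as F

def prototype_consistency_loss(z1, z2, prototypes, temp=0.1):
    z1, z2, p = (F.normalize(t, dim=-1) for t in (z1, z2, prototypes))
    q1 = F.softmax(z1 @ p.T / temp, dim=-1)   # view-1 assignments
    q2 = F.softmax(z2 @ p.T / temp, dim=-1)   # view-2 assignments
    # each view predicts the other's (stop-gradient) assignment
    return -0.5 * ((q1.detach() * q2.log()).sum(-1)
                   + (q2.detach() * q1.log()).sum(-1)).mean()
```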
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
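A token-level perplexity screen of the kind the entry above describes can be sketched with an off-the-shelf causal LM; the model choice and threshold below are illustrative, not the paper's configuration.

```python
# Sketch: flag prompt tokens whose negative log-likelihood under a small
# causal LM is anomalously high (a rough proxy for adversarial tokens).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def flag_suspicious_tokens(prompt, threshold=12.0):
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits
    nll = torch.nn.functional.cross_entropy(     # per-token NLL given context
        logits[0, :-1], ids[0, 1:], reduction="none")
    toks = tok.convert_ids_to_tokens(ids[0].tolist())[1:]
    return [(t, float(n)) for t, n in zip(toks, nll) if n > threshold]
```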
- Enhancing Adversarial Robustness via Score-Based Optimization [22.87882885963586]
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations.
We introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test time.
Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness performance and inference speed.
arXiv Detail & Related papers (2023-07-10T03:59:42Z)
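Test-time optimization of the flavor in the entry above is often implemented as score-guided purification; the sketch below is a generic stand-in, not ScoreOpt's actual objective, and the `score_net` (a pretrained model approximating the data score, i.e. the gradient of the log-density) is an assumption.

```python
# Generic score-guided purification sketch: nudge a suspected adversarial
# input toward high-density regions of the data distribution before
# classifying it. `score_net` is an assumed pretrained score model.
import torch

def purify(x_adv, score_net, steps=10, step_size=0.05):
    x = x_adv.clone()
    for _ in range(steps):
        with torch.no_grad():
            x = x + step_size * score_net(x)
    return x
```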
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against adversarial examples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization [10.246596695310176]
We focus on the problem of adversarial attacks against models on discrete sequential data in the black-box setting.
We propose a query-efficient black-box attack using Bayesian optimization, which dynamically computes important positions.
We develop a post-optimization algorithm that finds adversarial examples with smaller perturbation size.
arXiv Detail & Related papers (2022-06-17T06:11:36Z)
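A bare-bones version of the surrogate-guided black-box substitution attack in the entry above might look like this; the feature encoding, UCB acquisition, and `query_model` oracle are all illustrative assumptions rather than the paper's algorithm.

```python
# Sketch: pick (position, item) substitutions with a Gaussian-process
# surrogate and a UCB acquisition, spending one victim query per step.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_attack(seq, vocab_size, query_model, budget=50, pool=200):
    rng = np.random.default_rng(0)
    X, y = [], []                                    # observed edits and losses

    def random_edit():
        return int(rng.integers(len(seq))), int(rng.integers(vocab_size))

    def encode(pos, item):                           # crude 2-d feature vector
        return [pos / len(seq), item / vocab_size]

    best_seq, best_loss = list(seq), -np.inf
    for t in range(budget):
        if t < 5:                                    # random warm-up queries
            pos, item = random_edit()
        else:
            gp = GaussianProcessRegressor().fit(np.array(X), np.array(y))
            cands = [random_edit() for _ in range(pool)]
            mu, sd = gp.predict(np.array([encode(*c) for c in cands]),
                                return_std=True)
            pos, item = cands[int(np.argmax(mu + sd))]   # UCB acquisition
        mod = list(seq)
        mod[pos] = item
        loss = query_model(mod)                      # one black-box query
        X.append(encode(pos, item))
        y.append(loss)
        if loss > best_loss:
            best_seq, best_loss = mod, loss
    return best_seq
```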
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
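The core loop of large-batch distributed adversarial training is well established even if the specifics of the entry above differ; the sketch below pairs a standard PGD inner loop with PyTorch DDP, so each rank attacks its own data shard and gradients are averaged during `backward()`.

```python
# Sketch: distributed adversarial training with a PGD inner maximization.
import torch
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        (g,) = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * g.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()

def train_step(ddp_model: DDP, opt, x, y):
    x_adv = pgd(ddp_model.module, x, y)   # attack via the local module to
    opt.zero_grad()                       # avoid spurious DDP grad syncs
    loss = F.cross_entropy(ddp_model(x_adv), y)
    loss.backward()                       # DDP all-reduces gradients here
    opt.step()
    return loss.item()
```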
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features a common property of adversarial examples, and deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage that distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
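The likelihood-based detector the entry above describes can be approximated with two Gaussians over representation space; the fitting data for the adversarial cluster is an assumption (e.g. representations of locally generated attacks).

```python
# Sketch: fit Gaussians to clean and adversarial representation clusters and
# flag an input when the adversarial cluster is more likely.
import numpy as np
from scipy.stats import multivariate_normal

def fit_cluster(reps):                    # reps: (n, d) penultimate features
    cov = np.cov(reps.T) + 1e-4 * np.eye(reps.shape[1])   # regularized
    return multivariate_normal(reps.mean(axis=0), cov)

def is_adversarial(rep, clean_dist, adv_dist):
    return adv_dist.logpdf(rep) > clean_dist.logpdf(rep)
```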
- ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
By applying the adversarial training objective to both a classifier and a rejection function simultaneously, the model can abstain from classification when it has insufficient confidence to classify a test data point.
arXiv Detail & Related papers (2020-10-24T14:05:03Z)
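A minimal inference-time view of a rejection option like the one in the entry above, assuming a network whose last output unit is a learned rejection score (the paper's training objective is omitted):

```python
# Sketch: abstain whenever the learned rejection score beats the classifier's
# own confidence. The extra output head is an assumed architecture choice.
import torch

def predict_with_rejection(model, x):
    out = model(x)                               # (B, C + 1)
    class_logits, reject_score = out[:, :-1], out[:, -1]
    conf, pred = class_logits.softmax(-1).max(-1)
    pred[reject_score.sigmoid() > conf] = -1     # -1 means "abstain"
    return pred
```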
- Protecting Classifiers From Attacks. A Bayesian Approach [0.9449650062296823]
We provide an alternative Bayesian framework that accounts for the lack of precise knowledge about the attacker's behavior using adversarial risk analysis.
We propose a sampling procedure based on approximate Bayesian computation, in which we simulate the attacker's problem while accounting for our uncertainty about his elements.
For large-scale problems, we propose an alternative, scalable approach for use with differentiable classifiers.
arXiv Detail & Related papers (2020-04-18T21:21:56Z)
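Rejection-sampling ABC over attacker behavior, as the entry above outlines, reduces to a short loop; the scalar attacker parameter, uniform prior, and `simulate_attack` routine are placeholders for the paper's richer adversarial-risk-analysis model.

```python
# Sketch: approximate Bayesian computation for attacker behavior. Sampled
# attacker parameters are kept when the simulated attack statistic lands
# close to what was actually observed.
import numpy as np

def abc_attacker_posterior(simulate_attack, observed_stat, n=10_000, tol=0.05):
    rng = np.random.default_rng(0)
    kept = []
    for _ in range(n):
        theta = rng.uniform(0, 1)            # prior over attacker intensity
        stat = simulate_attack(theta)        # simulate the attacker's problem
        if abs(stat - observed_stat) < tol:
            kept.append(theta)
    return np.array(kept)                    # approximate posterior samples
```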
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
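In the label-flipping setting of the entry above, smoothing is applied to the training labels rather than the test input; a voting sketch under that reading (with an assumed `train_classifier` routine) follows. The paper's contribution is the certificate derived from the vote margin, which this sketch omits.

```python
# Sketch: majority vote of base classifiers trained under random label flips,
# the smoothing distribution relevant to label-flipping poisoning.
import numpy as np

def smoothed_predict(x, X_train, y_train, train_classifier,
                     flip_prob=0.1, n_models=20, n_classes=2):
    rng = np.random.default_rng(0)
    votes = np.zeros(n_classes)
    for _ in range(n_models):
        flip = rng.random(len(y_train)) < flip_prob
        y_noisy = np.where(flip, rng.integers(0, n_classes, len(y_train)),
                           y_train)
        clf = train_classifier(X_train, y_noisy)   # retrain under label noise
        votes[clf.predict([x])[0]] += 1            # assumed sklearn-style API
    return int(votes.argmax()), votes              # majority vote and margin
```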
This list is automatically generated from the titles and abstracts of the papers on this site.