Order-Disorder: Imitation Adversarial Attacks for Black-box Neural
Ranking Models
- URL: http://arxiv.org/abs/2209.06506v2
- Date: Tue, 18 Apr 2023 08:02:12 GMT
- Title: Order-Disorder: Imitation Adversarial Attacks for Black-box Neural
Ranking Models
- Authors: Jiawei Liu, Yangyang Kang, Di Tang, Kaisong Song, Changlong Sun,
Xiaofeng Wang, Wei Lu, Xiaozhong Liu
- Abstract summary: We propose an imitation adversarial attack on black-box neural passage ranking models.
We show that the target passage ranking model can be transparentized and imitated by enumerating critical queries/candidates.
We also propose an innovative gradient-based attack method, empowered by the pairwise objective function, to generate adversarial triggers.
- Score: 48.93128542994217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural text ranking models have witnessed significant advancement
and are increasingly being deployed in practice. Unfortunately, they also
inherit the adversarial vulnerabilities of general neural models, which have
been detected but remain underexplored in prior studies. Moreover, these
inherent vulnerabilities might be leveraged by blackhat SEO to defeat
better-protected search engines. In this study, we propose an imitation
adversarial attack on black-box neural passage ranking models. We first show
that the target passage ranking model can be transparentized and imitated by
enumerating critical queries/candidates and then training a ranking imitation
model. Leveraging the ranking imitation model, we can elaborately manipulate
the ranking results and transfer the manipulation attack to the target
ranking model. To this end, we propose an innovative gradient-based attack
method, empowered by a pairwise objective function, to generate adversarial
triggers that cause premeditated disorder with very few tokens. To camouflage
the triggers, we add a next-sentence-prediction loss and a language-model
fluency constraint to the objective function. Experimental results on passage
ranking demonstrate the effectiveness of the ranking imitation model and the
adversarial triggers against various SOTA neural ranking models. Furthermore,
mitigation analyses and human evaluation show the effectiveness of the
camouflage against potential mitigation approaches. To motivate other
scholars to further investigate this novel and important problem, we make the
experiment data and code publicly available.
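
To make the two-stage pipeline above concrete, here is a minimal sketch of the imitation step in PyTorch. It assumes a hypothetical `target_ranker(query, passages)` black-box API that returns candidates in the target model's ranked order; the cross-encoder backbone and hyperparameters are illustrative stand-ins, not the paper's exact setup.

```python
# Sketch: distill a black-box ranker's observed orderings into a
# trainable surrogate ("ranking imitation model") via a pairwise loss.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
surrogate = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1)   # cross-encoder relevance scorer
optimizer = torch.optim.AdamW(surrogate.parameters(), lr=2e-5)

def surrogate_scores(query, passages):
    """Score each (query, passage) pair with the surrogate."""
    enc = tokenizer([query] * len(passages), passages,
                    truncation=True, padding=True, return_tensors="pt")
    return surrogate(**enc).logits.squeeze(-1)

def imitation_step(query, ranked_passages, margin=1.0):
    """Pairwise hinge update: the black-box target ranked these
    passages, so train the surrogate to score earlier ones higher."""
    s = surrogate_scores(query, ranked_passages)
    # sum hinge losses over adjacent pairs in the target's ordering
    loss = torch.relu(margin - (s[:-1] - s[1:])).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# for query in critical_queries:   # hypothetical enumerated query set
#     imitation_step(query, target_ranker(query, candidates))
```

Adjacent-pair hinge losses are one simple way to distill an observed ordering; the paper's pairwise objective may weight or sample pairs differently.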
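
The trigger-generation step can likewise be sketched as a first-order (HotFlip-style) token substitution search under a pairwise objective: make a trigger-bearing passage outscore a stronger anchor passage. Everything below is a self-contained toy under stated assumptions: the linear scorer, the unigram "fluency" prior, and the 0.1 weight stand in for the surrogate ranker, the language-model fluency constraint, and the paper's loss weighting, and the next-sentence-prediction camouflage term is omitted.

```python
# Toy sketch: gradient-guided trigger token search with a pairwise
# hinge objective and a crude fluency bias.
import torch

torch.manual_seed(0)
VOCAB, DIM, TRIG_LEN = 1000, 32, 3
emb = torch.randn(VOCAB, DIM)      # token embedding table
w = torch.randn(DIM)               # toy linear relevance scorer
anchor = torch.tensor(5.0)         # score of the passage to outrank
lm_logits = torch.randn(VOCAB)     # toy unigram LM (fluency prior)

def pairwise_loss(trigger_embeds):
    """Hinge: penalize the trigger passage scoring below the anchor."""
    score = trigger_embeds.mean(0) @ w
    return torch.relu(1.0 + anchor - score)

trigger = torch.randint(0, VOCAB, (TRIG_LEN,))
for _ in range(20):
    e = emb[trigger].detach().requires_grad_(True)
    loss = pairwise_loss(e)
    if loss.item() == 0.0:         # anchor already beaten by the margin
        break
    loss.backward()
    # First-order estimate of the loss change from swapping token v
    # into position p: delta[v, p] ~ (emb[v] - e[p]) . grad[p]
    delta = emb @ e.grad.T - (e * e.grad).sum(dim=1)
    delta = delta - 0.1 * lm_logits[:, None]   # bias toward fluent tokens
    best_per_pos, best_tok = delta.min(dim=0)
    p = best_per_pos.argmin()                  # flip the best position
    trigger[p] = best_tok[p]
```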
Related papers
- Black-box Adversarial Transferability: An Empirical Study in Cybersecurity Perspective [0.0]
In adversarial machine learning, malicious users try to fool a deep learning model by feeding adversarially perturbed inputs to it during its training or testing phase.
We empirically test the black-box adversarial transferability phenomena in cyber attack detection systems.
The results indicate that any deep learning model is highly susceptible to adversarial attacks, even if the attacker does not have access to the internal details of the target model.
arXiv Detail & Related papers (2024-04-15T06:56:28Z)
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted for neural ranking models (NRMs) by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game [153.74942025516853]
The intrinsic vulnerability of rank aggregation methods is not well studied in the literature.
In this paper, we focus on a purposeful adversary who desires to designate the aggregated results by modifying the pairwise data.
The effectiveness of the suggested target attack strategies is demonstrated by a series of toy simulations and several real-world data experiments.
arXiv Detail & Related papers (2022-09-13T05:59:02Z)
- What Does the Gradient Tell When Attacking the Graph Structure [44.44204591087092]
We present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs.
By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous.
We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility.
arXiv Detail & Related papers (2022-08-26T15:45:20Z)
- Adversarial Attack and Defense in Deep Ranking [100.17641539999055]
We propose two attacks against deep ranking systems that can raise or lower the rank of chosen candidates by adversarial perturbations.
Conversely, an anti-collapse triplet defense is proposed to improve the ranking model robustness against all proposed attacks.
Our adversarial ranking attacks and defenses are evaluated on MNIST, Fashion-MNIST, CUB200-2011, CARS196 and Stanford Online Products datasets.
arXiv Detail & Related papers (2021-06-07T13:41:45Z)
- AdvHaze: Adversarial Haze Attack [19.744435173861785]
We introduce a novel adversarial attack method based on haze, which is a common phenomenon in real-world scenery.
Our method can synthesize potentially adversarial haze into an image based on the atmospheric scattering model with high realism.
We demonstrate that the proposed method achieves a high success rate, and holds better transferability across different classification models than the baselines.
arXiv Detail & Related papers (2021-04-28T09:52:25Z)
- Practical Relative Order Attack in Deep Ranking [99.332629807873]
We formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack.
The Order Attack covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation.
It is successfully implemented on a major e-commerce platform.
arXiv Detail & Related papers (2021-03-09T06:41:18Z)
- Luring of transferable adversarial perturbations in the black-box paradigm [0.0]
We present a new approach to improve the robustness of a model against black-box transfer attacks.
A removable additional neural network is included in the target model, and is designed to induce the luring effect.
Our deception-based method only needs to have access to the predictions of the target model and does not require a labeled data set.
arXiv Detail & Related papers (2020-04-10T06:48:36Z)