Black-box Adversarial Attacks against Dense Retrieval Models: A
Multi-view Contrastive Learning Method
- URL: http://arxiv.org/abs/2308.09861v1
- Date: Sat, 19 Aug 2023 00:24:59 GMT
- Title: Black-box Adversarial Attacks against Dense Retrieval Models: A
Multi-view Contrastive Learning Method
- Authors: Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen,
Yixing Fan, Xueqi Cheng
- Abstract summary: We introduce the adversarial retrieval attack (AREA) task.
It is meant to trick DR models into retrieving a target document that is outside the initial set of candidate documents retrieved by the DR model.
We find that the promising results that have previously been reported on attacking NRMs do not generalize to DR models.
We propose to formalize attacks on DR models as a contrastive learning problem in a multi-view representation space.
- Score: 115.29382166356478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural ranking models (NRMs) and dense retrieval (DR) models have given rise
to substantial improvements in overall retrieval performance. In addition to
their effectiveness, and motivated by the proven lack of robustness of deep
learning-based approaches in other areas, there is growing interest in the
robustness of deep learning-based approaches to the core retrieval problem.
Adversarial attack methods that have so far been developed mainly focus on
attacking NRMs, with very little attention being paid to the robustness of DR
models. In this paper, we introduce the adversarial retrieval attack (AREA)
task. The AREA task is meant to trick DR models into retrieving a target
document that is outside the initial set of candidate documents retrieved by
the DR model in response to a query. We consider the decision-based black-box
adversarial setting, which is realistic in real-world search engines. To
address the AREA task, we first employ existing adversarial attack methods
designed for NRMs. We find that the promising results that have previously been
reported on attacking NRMs do not generalize to DR models: these methods
underperform a simple term spamming method. We attribute the observed lack of
generalizability to the interaction-focused architecture of NRMs, which
emphasizes fine-grained relevance matching. DR models follow a different
representation-focused architecture that prioritizes coarse-grained
representations. We propose to formalize attacks on DR models as a contrastive
learning problem in a multi-view representation space. The core idea is to
encourage the consistency between each view representation of the target
document and its corresponding viewer via view-wise supervision signals.
Experimental results demonstrate that the proposed method can significantly
outperform existing attack strategies in misleading the DR model with small
indiscernible text perturbations.
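To make the "contrastive learning problem in a multi-view representation space" concrete, here is a minimal PyTorch-style sketch of a view-wise InfoNCE loss. It is an illustration under stated assumptions, not the paper's exact objective: the view/viewer pairing, the choice of negatives, and the temperature are placeholders inferred from the abstract.

```python
import torch
import torch.nn.functional as F


def view_wise_contrastive_loss(view_emb, viewer_emb, negative_emb, tau=0.1):
    """Illustrative view-wise InfoNCE objective (assumed form, not from the paper).

    view_emb:     (V, d) embeddings of V views of the perturbed target document
    viewer_emb:   (V, d) supervision ("viewer") embeddings, one per view
    negative_emb: (N, d) embeddings used as negatives, e.g. documents already in
                  the candidate set that the target document should displace
    """
    view_emb = F.normalize(view_emb, dim=-1)
    viewer_emb = F.normalize(viewer_emb, dim=-1)
    negative_emb = F.normalize(negative_emb, dim=-1)

    # Positive logit: similarity between each view and its own viewer.
    pos = (view_emb * viewer_emb).sum(dim=-1, keepdim=True) / tau   # (V, 1)
    # Negative logits: similarity between each view and every negative document.
    neg = view_emb @ negative_emb.t() / tau                         # (V, N)

    logits = torch.cat([pos, neg], dim=1)                           # (V, 1 + N)
    labels = torch.zeros(view_emb.size(0), dtype=torch.long)        # positive at index 0
    return F.cross_entropy(logits, labels)                          # averaged over views
```

Because the setting is decision-based and black-box, a loss like this could only be optimized through a surrogate encoder rather than the victim DR model's gradients; that, too, is an assumption of the sketch.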
Related papers
- Unleashing the Power of Generic Segmentation Models: A Simple Baseline for Infrared Small Target Detection [57.666055329221194]
We investigate the adaptation of generic segmentation models, such as the Segment Anything Model (SAM), to infrared small object detection tasks.
Our model demonstrates significantly improved performance in both accuracy and throughput compared to existing approaches.
arXiv Detail & Related papers (2024-09-07T05:31:24Z)
- Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter [32.64004722423187]
We show how to improve the robustness of RGB-skeleton action recognition models.
We propose the Attention-based Modality Reweighter (AMR).
Our AMR is plug-and-play, allowing easy integration with multimodal models.
arXiv Detail & Related papers (2024-07-29T13:15:51Z)
- Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective [111.58315434849047]
The robustness of neural information retrieval (IR) models has garnered significant attention.
We view the robustness of IR to be a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios and performance variance.
We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models.
arXiv Detail & Related papers (2024-07-09T16:07:01Z)
- AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models [7.406040859734522]
Unrestricted adversarial attacks present a serious threat to deep learning models and adversarial defense techniques.
Previous attack methods often directly inject Projected Gradient Descent (PGD) gradients into the sampling of generative models.
We propose a new method, called AdvDiff, to generate unrestricted adversarial examples with diffusion models.
arXiv Detail & Related papers (2023-07-24T03:10:02Z)
- Multi-Expert Adversarial Attack Detection in Person Re-identification Using Context Inconsistency [47.719533482898306]
We propose a Multi-Expert Adversarial Attack Detection (MEAAD) approach to detect malicious attacks on person re-identification (ReID) systems.
As the first adversarial attack detection approach for ReID, MEAAD effectively detects various adversarial attacks and achieves high ROC-AUC (over 97.5%).
arXiv Detail & Related papers (2021-08-23T01:59:09Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
- Stealing Deep Reinforcement Learning Models for Fun and Profit [33.64948529132546]
This paper presents the first model extraction attack against Deep Reinforcement Learning (DRL).
It enables an external adversary to precisely recover a black-box DRL model only from its interaction with the environment.
arXiv Detail & Related papers (2020-06-09T03:24:35Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.