Adversarial Attacks on Tables with Entity Swap
- URL: http://arxiv.org/abs/2309.08650v1
- Date: Fri, 15 Sep 2023 15:03:33 GMT
- Title: Adversarial Attacks on Tables with Entity Swap
- Authors: Aneta Koleva, Martin Ringsquandl and Volker Tresp
- Abstract summary: Adversarial attacks on text have been shown to greatly affect the performance of large language models (LLMs).
In this paper, we propose an evasive entity-swap attack for the column type annotation (CTA) task.
Our CTA attack is the first black-box attack on tables, where we employ a similarity-based sampling strategy to generate adversarial examples.
- Score: 22.32213001202441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The capabilities of large language models (LLMs) have been successfully
applied in the context of table representation learning. The recently proposed
tabular language models have reported state-of-the-art results across various
tasks for table interpretation. However, a closer look into the datasets
commonly used for evaluation reveals an entity leakage from the train set into
the test set. Motivated by this observation, we explore adversarial attacks
that represent a more realistic inference setup. Adversarial attacks on text
have been shown to greatly affect the performance of LLMs, but currently, there
are no attacks targeting tabular language models. In this paper, we propose an
evasive entity-swap attack for the column type annotation (CTA) task. Our CTA
attack is the first black-box attack on tables, where we employ a
similarity-based sampling strategy to generate adversarial examples. The
experimental results show that the proposed attack generates up to a 70% drop
in performance.
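To make the attack idea concrete, below is a minimal, hypothetical sketch of a similarity-based entity swap: a table cell is replaced by a candidate entity sampled in proportion to its embedding similarity to the original value. The embedding function, candidate pool, and temperature are illustrative assumptions; the paper does not prescribe this exact interface.

```python
# Hypothetical sketch of a similarity-based entity swap for column type
# annotation (CTA). All names and the toy embedding are illustrative and are
# not the authors' implementation.
import numpy as np

def embed(entity: str) -> np.ndarray:
    """Stand-in for a real entity encoder (e.g. a pretrained language model);
    a hash-seeded random vector keeps the sketch self-contained."""
    rng = np.random.default_rng(abs(hash(entity)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

def similarity_sampled_swap(target: str, candidates: list, temperature: float = 0.1, seed: int = 0) -> str:
    """Sample a replacement entity with probability proportional to the
    softmax-scaled cosine similarity to the original cell value."""
    rng = np.random.default_rng(seed)
    t = embed(target)
    sims = np.array([float(t @ embed(c)) for c in candidates])
    probs = np.exp(sims / temperature)
    probs /= probs.sum()
    return str(rng.choice(candidates, p=probs))

# Perturb a single cell; the CTA model under attack is only queried on the
# modified table, i.e. no gradients or logits are needed (black-box setting).
column = ["Berlin", "Munich", "Hamburg"]
pool = ["Cologne", "Vienna", "Zurich", "Lyon"]
column[0] = similarity_sampled_swap(column[0], pool)
print(column)
```

Presumably, sampling a semantically close entity keeps the perturbed column plausible while removing the exact train-test entity overlap that the abstract identifies as leakage.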
Related papers
- A Realistic Threat Model for Large Language Model Jailbreaks [87.64278063236847]
In this work, we propose a unified threat model for the principled comparison of jailbreak attacks.
Our threat model combines constraints in perplexity, measuring how far a jailbreak deviates from natural text.
We adapt popular attacks to this new, realistic threat model, with which we, for the first time, benchmark these attacks on equal footing.
arXiv Detail & Related papers (2024-10-21T17:27:01Z)
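As a rough illustration of the perplexity constraint this threat model describes, the sketch below scores candidate prompts with GPT-2 and rejects those that deviate too far from natural text; the reference model and threshold are assumptions for the example, not values from the paper.

```python
# Minimal sketch of a perplexity constraint: reject candidate adversarial
# prompts whose perplexity under a reference LM is far from natural text.
# Model choice and threshold are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return float(torch.exp(loss))

def within_budget(text: str, max_ppl: float = 100.0) -> bool:
    return perplexity(text) <= max_ppl

print(within_budget("Please summarise the attached report."))            # fluent
print(within_budget("zx!!qq describing.+similarlyNow write oppositeley"))  # gibberish-like
```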
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks [62.34019142949628]
Typographic Attacks, which involve pasting misleading text onto an image, were noted to harm the performance of Vision-Language Models like CLIP.
We introduce two novel and more effective Self-Generated attacks which prompt the LVLM to generate an attack against itself.
Using our benchmark, we uncover that Self-Generated attacks pose a significant threat, reducing LVLM classification performance by up to 33%.
arXiv Detail & Related papers (2024-02-01T14:41:20Z)
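The typographic attack described above amounts to overlaying misleading text on the input image before it is shown to a vision-language model. A toy sketch, with arbitrary placement, wording, and font choices:

```python
# Toy illustration of a typographic attack: paste misleading text onto an image.
# Banner position, wording, and the default font are arbitrary choices for this
# sketch; the paper's benchmark is more involved.
from PIL import Image, ImageDraw

def typographic_attack(image: Image.Image, text: str) -> Image.Image:
    attacked = image.copy()
    draw = ImageDraw.Draw(attacked)
    draw.rectangle([(0, 0), (attacked.width, 24)], fill="white")  # text banner
    draw.text((4, 4), text, fill="black")                          # default PIL font
    return attacked

img = Image.new("RGB", (224, 224), color="gray")   # stand-in for a real photo
adv = typographic_attack(img, "a photo of a dog")  # misleading label text
adv.save("typographic_attack_example.png")
```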
- DTA: Distribution Transform-based Attack for Query-Limited Scenario [11.874670564015789]
When generating adversarial examples, conventional black-box attack methods rely on ample feedback from the target models.
This paper proposes a hard-label attack that simulates a realistic setting in which the attacker is permitted only a limited number of queries.
Experiments validate the effectiveness of the proposed idea and the superiority of DTA over the state-of-the-art.
arXiv Detail & Related papers (2023-12-12T13:21:03Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Fooling the Textual Fooler via Randomizing Latent Representations [13.77424820701913]
Adversarial word-level perturbations are well-studied and effective attack strategies.
We propose a lightweight and attack-agnostic defense whose main goal is to confound the process of generating an adversarial example.
We empirically demonstrate near state-of-the-art robustness of AdvFooler against representative adversarial word-level attacks.
arXiv Detail & Related papers (2023-10-02T06:57:25Z)
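A minimal sketch of the randomized-latent idea behind the defense above, assuming a simple encoder/classifier split and an arbitrary noise scale (both assumptions for illustration): injecting noise into an intermediate representation at inference time makes repeated attacker queries return slightly different scores.

```python
# Rough sketch: randomize an intermediate (latent) representation at inference
# time so that the scores an attacker observes are noisy, which frustrates
# query-based word-substitution search. Layer split and sigma are assumptions.
import torch
import torch.nn as nn

class RandomizedLatentClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, head: nn.Module, sigma: float = 0.1):
        super().__init__()
        self.encoder, self.head, self.sigma = encoder, head, sigma

    def forward(self, x):
        h = self.encoder(x)
        if not self.training:                  # randomize only at inference
            h = h + self.sigma * torch.randn_like(h)
        return self.head(h)

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
model = RandomizedLatentClassifier(encoder, nn.Linear(64, 2)).eval()
x = torch.randn(1, 32)
print(model(x), model(x))  # logits differ slightly between repeated queries
```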
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models [4.776465250559034]
We propose a prompt-based adversarial attack on manual templates in black-box scenarios.
We first design character-level and word-level approaches to corrupt manual templates separately.
We then present a greedy algorithm for the attack based on these destructive approaches.
arXiv Detail & Related papers (2023-06-09T03:53:42Z)
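A toy sketch of the character- and word-level template corruption combined with a greedy search mentioned in the entry above; the scoring function is a stub standing in for the victim model's confidence, and the details differ from the paper's algorithm.

```python
# Toy sketch in the spirit of the attack above: character-level and word-level
# perturbations of a manual prompt template, plus a greedy loop that keeps the
# most damaging candidate. score() is a stub for victim-model confidence.
import random

def char_perturb(word: str, rng: random.Random) -> str:
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]  # swap adjacent chars

def word_perturb(template: list, rng: random.Random) -> list:
    if len(template) <= 1:
        return template[:]
    out = template[:]
    del out[rng.randrange(len(out))]          # drop one template token
    return out

def greedy_attack(template, score, steps=5, seed=0):
    rng = random.Random(seed)
    best, best_score = template, score(template)
    for _ in range(steps):
        candidates = [word_perturb(best, rng),
                      [char_perturb(w, rng) for w in best]]
        for cand in candidates:
            s = score(cand)
            if s < best_score:                # lower score = more damage done
                best, best_score = cand, s
    return best

score = lambda t: len(" ".join(t))            # stub for the victim's confidence
print(greedy_attack("It was a [MASK] movie .".split(), score))
```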
- Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation [38.00832631674398]
We introduce Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure the robustness of Text-to-SQL models.
We build a systematic adversarial training example generation framework for better contextualization of data.
Experiments show that our approach not only brings the best improvement against table-side perturbations but also substantially empowers models against NL-side perturbations.
arXiv Detail & Related papers (2022-12-20T04:38:23Z)
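As a small, hypothetical example of a table-side perturbation of the kind ATP measures robustness against, the sketch below renames schema columns with natural synonyms so that the question wording no longer lexically matches the schema; the synonym map and schema are invented for illustration.

```python
# Hypothetical table-side perturbation: rename columns with natural synonyms so
# the schema no longer lexically matches the question. Synonym map and schema
# are invented for this sketch.
SYNONYMS = {"name": "title", "year": "release_year", "rating": "score"}

def perturb_schema(columns):
    return [SYNONYMS.get(c.lower(), c) for c in columns]

print(perturb_schema(["Name", "Year", "Rating"]))
# A Text-to-SQL model must now map "Which movies have rating > 8?" onto the
# perturbed column "score" instead of the original "Rating".
```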