Adversarial Attacks on Knowledge Graph Embeddings via Instance
Attribution Methods
- URL: http://arxiv.org/abs/2111.03120v1
- Date: Thu, 4 Nov 2021 19:38:48 GMT
- Title: Adversarial Attacks on Knowledge Graph Embeddings via Instance
Attribution Methods
- Authors: Peru Bhardwaj, John Kelleher, Luca Costabello and Declan O'Sullivan
- Abstract summary: We study data poisoning attacks against Knowledge Graph Embeddings (KGE) models for link prediction.
These attacks craft adversarial additions or deletions at training time to cause model failure at test time.
We propose a method to replace one of the two entities in each influential triple to generate adversarial additions.
- Score: 8.793721044482613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the widespread use of Knowledge Graph Embeddings (KGE), little is
known about the security vulnerabilities that might disrupt their intended
behaviour. We study data poisoning attacks against KGE models for link
prediction. These attacks craft adversarial additions or deletions at training
time to cause model failure at test time. To select adversarial deletions, we
propose to use the model-agnostic instance attribution methods from
Interpretable Machine Learning, which identify the training instances that are
most influential to a neural model's predictions on test instances. We use
these influential triples as adversarial deletions. We further propose a
heuristic method to replace one of the two entities in each influential triple
to generate adversarial additions. Our experiments show that the proposed
strategies outperform the state-of-the-art data poisoning attacks on KGE models and
improve the MRR degradation due to the attacks by up to 62% over the baselines.
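To make the pipeline in the abstract concrete, the following is a minimal sketch. It assumes a DistMult scorer, a binary cross-entropy loss, and the simple gradient-dot-product instance attribution; the paper evaluates several attribution methods (e.g., influence functions) and KGE models, and its addition heuristic is more informed than the random entity replacement used here.

```python
# Minimal sketch of the attack described above (assumptions: DistMult scorer,
# binary cross-entropy loss, gradient-dot-product attribution; the paper also
# studies other attribution methods and other KGE models).
import torch
import torch.nn.functional as F

class DistMult(torch.nn.Module):
    def __init__(self, n_entities, n_relations, dim=100):
        super().__init__()
        self.ent = torch.nn.Embedding(n_entities, dim)
        self.rel = torch.nn.Embedding(n_relations, dim)

    def score(self, s, r, o):
        # DistMult triple score: sum_k e_s[k] * w_r[k] * e_o[k]
        return (self.ent(s) * self.rel(r) * self.ent(o)).sum(dim=-1)

def triple_grad(model, triple):
    """Flattened gradient of the triple's loss w.r.t. all model parameters."""
    s, r, o = triple
    loss = F.binary_cross_entropy_with_logits(
        model.score(s, r, o), torch.ones_like(s, dtype=torch.float))
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def adversarial_edits(model, target_triple, train_triples, n_entities):
    """Rank training triples by gradient-dot-product attribution w.r.t. the
    target test triple; the top triple becomes the adversarial deletion, and
    a corrupted copy of it (one entity replaced) the adversarial addition."""
    g_target = triple_grad(model, target_triple)
    scores = torch.stack([g_target @ triple_grad(model, t) for t in train_triples])
    deletion = train_triples[int(scores.argmax())]
    s, r, o = deletion
    # Illustrative stand-in for the paper's replacement heuristic: swap the
    # object entity for a random entity (may collide with the original).
    addition = (s, r, torch.randint(0, n_entities, o.shape))
    return deletion, addition
```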
Related papers
- Untargeted Adversarial Attack on Knowledge Graph Embeddings [18.715565468700227]
Knowledge graph embedding (KGE) methods have achieved great success in handling various knowledge graph (KG) downstream tasks.
Some recent studies propose adversarial attacks to investigate the vulnerabilities of KGE methods, but these attacks are target-oriented, i.e., designed around specific target triples of a given KGE method.
In this work, we explore untargeted attacks with the aim of reducing the global performances of KGE methods over a set of unknown test triples.
arXiv Detail & Related papers (2024-05-08T18:08:11Z)
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
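A toy sketch of the stated objective only, not the GRO algorithm itself: alternate between training an extraction-style surrogate to imitate the protected model and updating the protected model to keep its own task loss low while pushing the surrogate's imitation loss up. The concrete models, losses, and single-step updates are illustrative assumptions, and recommendation-specific ranking losses are abstracted away.

```python
# Toy sketch of the stated min-max objective only; the concrete models,
# losses, and alternating single-step updates are illustrative assumptions,
# not the GRO algorithm.
import torch
import torch.nn.functional as F

def defense_step(target, surrogate, x, y, opt_target, opt_surrogate, lam=1.0):
    # 1) Extraction-style surrogate update: imitate the target's outputs.
    imitation = F.mse_loss(surrogate(x), target(x).detach())
    opt_surrogate.zero_grad()
    imitation.backward()
    opt_surrogate.step()
    # 2) Defender update: keep the target's own task loss low while pushing
    #    up the surrogate's imitation loss (hence the minus sign).
    loss = F.cross_entropy(target(x), y) \
        - lam * F.mse_loss(surrogate(x).detach(), target(x))
    opt_target.zero_grad()
    loss.backward()
    opt_target.step()
```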
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks [17.584752814352502]
Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data.
We introduce a self-supervised, computationally economical method for generating adversarial examples.
Our experiments consistently demonstrate the method is effective across various models, unseen data categories, and even defended models.
arXiv Detail & Related papers (2023-10-05T17:34:47Z)
- Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), an adversarial-attack-based method that generates semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
- The Space of Adversarial Strategies [6.295859509997257]
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade.
We propose a systematic approach to characterize worst-case (i.e., optimal) adversaries.
arXiv Detail & Related papers (2022-09-09T20:53:11Z)
- Learning to Learn Transferable Attack [77.67399621530052]
A transfer adversarial attack is a non-trivial black-box attack that crafts adversarial perturbations on a surrogate model and then applies them to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate compared with state-of-the-art methods.
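As a point of reference for the transfer setting described above, here is a minimal baseline that crafts a one-step FGSM perturbation on a surrogate and measures whether it fools a separate victim model; LLTA's data/model augmentation and meta-learning are not reproduced, so this illustrates only the threat model, not the method.

```python
# Minimal transfer-attack baseline: craft a perturbation on a surrogate model
# and test it against a separate victim model. LLTA's data/model augmentation
# and meta-learning are not reproduced here.
import torch
import torch.nn.functional as F

def fgsm_transfer_rate(surrogate, victim, x, y, eps=8 / 255):
    x_adv = x.clone().detach().requires_grad_(True)
    # One-step FGSM perturbation computed entirely on the surrogate.
    loss = F.cross_entropy(surrogate(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x_adv.detach() + eps * grad.sign()).clamp(0, 1)
    # Transfer success rate: fraction of surrogate-crafted examples that the
    # victim misclassifies.
    with torch.no_grad():
        return (victim(x_adv).argmax(dim=1) != y).float().mean()
```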
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Tolerating Adversarial Attacks and Byzantine Faults in Distributed Machine Learning [12.464625883462515]
Adversarial attacks attempt to disrupt the training, retraining and use of artificial intelligence and machine learning models.
We propose a novel distributed training algorithm, partial synchronous gradient descent (ParSGD), which defends adversarial attacks and/or tolerates Byzantine faults.
Our results show that with ParSGD, ML models can still produce accurate predictions, as if they were neither under attack nor experiencing failures, even when almost half of the nodes are compromised or have failed.
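For intuition about how an aggregation rule can tolerate a near-half fraction of corrupted workers, here is an illustrative sketch; the specific rule shown (a coordinate-wise median over the first workers to respond) is an assumption for illustration and not necessarily ParSGD's actual update.

```python
# Illustrative sketch only: a partially synchronous, Byzantine-tolerant
# aggregation rule in the spirit of the setting above. The specific choice
# (coordinate-wise median over the first workers to respond) is an assumption,
# not necessarily ParSGD's actual update rule.
import torch

def robust_partial_aggregate(worker_grads, n_wait):
    """worker_grads: flattened gradient tensors, ordered by arrival time.
    Wait for only the first n_wait workers (partial synchrony), then take a
    coordinate-wise median so that a minority of corrupted gradients cannot
    arbitrarily shift the aggregated update."""
    received = torch.stack(worker_grads[:n_wait])
    return received.median(dim=0).values
```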
arXiv Detail & Related papers (2021-09-05T07:55:02Z)
- AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning [61.8003954296545]
We propose a novel adversarial-training-based model built on Attention Guided Knowledge Distillation and Bi-directional Metric Learning (AGKD-BML).
Our proposed AGKD-BML model consistently outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-13T01:25:04Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Asymptotic Behavior of Adversarial Training in Binary Classification [41.7567932118769]
Adversarial training is considered to be the state-of-the-art method for defense against adversarial attacks.
Despite being successful in practice, several problems in understanding performance of adversarial training remain open.
We derive precise theoretical predictions for the asymptotic behavior of adversarial training in binary classification.
arXiv Detail & Related papers (2020-10-26T01:44:20Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
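For context, the standard AT step that the summary describes, and which ADT generalizes by learning a distribution over perturbations rather than fixing a single attack, looks roughly like the sketch below (a PGD inner maximization is assumed; ADT itself is not implemented here).

```python
# Sketch of a standard adversarial-training (AT) step: augment the batch with
# adversarial examples from a fixed attack (multi-step PGD here) and train on
# them. ADT replaces the single fixed attack with a learned distribution of
# perturbations, which is not shown in this sketch.
import torch
import torch.nn.functional as F

def pgd_adv_train_step(model, opt, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        # Gradient-ascent step, then projection onto the L-infinity ball of
        # radius eps around x and the valid pixel range [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    opt.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    opt.step()
```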
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.