Related papers: Hiding Backdoors within Event Sequence Data via Poisoning Attacks

URL: http://arxiv.org/abs/2308.10201v2
Date: Sun, 25 Aug 2024 16:47:57 GMT
Title: Hiding Backdoors within Event Sequence Data via Poisoning Attacks
Authors: Alina Ermilova, Elizaveta Kovtun, Dmitry Berestnev, Alexey Zaytsev
Abstract summary: In computer vision, one can shape the output during inference by performing an adversarial attack called poisoning. For sequences of financial transactions of a customer, insertion of a backdoor is harder to perform. We replace a clean model with a poisoned one that is aware of the availability of a backdoor and utilizes this knowledge.
Score: 2.532893215351299
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The financial industry relies on deep learning models for making important decisions. This adoption brings a new danger, as deep black-box models are known to be vulnerable to adversarial attacks. In computer vision, one can shape the output during inference by performing an adversarial attack called poisoning, which introduces a backdoor into the model during training. For sequences of a customer's financial transactions, inserting a backdoor is harder, as models operate over a more complex discrete space of sequences and systematic checks for insecurities occur. We provide a method to introduce concealed backdoors, creating vulnerabilities without altering the model's functionality on uncontaminated data. To achieve this, we replace a clean model with a poisoned one that is aware of the availability of a backdoor and utilizes this knowledge. Our hardest-to-uncover attacks include either an additional supervised detection step for poisoned data that is activated at test time or well-hidden modifications of the model weights. The experimental study provides insights into how these effects vary across different datasets, architectures, and model components. Alternative methods and baselines, such as distillation-type regularization, are also explored but found to be less efficient. Our experiments, conducted on three open transaction datasets and on architectures including LSTM, CNN, and Transformer, not only illuminate the vulnerabilities in contemporary models but can also drive the construction of more robust systems.

Related papers

Kill it with FIRE: On Leveraging Latent Space Directions for Runtime Backdoor Mitigation in Deep Neural Networks [1.9517610560768623]
A well-known vulnerability is a backdoor introduced into a neural network by poisoned training data or a malicious training process. We propose our inference-time backdoor mitigation approach called FIRE. We view the trigger as directions in the latent spaces between layers that can be applied in reverse to correct the inference mechanism.
arXiv Detail & Related papers (2026-02-11T12:13:25Z)
Backdoor Unlearning by Linear Task Decomposition [69.91984435094157]
Foundation models are highly susceptible to adversarial perturbations and targeted backdoor attacks. Existing backdoor removal approaches rely on costly fine-tuning to override the harmful behavior. This raises the question of whether backdoors can be removed without compromising the general capabilities of the models.
arXiv Detail & Related papers (2025-10-16T16:18:07Z)
Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation [9.961238260113916]
We introduce a novel class of backdoors that builds upon recent advancements in architectural backdoors. We show that such attacks are not only feasible but also alarmingly effective. We propose a deterministic mitigation strategy that provides formal guarantees against this new attack vector.
arXiv Detail & Related papers (2025-05-23T19:28:45Z)
A Backdoor Attack Scheme with Invisible Triggers Based on Model Architecture Modification [12.393139669821869]
Traditional backdoor attacks involve injecting malicious samples with specific triggers into the training data. More sophisticated attacks modify the model's architecture directly. A novel backdoor attack method is presented in the paper. It embeds the backdoor within the model's architecture and has the capability to generate inconspicuous and stealthy triggers.
arXiv Detail & Related papers (2024-12-22T07:39:43Z)
Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models [68.40324627475499]
We introduce a novel two-step defense framework named Expose Before You Defend. EBYD unifies existing backdoor defense methods into a comprehensive defense system with enhanced performance. We conduct extensive experiments on 10 image attacks and 6 text attacks across 2 vision datasets and 4 language datasets.
arXiv Detail & Related papers (2024-10-25T09:36:04Z)
Mellivora Capensis: A Backdoor-Free Training Framework on the Poisoned Dataset without Auxiliary Data [29.842087372804905]
This paper addresses the challenges of backdoor attack countermeasures in real-world scenarios. We propose a robust and clean-data-free backdoor defense framework, namely Mellivora Capensis (MeCa), which enables the model trainer to train a clean model on the poisoned dataset.
arXiv Detail & Related papers (2024-05-21T12:20:19Z)
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack. When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model. Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors. We show that this score can be an indicator of the presence of a backdoor even when the models have different architectures. This technique allows for the detection of backdoors in models designed for open-set classification tasks, a setting that is little studied in the literature.
arXiv Detail & Related papers (2024-02-28T21:29:16Z)
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data [26.551317580666353]
Backdoor attacks pose a serious security threat for training neural networks. We propose a novel approach that enables model training on potentially poisoned datasets by utilizing the power of recent diffusion models.
arXiv Detail & Related papers (2023-10-10T07:25:06Z)
Backdoor Learning on Sequence to Sequence Models [94.23904400441957]
In this paper, we study whether sequence-to-sequence (seq2seq) models are vulnerable to backdoor attacks. Specifically, we find that by poisoning only 0.2% of the dataset's samples, we can cause the seq2seq model to generate the designated keyword and even the whole sentence. Extensive experiments on machine translation and text summarization show that our proposed methods can achieve an attack success rate of over 90% on multiple datasets and models.
arXiv Detail & Related papers (2023-05-03T20:31:13Z)
DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection [26.593268413299228]
Federated Learning (FL) allows multiple clients to collaboratively train a Neural Network (NN) model on their private data without revealing the data. DeepSight is a novel model filtering approach for mitigating backdoor attacks. We show that it can mitigate state-of-the-art backdoor attacks with a negligible impact on the model's performance on benign data.
arXiv Detail & Related papers (2022-01-03T17:10:07Z)
Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks. We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers [6.352532169433872]
Backdoor data poisoning attacks have been demonstrated in computer vision research as a potential safety risk for machine learning (ML) systems. Our work builds upon prior backdoor data-poisoning research for ML image classifiers. We find that poisoned models are hard to detect through performance inspection alone.
arXiv Detail & Related papers (2020-04-24T02:58:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.