Related papers: Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning

Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning

URL: http://arxiv.org/abs/2507.04106v1
Date: Sat, 05 Jul 2025 17:26:52 GMT
Title: Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning
Authors: Stanisław Pawlak, Bartłomiej Twardowski, Tomasz Trzciński, Joost van de Weijer,
Abstract summary: Research addresses the overlooked security concerns related to data poisoning in continual learning (CL)<n>Data poisoning was recently shown to be a threat to CL training stability.<n>In contrast to previously proposed poisoning settings, adversaries lack knowledge and access to the model.
Score: 11.525308323843852
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Our research addresses the overlooked security concerns related to data poisoning in continual learning (CL). Data poisoning - the intentional manipulation of training data to affect the predictions of machine learning models - was recently shown to be a threat to CL training stability. While existing literature predominantly addresses scenario-dependent attacks, we propose to focus on a more simple and realistic single-task poison (STP) threats. In contrast to previously proposed poisoning settings, in STP adversaries lack knowledge and access to the model, as well as to both previous and future tasks. During an attack, they only have access to the current task within the data stream. Our study demonstrates that even within these stringent conditions, adversaries can compromise model performance using standard image corruptions. We show that STP attacks are able to strongly disrupt the whole continual training process: decreasing both the stability (its performance on past tasks) and plasticity (capacity to adapt to new tasks) of the algorithm. Finally, we propose a high-level defense framework for CL along with a poison task detection method based on task vectors. The code is available at https://github.com/stapaw/STP.git .

Related papers

Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning [20.44830200702146]
The rise of foundation models fine-tuned on human feedback has increased the risk of adversarial data poisoning.<n>We present a novel framework for computing certified bounds on the impact of dynamic poisoning.<n>We use these certificates to design robust learning algorithms.
arXiv Detail & Related papers (2025-02-23T22:40:56Z)
Persistent Pre-Training Poisoning of LLMs [71.53046642099142]
Our work evaluates for the first time whether language models can also be compromised during pre-training. We pre-train a series of LLMs from scratch to measure the impact of a potential poisoning adversary. Our main result is that poisoning only 0.1% of a model's pre-training dataset is sufficient for three out of four attacks to persist through post-training.
arXiv Detail & Related papers (2024-10-17T16:27:13Z)
Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors [26.36344184385407]
In this paper, we explore the threat of indiscriminate attacks on downstream tasks that apply pre-trained feature extractors. We propose two types of attacks: (1) the input space attacks, where we modify existing attacks to craft poisoned data in the input space; and (2) the feature targeted attacks, where we find poisoned features by treating the learned feature representations as a dataset. Our experiments examine such attacks in popular downstream tasks of fine-tuning on the same dataset and transfer learning that considers domain adaptation.
arXiv Detail & Related papers (2024-02-20T01:12:59Z)
Pre-trained Trojan Attacks for Visual Recognition [106.13792185398863]
Pre-trained vision models (PVMs) have become a dominant component due to their exceptional performance when fine-tuned for downstream tasks. We propose the Pre-trained Trojan attack, which embeds backdoors into a PVM, enabling attacks across various downstream vision tasks. We highlight the challenges posed by cross-task activation and shortcut connections in successful backdoor attacks.
arXiv Detail & Related papers (2023-12-23T05:51:40Z)
PACOL: Poisoning Attacks Against Continual Learners [1.569413950416037]
In this work, we demonstrate that continual learning systems can be manipulated by malicious misinformation. We present a new category of data poisoning attacks specific for continual learners, which we refer to as em Poisoning Attacks Against Continual learners (PACOL) A comprehensive set of experiments shows the vulnerability of commonly used generative replay and regularization-based continual learning approaches against attack methods.
arXiv Detail & Related papers (2023-11-18T00:20:57Z)
On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior. We propose textitAutoPoison, an automated data poisoning pipeline. Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
arXiv Detail & Related papers (2023-06-28T17:54:04Z)
Data Poisoning Attack Aiming the Vulnerability of Continual Learning [25.480762565632332]
We present a simple task-specific data poisoning attack that can be used in the learning process of a new task. We experiment with the attack on the two representative regularization-based continual learning methods.
arXiv Detail & Related papers (2022-11-29T02:28:05Z)
Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects. Our work validates that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks. We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable. We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality. We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers. Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data. We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label" We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.