Adversary Aware Continual Learning
- URL: http://arxiv.org/abs/2304.14483v1
- Date: Thu, 27 Apr 2023 19:49:50 GMT
- Title: Adversary Aware Continual Learning
- Authors: Muhammad Umer and Robi Polikar
- Abstract summary: An adversary can introduce a small amount of misinformation into the model to cause deliberate forgetting of a specific task or class at test time.
We turn the attacker's primary strength, hiding the backdoor pattern by making it imperceptible to humans, against it, and propose to learn a perceptible (stronger) pattern that can overpower the attacker's imperceptible pattern.
We show that our proposed defensive framework considerably improves the performance of class incremental learning algorithms with no knowledge of the attacker's target task, target class, or imperceptible pattern.
- Score: 3.3439097577935213
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Class incremental learning approaches are useful because they help a model
learn new information (classes) sequentially while retaining previously acquired
information (classes). However, such approaches have been shown to be extremely
vulnerable to adversarial backdoor attacks, in which an intelligent adversary
introduces a small amount of misinformation into the model, in the form of an
imperceptible backdoor pattern inserted during training, to cause deliberate
forgetting of a specific task or class at test time. In this work, we propose a
novel defensive framework to counter such an insidious attack: we turn the
attacker's primary strength, hiding the backdoor pattern by making it
imperceptible to humans, against it, and propose to learn a perceptible
(stronger) pattern, also during training, that can overpower the attacker's
imperceptible (weaker) pattern. We demonstrate the effectiveness of the proposed
defensive mechanism on several commonly used replay-based (both generative and
exact replay) class incremental learning algorithms, using continual learning
benchmark variants of the CIFAR-10, CIFAR-100, and MNIST datasets. Most notably,
our defensive framework does not assume that the attacker's target task and
target class are known to the defender, and the defender is also unaware of the
shape, size, and location of the attacker's pattern. We show that our defensive
framework considerably improves the performance of class incremental learning
algorithms despite having no knowledge of the attacker's target task, target
class, or imperceptible pattern. We term our defensive framework Adversary Aware
Continual Learning (AACL).
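The core idea can be made concrete with a small sketch: the defender stamps a clearly visible pattern onto a fraction of the training samples while keeping their true labels, so the model binds the stronger, perceptible pattern to the correct class and it can dominate an attacker's low-amplitude trigger at test time. The snippet below is only an illustration of this idea under stated assumptions, not the authors' implementation; the patch shape, location, intensity, and the fraction of stamped samples are all hypothetical choices.

```python
# Minimal sketch (not the AACL reference code) of stamping a perceptible
# defensive pattern onto training images. All names and hyperparameters
# below are illustrative assumptions.
import numpy as np

def apply_perceptible_pattern(images: np.ndarray,
                              patch_size: int = 4,
                              intensity: float = 1.0) -> np.ndarray:
    """Stamp a high-contrast square patch in the bottom-right corner.

    images: float array in [0, 1] with shape (N, H, W, C).
    The patch is deliberately perceptible (full intensity), in contrast to
    an attacker's imperceptible, low-amplitude trigger.
    """
    stamped = images.copy()
    stamped[:, -patch_size:, -patch_size:, :] = intensity
    return stamped

def defend_batch(images: np.ndarray, labels: np.ndarray,
                 rng: np.random.Generator, frac: float = 0.3):
    """Apply the defensive pattern to a random fraction of the batch.

    The stamped samples keep their original (correct) labels, so the model
    associates the perceptible pattern with the true class rather than with
    an attacker's chosen target class."""
    n = len(images)
    idx = rng.choice(n, size=int(frac * n), replace=False)
    images = images.copy()
    images[idx] = apply_perceptible_pattern(images[idx])
    return images, labels

# Example usage on a dummy CIFAR-10-sized batch:
rng = np.random.default_rng(0)
x = rng.random((8, 32, 32, 3)).astype(np.float32)
y = rng.integers(0, 10, size=8)
x_def, y_def = defend_batch(x, y, rng)
```

In a class incremental setting, such a stamping step would be applied to each incoming task's training batches (and to any replayed samples), which is consistent with the paper's use of replay-based learners, though the exact placement in the training loop is an assumption here.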
Related papers
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - On the Difficulty of Defending Contrastive Learning against Backdoor
Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Learn from the Past: A Proxy Guided Adversarial Defense Framework with
Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, relying on direct iterative updates for the target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z) - Learning to Backdoor Federated Learning [9.046972927978997]
In a federated learning (FL) system, malicious participants can easily embed backdoors into the aggregated model.
We propose a general reinforcement learning-based backdoor attack framework.
Our framework is both adaptive and flexible and achieves strong attack performance and durability even under state-of-the-art defenses.
arXiv Detail & Related papers (2023-03-06T17:47:04Z) - Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class [17.391987602738606]
In recent years, machine learning models have been shown to be vulnerable to backdoor attacks.
This paper exploits a novel backdoor attack with a much more powerful payload, denoted as Marksman.
We show empirically that the proposed framework achieves high attack performance while preserving the clean-data performance in several benchmark datasets.
arXiv Detail & Related papers (2022-10-17T15:46:57Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the
Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Sparse Coding Frontend for Robust Neural Networks [11.36192454455449]
Deep Neural Networks are known to be vulnerable to small, adversarially crafted, perturbations.
Current defense methods against these adversarial attacks are variants of adversarial training.
In this paper, we introduce a radically different defense: a sparse coding frontend trained on clean images.
arXiv Detail & Related papers (2021-04-12T11:14:32Z) - Untargeted, Targeted and Universal Adversarial Attacks and Defenses on
Time Series [0.0]
We have performed untargeted, targeted and universal adversarial attacks on UCR time series datasets.
Our results show that deep learning based time series classification models are vulnerable to these attacks.
We also show that universal adversarial attacks generalize well, as they need only a fraction of the training data.
arXiv Detail & Related papers (2021-01-13T13:00:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.