SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense
- URL: http://arxiv.org/abs/2506.08255v1
- Date: Mon, 09 Jun 2025 21:43:56 GMT
- Title: SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense
- Authors: Patryk Krukowski, Łukasz Gorczyca, Piotr Helm, Kamil Książek, Przemysław Spurek
- Abstract summary: We introduce SHIELD, a novel approach that integrates hypernetwork-based continual learning with interval arithmetic. Our approach enhances security without sacrificing network adaptability, addressing the overlooked challenge of safety in continual learning.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional deep neural networks suffer from several limitations, including catastrophic forgetting. When models are adapted to new datasets, they tend to quickly forget previously learned knowledge. Another significant issue is the lack of robustness to even small perturbations in the input data. In practice, adversarial attacks can often easily change a network's predictions by adding minimal noise to the input. Dedicated architectures and training procedures can solve each of these problems separately. Unfortunately, no current model can simultaneously address both catastrophic forgetting and vulnerability to adversarial attacks. We introduce SHIELD (Secure Hypernetworks for Incremental Expansion and Learning Defense), a novel approach that integrates hypernetwork-based continual learning with interval arithmetic. SHIELD uses a hypernetwork to map trainable task embedding vectors to the weights of a target model dedicated to specific data. This paradigm allows for the dynamic generation of a separate network for each subtask, while the hypernetwork aggregates and analyzes information across all tasks. The target model takes as input a data sample with a defined interval range and, by creating a hypercube around it, produces a prediction valid for the entire range. Such target models therefore provide strict guarantees against all possible attacks on data samples within the interval range. Our approach enhances security without sacrificing network adaptability, addressing the overlooked challenge of safety in continual learning.
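The mechanism described above can be illustrated with a short, self-contained sketch. This is not the authors' implementation: the two-layer target network, the layer sizes, and names such as HyperNet, ibp_linear, and certified_logits are illustrative assumptions. The sketch shows a hypernetwork mapping a trainable task embedding to target-network weights, and interval bound propagation of an input hypercube [x - eps, x + eps] through that target network to obtain logit bounds that hold for every point in the range.

```python
# Minimal sketch (not the authors' code) of the SHIELD idea: a hypernetwork maps a
# trainable task embedding to the weights of a target MLP, and the target MLP is
# evaluated with interval bound propagation (IBP) so that the returned logit bounds
# cover every input inside the hypercube [x - eps, x + eps].
import torch
import torch.nn as nn
import torch.nn.functional as F

IN_DIM, HID_DIM, OUT_DIM, EMB_DIM = 784, 64, 10, 32  # illustrative sizes

class HyperNet(nn.Module):
    """Maps a task embedding to the flattened weights of a two-layer target MLP."""
    def __init__(self):
        super().__init__()
        self.n_params = IN_DIM * HID_DIM + HID_DIM + HID_DIM * OUT_DIM + OUT_DIM
        self.net = nn.Sequential(nn.Linear(EMB_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, self.n_params))

    def forward(self, emb):
        flat = self.net(emb)
        i = 0
        W1 = flat[i:i + IN_DIM * HID_DIM].view(HID_DIM, IN_DIM); i += IN_DIM * HID_DIM
        b1 = flat[i:i + HID_DIM]; i += HID_DIM
        W2 = flat[i:i + HID_DIM * OUT_DIM].view(OUT_DIM, HID_DIM); i += HID_DIM * OUT_DIM
        b2 = flat[i:i + OUT_DIM]
        return (W1, b1), (W2, b2)

def ibp_linear(lo, hi, W, b=None):
    # Propagate the interval [lo, hi] through y = x W^T + b using midpoint/radius form.
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    y_mid = F.linear(mid, W, b)
    y_rad = F.linear(rad, W.abs())
    return y_mid - y_rad, y_mid + y_rad

def certified_logits(x, eps, layers):
    # Logit bounds valid for the whole eps-hypercube around x.
    # ReLU is monotone, so it can be applied elementwise to both bounds.
    (W1, b1), (W2, b2) = layers
    lo, hi = x - eps, x + eps
    lo, hi = ibp_linear(lo, hi, W1, b1)
    lo, hi = torch.relu(lo), torch.relu(hi)
    return ibp_linear(lo, hi, W2, b2)

# One trainable embedding per task; the hypernetwork is shared across all tasks.
task_embeddings = nn.ParameterList([nn.Parameter(torch.randn(EMB_DIM) * 0.1) for _ in range(3)])
hnet = HyperNet()

x = torch.rand(4, IN_DIM)                 # a batch of inputs
layers = hnet(task_embeddings[0])         # target-network weights for task 0
logit_lo, logit_hi = certified_logits(x, eps=0.01, layers=layers)

# A prediction is certified for the whole hypercube if the lower bound of the
# predicted class exceeds the upper bounds of all other classes.
pred = logit_lo.argmax(dim=1)
worst_other = logit_hi.clone().scatter_(1, pred.unsqueeze(1), float("-inf")).max(dim=1).values
certified = logit_lo.gather(1, pred.unsqueeze(1)).squeeze(1) > worst_other
print(pred, certified)
```

An actual continual learning run would additionally need a training loop built on these bounds and a mechanism for keeping earlier task embeddings intact when new tasks arrive; both are omitted from this sketch.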
Related papers
- MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging [19.916880222546155]
Continual model merging integrates independently fine-tuned models sequentially without access to original training data. MINGLE is a novel framework for test-time continual model merging using a small set of unlabeled test samples. MINGLE consistently surpasses previous state-of-the-art methods by 7-9% on average across diverse task orders.
arXiv Detail & Related papers (2025-05-17T07:24:22Z) - Concealed Adversarial attacks on neural networks for sequential data [2.1879059908547482]
We develop a concealed adversarial attack for different time-series models. It provides more realistic perturbations that are hard to detect by a human or a model discriminator. Our findings highlight the growing challenge of designing robust time series models.
arXiv Detail & Related papers (2025-02-28T11:03:32Z) - NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics using only traffic data from NetFlow records. We address challenges such as unifying network feature representations, learning from large volumes of unlabeled traffic data, and testing on real downstream tasks in DDoS attack detection.
arXiv Detail & Related papers (2024-12-30T00:47:49Z) - Adversarial Robustness through Dynamic Ensemble Learning [0.0]
Adversarial attacks pose a significant threat to the reliability of pre-trained language models (PLMs). This paper presents Adversarial Robustness through Dynamic Ensemble Learning (ARDEL), a novel scheme designed to enhance the robustness of PLMs against such attacks.
arXiv Detail & Related papers (2024-12-20T05:36:19Z) - BiCert: A Bilinear Mixed Integer Programming Formulation for Precise Certified Bounds Against Data Poisoning Attacks [62.897993591443594]
Data poisoning attacks pose one of the biggest threats to modern AI systems.
arXiv Detail & Related papers (2024-12-13T14:56:39Z) - Learning to Forget using Hypernetworks [5.5779348065867085]
HyperForget is a machine unlearning framework that samples models lacking knowledge of targeted data. The unlearned models obtained zero accuracy on the forget set while preserving good accuracy on the retain sets.
arXiv Detail & Related papers (2024-12-01T10:43:11Z) - Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Maintaining Adversarial Robustness in Continuous Learning [10.746120682014437]
Adversarial robustness enhanced by defense algorithms is easily erased as the neural network's weights update to learn new tasks.
We propose a novel gradient projection technique that effectively stabilizes sample gradients from previous data.
This technique can maintain robustness by collaborating with a class of defense algorithms through sample gradient smoothing.
arXiv Detail & Related papers (2024-02-17T05:14:47Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while requiring no exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Adaptive Domain Inference Attack with Concept Hierarchy [4.772368796656325]
Most known model-targeted attacks assume attackers have learned the application domain or training data distribution. Can removing the domain information from model APIs protect models from these attacks? We show that the proposed adaptive domain inference attack (ADI) can still successfully estimate relevant subsets of training data.
arXiv Detail & Related papers (2023-12-22T22:04:13Z) - Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z) - What Can We Learn from Unlearnable Datasets? [107.12337511216228]
Unlearnable datasets have the potential to protect data privacy by preventing deep neural networks from generalizing.
It is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization.
In contrast, we find that networks actually can learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured.
arXiv Detail & Related papers (2023-05-30T17:41:35Z) - Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many of the predictive signals in the data may stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Improving Hyperspectral Adversarial Robustness Under Multiple Attacks [2.741266294612776]
We propose an Adversarial Discriminator Ensemble Network (ADE-Net) to combat this issue.
In the proposed method, a discriminator network is used to separate data by attack type into their specific attack-expert ensemble network.
arXiv Detail & Related papers (2022-10-28T18:21:45Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to mitigate their impact.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Ensemble-in-One: Learning Ensemble within Random Gated Networks for Enhanced Adversarial Robustness [18.514706498043214]
Adversarial attacks have rendered high security risks on modern deep learning systems.
We propose ensemble-in-one (EIO) to train an ensemble within one random gated network (RGN).
EIO consistently outperforms previous ensemble training methods with even less computational overhead.
arXiv Detail & Related papers (2021-03-27T03:13:03Z) - Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.