Resilience from Diversity: Population-based approach to harden models
against adversarial attacks
- URL: http://arxiv.org/abs/2111.10272v1
- Date: Fri, 19 Nov 2021 15:22:21 GMT
- Title: Resilience from Diversity: Population-based approach to harden models
against adversarial attacks
- Authors: Jasser Jasser and Ivan Garibay
- Abstract summary: This work introduces a model that is resilient to adversarial attacks.
Our model leverages a well-established principle from biological sciences: population diversity produces resilience against environmental changes.
A Counter-Linked Model (CLM) consists of submodels of the same architecture where a periodic random similarity examination is conducted.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional deep learning models exhibit intriguing vulnerabilities that
allow an attacker to force them to fail at their task. Notorious attacks such
as the Fast Gradient Sign Method (FGSM) and the more powerful Projected
Gradient Descent (PGD) generate adversarial examples by perturbing the input by a
magnitude $\epsilon$ along the sign of its computed gradient, degrading the
model's classification performance. This work
introduces a model that is resilient to adversarial attacks. Our model
leverages a well-established principle from biological sciences: population
diversity produces resilience against environmental changes. More precisely,
our model consists of a population of $n$ diverse submodels, each one of them
trained to individually obtain a high accuracy for the task at hand, while
forced to maintain meaningful differences in their weight tensors. Each time
our model receives a classification query, it selects a submodel from its
population at random to answer the query. To introduce and maintain diversity
in the population of submodels, we introduce the concept of counter-linking
weights. A Counter-Linked Model (CLM) consists of submodels of the same
architecture where a periodic random similarity examination is conducted during
the simultaneous training to guarantee diversity while maintaining accuracy. In
our testing, CLM robustness improved by around 20% on the MNIST dataset and by
at least 15% on the CIFAR-10 dataset. When implemented
with adversarially trained submodels, this methodology achieves
state-of-the-art robustness. On the MNIST dataset with $\epsilon=0.3$, it
achieved 94.34% against FGSM and 91% against PGD. On the CIFAR-10 dataset with
$\epsilon=8/255$, it achieved 62.97% against FGSM and 59.16% against PGD.
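The population-based defense and the periodic similarity examination described above can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch sketch, not the authors' released code: the names `PopulationModel`, `similarity_check`, and `fgsm_attack`, the cosine-similarity test, and the noise-based re-diversification step are assumptions introduced here for illustration, since the abstract does not specify the exact counter-linking update.

```python
# Illustrative sketch only: the cosine-similarity test and the noise-based
# re-diversification below are assumptions, not the paper's exact procedure.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class PopulationModel(nn.Module):
    """A population of n submodels of the same architecture; each query is
    answered by a submodel chosen uniformly at random."""

    def __init__(self, make_submodel, n: int = 5):
        super().__init__()
        self.submodels = nn.ModuleList([make_submodel() for _ in range(n)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return random.choice(self.submodels)(x)


def _flat_weights(model: nn.Module) -> torch.Tensor:
    # Flatten all weight tensors into one vector for comparison.
    return torch.cat([p.detach().flatten() for p in model.parameters()])


def similarity_check(population: PopulationModel,
                     threshold: float = 0.9,
                     noise_scale: float = 1e-2) -> None:
    """Periodic random similarity examination: pick a random pair of submodels
    and, if their flattened weights are too similar, nudge one with small
    Gaussian noise to restore diversity (assumed mechanism)."""
    a, b = random.sample(list(population.submodels), 2)
    sim = F.cosine_similarity(_flat_weights(a), _flat_weights(b), dim=0)
    if sim > threshold:
        with torch.no_grad():
            for p in b.parameters():
                p.add_(noise_scale * torch.randn_like(p))


def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.3) -> torch.Tensor:
    """One-step FGSM: perturb the input by epsilon along the sign of the
    gradient of the loss with respect to the input."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
```

During simultaneous training of the submodels, `similarity_check` would be invoked every few optimization steps; at query time, the random selection means the gradient an attacker computes (e.g., via `fgsm_attack` against one submodel) need not match the submodel that answers the next query.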
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach that encourages classification models to produce similar features for inputs within the same class despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning [89.98353600316285]
We introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that the model performance on the tailed classes varies over different training stages.
This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes.
Compared to other methods such as the FixMatch baseline, UDTS achieves accuracy gains of at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets.
arXiv Detail & Related papers (2024-01-09T08:59:39Z)
- Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific training and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
- Enhancing Targeted Attack Transferability via Diversified Weight Pruning [0.3222802562733786]
Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images.
With cross-model transferable adversarial examples, the vulnerability of neural networks remains even if the model information is kept secret from the attacker.
Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples.
arXiv Detail & Related papers (2022-08-18T07:25:48Z)
- Suppressing Poisoning Attacks on Federated Learning for Medical Imaging [4.433842217026879]
We propose a robust aggregation rule called Distance-based Outlier Suppression (DOS) that is resilient to byzantine failures.
The proposed method computes the distance between local parameter updates of different clients and obtains an outlier score for each client.
The resulting outlier scores are converted into normalized weights using a softmax function, and a weighted average of the local parameters is used for updating the global model.
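A minimal sketch of the aggregation just described, assuming flattened client updates in NumPy; the choice of Euclidean distance, the mean-distance outlier score, the sign convention in the softmax, and the name `dos_aggregate` are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of distance-based outlier suppression for aggregation;
# the distance measure and scoring below are assumptions.
import numpy as np


def dos_aggregate(updates: np.ndarray) -> np.ndarray:
    """Aggregate client updates of shape [num_clients, num_params]:
    pairwise distances -> per-client outlier score -> softmax weights
    (outliers down-weighted) -> weighted average."""
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=-1)
    scores = dists.mean(axis=1)                        # higher = more outlying
    weights = np.exp(-scores) / np.exp(-scores).sum()  # softmax of -scores
    return weights @ updates
```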
arXiv Detail & Related papers (2022-07-15T00:43:34Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy.
arXiv Detail & Related papers (2020-11-28T00:08:45Z)
- Learnable Boundary Guided Adversarial Training [66.57846365425598]
We use the logits from one clean model to guide the learning of another, robust model.
We achieve new state-of-the-art robustness on CIFAR-100 without additional real or synthetic data.
arXiv Detail & Related papers (2020-11-23T01:36:05Z)
- DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles [20.46399318111058]
Adversarial attacks can mislead CNN models with small perturbations, which can effectively transfer between different models trained on the same dataset.
We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features.
The novel diversity metric and training procedure enables DVERGE to achieve higher robustness against transfer attacks.
arXiv Detail & Related papers (2020-09-30T14:57:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.