Resilience from Diversity: Population-based approach to harden models
against adversarial attacks
- URL: http://arxiv.org/abs/2111.10272v1
- Date: Fri, 19 Nov 2021 15:22:21 GMT
- Title: Resilience from Diversity: Population-based approach to harden models
against adversarial attacks
- Authors: Jasser Jasser and Ivan Garibay
- Abstract summary: This work introduces a model that is resilient to adversarial attacks.
Our model leverages a well-established principle from biological sciences: population diversity produces resilience against environmental changes.
A Counter-Linked Model (CLM) consists of submodels of the same architecture where a periodic random similarity examination is conducted.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional deep learning models exhibit intriguing vulnerabilities that
allow an attacker to force them to fail at their task. Notorious attacks such
as the Fast Gradient Sign Method (FGSM) and the more powerful Projected
Gradient Descent (PGD) generate adversarial examples by perturbing the input by a
magnitude $\epsilon$ along the sign of its computed gradient, degrading the
model's classification performance. This work
introduces a model that is resilient to adversarial attacks. Our model
leverages a well-established principle from biological sciences: population
diversity produces resilience against environmental changes. More precisely,
our model consists of a population of $n$ diverse submodels, each one of them
trained to individually obtain a high accuracy for the task at hand, while
forced to maintain meaningful differences in their weight tensors. Each time
our model receives a classification query, it selects a submodel from its
population at random to answer the query. To introduce and maintain diversity
in the population of submodels, we introduce the concept of counter-linking
weights. A Counter-Linked Model (CLM) consists of submodels of the same
architecture where a periodic random similarity examination is conducted during
the simultaneous training to guarantee diversity while maintaining accuracy. In
our testing, CLM robustness improved by around 20% on the MNIST dataset and by
at least 15% on the CIFAR-10 dataset. When implemented
with adversarially trained submodels, this methodology achieves
state-of-the-art robustness. On the MNIST dataset with $\epsilon=0.3$, it
achieved 94.34% against FGSM and 91% against PGD. On the CIFAR-10 dataset with
$\epsilon=8/255$, it achieved 62.97% against FGSM and 59.16% against PGD.
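The population-based defense and the periodic similarity examination described above can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch sketch, not the authors' released code: the names `PopulationModel`, `similarity_check`, and `fgsm_attack`, the cosine-similarity test, and the noise-based re-diversification step are assumptions introduced here for illustration, since the abstract does not specify the exact counter-linking update.

```python
# Illustrative sketch only: the cosine-similarity test and the noise-based
# re-diversification below are assumptions, not the paper's exact procedure.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class PopulationModel(nn.Module):
    """A population of n submodels of the same architecture; each query is
    answered by a submodel chosen uniformly at random."""

    def __init__(self, make_submodel, n: int = 5):
        super().__init__()
        self.submodels = nn.ModuleList([make_submodel() for _ in range(n)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return random.choice(self.submodels)(x)


def _flat_weights(model: nn.Module) -> torch.Tensor:
    # Flatten all weight tensors into one vector for comparison.
    return torch.cat([p.detach().flatten() for p in model.parameters()])


def similarity_check(population: PopulationModel,
                     threshold: float = 0.9,
                     noise_scale: float = 1e-2) -> None:
    """Periodic random similarity examination: pick a random pair of submodels
    and, if their flattened weights are too similar, nudge one with small
    Gaussian noise to restore diversity (assumed mechanism)."""
    a, b = random.sample(list(population.submodels), 2)
    sim = F.cosine_similarity(_flat_weights(a), _flat_weights(b), dim=0)
    if sim > threshold:
        with torch.no_grad():
            for p in b.parameters():
                p.add_(noise_scale * torch.randn_like(p))


def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.3) -> torch.Tensor:
    """One-step FGSM: perturb the input by epsilon along the sign of the
    gradient of the loss with respect to the input."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
```

During simultaneous training of the submodels, `similarity_check` would be invoked every few optimization steps; at query time, the random selection means the gradient an attacker computes (e.g., via `fgsm_attack` against one submodel) need not match the submodel that answers the next query.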
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach that encourages classification models to produce similar features for inputs within the same class despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning [89.98353600316285]
We introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that the model performance on the tailed classes varies over different training stages.
This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes.
Compared to other methods such as the FixMatch baseline, UDTS achieves accuracy gains of at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets.
arXiv Detail & Related papers (2024-01-09T08:59:39Z)
- Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific training and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
- Enhancing Targeted Attack Transferability via Diversified Weight Pruning [0.3222802562733786]
Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images.
With cross-model transferable adversarial examples, the vulnerability of neural networks remains even if the model information is kept secret from the attacker.
Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples.
arXiv Detail & Related papers (2022-08-18T07:25:48Z)
- Suppressing Poisoning Attacks on Federated Learning for Medical Imaging [4.433842217026879]
We propose a robust aggregation rule called Distance-based Outlier Suppression (DOS) that is resilient to byzantine failures.
The proposed method computes the distance between local parameter updates of different clients and obtains an outlier score for each client.
The resulting outlier scores are converted into normalized weights using a softmax function, and a weighted average of the local parameters is used for updating the global model.
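A minimal sketch of the aggregation just described, assuming flattened client updates in NumPy; the choice of Euclidean distance, the mean-distance outlier score, the sign convention in the softmax, and the name `dos_aggregate` are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of distance-based outlier suppression for aggregation;
# the distance measure and scoring below are assumptions.
import numpy as np


def dos_aggregate(updates: np.ndarray) -> np.ndarray:
    """Aggregate client updates of shape [num_clients, num_params]:
    pairwise distances -> per-client outlier score -> softmax weights
    (outliers down-weighted) -> weighted average."""
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=-1)
    scores = dists.mean(axis=1)                        # higher = more outlying
    weights = np.exp(-scores) / np.exp(-scores).sum()  # softmax of -scores
    return weights @ updates
```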
arXiv Detail & Related papers (2022-07-15T00:43:34Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy.
arXiv Detail & Related papers (2020-11-28T00:08:45Z)
- Learnable Boundary Guided Adversarial Training [66.57846365425598]
We use the logits from one clean model to guide the learning of another, robust model.
We achieve new state-of-the-art robustness on CIFAR-100 without additional real or synthetic data.
arXiv Detail & Related papers (2020-11-23T01:36:05Z)
- DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles [20.46399318111058]
Adversarial attacks can mislead CNN models with small perturbations, which can effectively transfer between different models trained on the same dataset.
We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features.
The novel diversity metric and training procedure enables DVERGE to achieve higher robustness against transfer attacks.
arXiv Detail & Related papers (2020-09-30T14:57:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.