Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial
Defense
- URL: http://arxiv.org/abs/2402.18787v1
- Date: Thu, 29 Feb 2024 01:27:38 GMT
- Title: Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial
Defense
- Authors: Qiao Han, Yong Huang, Xinling Guo, Yiteng Zhai, Yu Qin and Yao Yang
- Abstract summary: Recent studies have revealed the vulnerability of Deep Neural Networks (DNNs) to adversarial examples.
We propose a novel adversarial defense method called "Immunity" based on a modified Mixture-of-Experts (MoE) architecture.
- Score: 6.3712912872409415
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent studies have revealed the vulnerability of Deep Neural Networks (DNNs)
to adversarial examples, which can easily fool DNNs into making incorrect
predictions. To mitigate this deficiency, we propose a novel adversarial
defense method called "Immunity" (Innovative MoE with MUtual information \&
positioN stabilITY) based on a modified Mixture-of-Experts (MoE) architecture
in this work. The key enhancements to the standard MoE are two-fold: 1)
integrating Random Switch Gates (RSGs) to obtain diverse network structures
via random permutation of RSG parameters at evaluation time, even though the
RSGs are determined after one-time training; 2) devising innovative Mutual
Information (MI)-based and Position Stability-based loss functions by
capitalizing on Grad-CAM's explanatory power to increase the diversity and the
causality of expert networks. Notably, our MI-based loss operates directly on
the heatmaps and therefore, in theory, has a subtler negative impact on
classification performance than other losses of the same type.
Extensive evaluation validates the efficacy of the proposed approach in
improving adversarial robustness against a wide range of attacks.
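As a rough sketch of the first enhancement (not the authors' implementation), the snippet below shows a toy MoE layer whose gate weights are randomly permuted across experts at evaluation time, so repeated inferences see different effective structures even though the gate was fixed by a single training run. All module and parameter names are hypothetical, and the MI-based and Position Stability-based losses are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RandomSwitchGateMoE(nn.Module):
    """Toy mixture-of-experts whose gate outputs can be randomly permuted
    at evaluation time (loosely following the RSG idea from the abstract)."""

    def __init__(self, in_dim: int, num_classes: int, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, num_classes))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(in_dim, num_experts)  # trained once, then frozen

    def forward(self, x: torch.Tensor, permute_gates: bool = False) -> torch.Tensor:
        weights = F.softmax(self.gate(x), dim=-1)                 # (B, E)
        if permute_gates and not self.training:
            # Randomly permute which expert receives which gate weight,
            # yielding a different effective network structure per evaluation.
            perm = torch.randperm(weights.size(-1), device=weights.device)
            weights = weights[:, perm]
        outputs = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, C)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)

# Example: predictions vary across random gate permutations at test time.
model = RandomSwitchGateMoE(in_dim=32, num_classes=10).eval()
x = torch.randn(8, 32)
with torch.no_grad():
    logits = model(x, permute_gates=True)
```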
Related papers
- Do Spikes Protect Privacy? Investigating Black-Box Model Inversion Attacks in Spiking Neural Networks [0.0]
This work presents the first study of black-box Model Inversion (MI) attacks on Spiking Neural Networks (SNNs).
We adapt a generative adversarial MI framework to the spiking domain by incorporating rate-based encoding for input transformation and decoding mechanisms for output interpretation.
Our results show that SNNs exhibit significantly greater resistance to MI attacks than ANNs, as demonstrated by degraded reconstructions, increased instability in attack convergence, and overall reduced attack effectiveness across multiple evaluation metrics.
arXiv Detail & Related papers (2025-02-08T10:02:27Z)
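The rate-based encoding mentioned above is commonly realized as Poisson encoding, where pixel intensity sets the per-timestep firing probability; the sketch below illustrates that generic idea and is not taken from the paper.

```python
import torch

def poisson_rate_encode(images: torch.Tensor, num_steps: int = 32) -> torch.Tensor:
    """Convert intensities in [0, 1] into binary spike trains of shape
    (num_steps, *images.shape); higher intensity -> higher firing rate."""
    probs = images.clamp(0.0, 1.0).unsqueeze(0).expand(num_steps, *images.shape)
    return torch.bernoulli(probs)

# Example: a batch of 4 grayscale 28x28 images becomes (32, 4, 1, 28, 28) spikes;
# rate decoding on the output side would average spike counts over time.
spikes = poisson_rate_encode(torch.rand(4, 1, 28, 28), num_steps=32)
```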
- CALoR: Towards Comprehensive Model Inversion Defense [43.2642796582236]
Model Inversion Attacks (MIAs) aim at recovering privacy-sensitive training data from the knowledge encoded in released machine learning models.
Recent advances in the MIA field have significantly enhanced the attack performance under multiple scenarios.
We propose a robust defense mechanism, integrating Confidence Adaptation and Low-Rank compression.
arXiv Detail & Related papers (2024-10-08T08:44:01Z)
- Beyond Dropout: Robust Convolutional Neural Networks Based on Local Feature Masking [6.189613073024831]
This study introduces an innovative Local Feature Masking (LFM) strategy aimed at fortifying the performance of Convolutional Neural Networks (CNNs).
During the training phase, we strategically incorporate random feature masking in the shallow layers of CNNs.
LFM compels the network to adapt by leveraging remaining features to compensate for the absence of certain semantic features.
arXiv Detail & Related papers (2024-07-18T16:25:16Z)
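A minimal sketch of the masking idea described above: during training, a random spatial patch of an early feature map is zeroed out. The patch size, probability, and placement here are illustrative assumptions rather than LFM's exact configuration.

```python
import torch
import torch.nn as nn

class LocalFeatureMask(nn.Module):
    """Randomly zero a square spatial patch of a feature map during training only."""

    def __init__(self, patch: int = 4, p: float = 0.5):
        super().__init__()
        self.patch, self.p = patch, p

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        if not self.training or torch.rand(1).item() > self.p:
            return x
        b, _, h, w = x.shape
        mask = torch.ones_like(x)
        for i in range(b):
            top = torch.randint(0, max(h - self.patch, 1), (1,)).item()
            left = torch.randint(0, max(w - self.patch, 1), (1,)).item()
            mask[i, :, top:top + self.patch, left:left + self.patch] = 0.0
        return x * mask  # the network must rely on the remaining features
```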
- The Effectiveness of Random Forgetting for Robust Generalization [21.163070161951868]
We introduce a novel learning paradigm called "Forget to Mitigate Overfitting" (FOMO).
FOMO alternates between the forgetting phase, which randomly forgets a subset of weights, and the relearning phase, which emphasizes learning generalizable features.
Our experiments show that FOMO alleviates robust overfitting by significantly reducing the gap between the best and last robust test accuracy.
arXiv Detail & Related papers (2024-02-18T23:14:40Z)
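The forgetting phase described above can be illustrated by re-initializing a random fraction of each layer's weights before relearning resumes; the fraction and the re-initialization scheme below are assumptions, not FOMO's exact recipe.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def forget_random_weights(model: nn.Module, fraction: float = 0.2) -> None:
    """Randomly re-initialize a fraction of the weights in every linear/conv layer."""
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight
            mask = torch.rand_like(w) < fraction         # weights to "forget"
            fresh = torch.empty_like(w)
            nn.init.kaiming_uniform_(fresh, a=5 ** 0.5)  # PyTorch's default init
            w[mask] = fresh[mask]

# Training would alternate: forget_random_weights(model), then a relearning phase.
```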
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- Enhancing Adversarial Robustness via Score-Based Optimization [22.87882885963586]
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations.
We introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test time.
Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness and inference speed.
arXiv Detail & Related papers (2023-07-10T03:59:42Z)
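A heavily simplified sketch of test-time, score-based purification in the spirit of the summary above: a pretrained score model (assumed to exist, called score_model here) approximates the gradient of the log-density, and the input is nudged along it before classification. This shows only the generic idea, not ScoreOpt's exact objective.

```python
import torch

def purify(x: torch.Tensor, score_model, steps: int = 10, step_size: float = 0.01) -> torch.Tensor:
    """Move a (possibly adversarial) input toward high-density regions using a
    pretrained score model s(x) ~ grad_x log p(x). score_model is an assumption."""
    x = x.clone()
    for _ in range(steps):
        with torch.no_grad():
            x = x + step_size * score_model(x)  # gradient ascent on log-density
    return x.clamp(0.0, 1.0)

# Hypothetical usage: logits = classifier(purify(x_adv, score_model))
```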
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
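One generic way to instantiate metric-learning-based adversarial regularization is a triplet loss that pulls an adversarial embedding toward its clean counterpart and pushes it away from another class; the sketch below shows that common formulation, not necessarily the paper's optimal-transport objective.

```python
import torch
import torch.nn.functional as F

def adversarial_triplet_loss(emb_clean: torch.Tensor,
                             emb_adv: torch.Tensor,
                             emb_neg: torch.Tensor,
                             margin: float = 1.0) -> torch.Tensor:
    """Anchor = adversarial embedding, positive = clean embedding of the same
    input, negative = embedding of an input from a different class."""
    d_pos = F.pairwise_distance(emb_adv, emb_clean)
    d_neg = F.pairwise_distance(emb_adv, emb_neg)
    return F.relu(d_pos - d_neg + margin).mean()
```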
- Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization [58.641705224371876]
The adversarial vulnerability of deep neural networks (DNNs) has been actively investigated in the past several years.
This paper investigates the scale-variant property of cross-entropy loss, which is the most commonly used loss function in classification tasks.
We show that the proposed effective margin regularization (EMR) learns large effective margins and boosts the adversarial robustness in both standard and adversarial training.
arXiv Detail & Related papers (2022-10-11T03:16:56Z)
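For a linear output layer, an effective margin can be defined as the logit gap to the strongest competing class divided by the norm of the corresponding weight difference, which is invariant to rescaling the weights (unlike cross-entropy); the sketch below computes such a quantity, though EMR's precise formulation may differ.

```python
import torch

def effective_margin(logits: torch.Tensor, weight: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """logits: (B, C) from a final linear layer with weight: (C, D); labels: (B,).
    Returns the weight-normalized margin to the strongest competing class
    (bias terms ignored for simplicity)."""
    b = logits.size(0)
    idx = torch.arange(b)
    true_logit = logits[idx, labels]
    masked = logits.clone()
    masked[idx, labels] = float("-inf")
    runner_up = masked.argmax(dim=1)
    gap = true_logit - logits[idx, runner_up]
    w_diff = (weight[labels] - weight[runner_up]).norm(dim=1).clamp_min(1e-12)
    return gap / w_diff  # scale-invariant; a regularizer can maximize its mean
```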
- Mixture GAN For Modulation Classification Resiliency Against Adversarial Attacks [55.92475932732775]
We propose a novel generative adversarial network (GAN)-based countermeasure approach.
The GAN-based countermeasure aims to eliminate adversarial examples before they are fed to the DNN-based classifier.
Simulation results show the effectiveness of the proposed defense GAN, which raises the accuracy of the DNN-based automatic modulation classification (AMC) under adversarial attacks to approximately 81%.
arXiv Detail & Related papers (2022-05-29T22:30:32Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept of adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning is the fragility of deep neural networks to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Combating the Instability of Mutual Information-based Losses via Regularization [7.424262881242935]
We first identify the symptoms behind the instability of MI-based losses.
We mitigate both issues by adding a novel regularization term to the existing losses.
We present a novel benchmark that evaluates MI-based losses on both the MI estimation power and its capability on the downstream tasks.
arXiv Detail & Related papers (2020-11-16T13:29:15Z)