Improving Adversarial Transferability via Model Alignment
- URL: http://arxiv.org/abs/2311.18495v2
- Date: Wed, 17 Jul 2024 11:45:09 GMT
- Title: Improving Adversarial Transferability via Model Alignment
- Authors: Avery Ma, Amir-massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
- Abstract summary: We introduce a novel model alignment technique aimed at improving a given source model's ability to generate transferable adversarial perturbations.
Experiments on the ImageNet dataset, using a variety of model architectures, demonstrate that perturbations generated from aligned source models exhibit significantly higher transferability.
- Score: 25.43899674478279
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are susceptible to adversarial perturbations that are transferable across different models. In this paper, we introduce a novel model alignment technique aimed at improving a given source model's ability to generate transferable adversarial perturbations. During the alignment process, the parameters of the source model are fine-tuned to minimize an alignment loss. This loss measures the divergence in the predictions between the source model and another, independently trained model, referred to as the witness model. To understand the effect of model alignment, we conduct a geometric analysis of the resulting changes in the loss landscape. Extensive experiments on the ImageNet dataset, using a variety of model architectures, demonstrate that perturbations generated from aligned source models exhibit significantly higher transferability than those from the original source model.
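A minimal sketch of the alignment step described above, assuming a KL divergence between softened predictions as the alignment loss; the exact divergence, optimizer, and hyperparameters below are illustrative assumptions, not the paper's precise recipe:

```python
import torch
import torch.nn.functional as F

def alignment_loss(source_logits, witness_logits, temperature=1.0):
    # Divergence between source and witness predictions; KL over
    # softened class probabilities is one plausible instantiation
    # (an assumption, not necessarily the paper's exact loss).
    log_p_src = F.log_softmax(source_logits / temperature, dim=-1)
    p_wit = F.softmax(witness_logits / temperature, dim=-1)
    return F.kl_div(log_p_src, p_wit, reduction="batchmean")

def align(source, witness, loader, epochs=1, lr=1e-4):
    # Fine-tune the source model to match the frozen witness model.
    witness.eval()
    source.train()
    opt = torch.optim.SGD(source.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:  # labels are not needed for alignment
            with torch.no_grad():
                w_logits = witness(x)
            loss = alignment_loss(source(x), w_logits)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return source
```

Adversarial examples would then be crafted on the aligned source model with any standard attack (e.g. PGD) and transferred to unseen target models.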
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
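A rough sketch of the low-rank expert idea, assuming each expert comes from a truncated SVD of the fine-tuned-minus-pre-trained weight delta; the rank, gating, and sparsity details here are assumptions, not SMILE's actual construction:

```python
import torch

def extract_expert(w_pre, w_ft, rank=8):
    # Truncated SVD of the task-specific weight delta yields a
    # compact low-rank "expert" without extra data or training.
    U, S, Vh = torch.linalg.svd(w_ft - w_pre, full_matrices=False)
    return U[:, :rank], S[:rank], Vh[:rank, :]

def apply_with_expert(x, w_pre, expert, gate=1.0):
    # Shared pre-trained path plus a gated low-rank correction:
    # y = x W_pre^T + gate * x V_r diag(S_r) U_r^T
    U, S, Vh = expert
    return x @ w_pre.T + gate * ((x @ Vh.T) * S) @ U.T
```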
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
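As a toy illustration of such a self-consuming loop with kernel density estimation, one can refit a KDE on a blend of real and self-generated samples each generation and watch the learned distribution drift; the 50/50 mixing ratio and generation count are arbitrary choices here:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=2000)  # ground-truth samples

train = real
for gen in range(5):
    kde = gaussian_kde(train)          # the "generative model"
    synthetic = kde.resample(2000)[0]  # self-generated data
    # Mixed-data training: the next generation sees half real,
    # half synthetic samples (the split is an assumption).
    train = np.concatenate([real[:1000], synthetic[:1000]])
    print(f"generation {gen}: mean={train.mean():.3f} std={train.std():.3f}")
```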
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Interpretable Water Level Forecaster with Spatiotemporal Causal Attention Mechanisms [0.0]
This work proposes a neural spatiotemporal model with a transformer that exploits causal relationships based on prior knowledge.
We use the Han River dataset from 2016 to 2021, and confirm that our model is interpretable and consistent with prior knowledge.
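A sketch of how prior causal knowledge might restrict attention, masking out station pairs without a known causal edge; the graph encoding and interface are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn.functional as F

def causal_graph_attention(q, k, v, adjacency):
    # Scaled dot-product attention where node i may attend to node
    # j only if adjacency[i, j] marks a prior causal edge j -> i
    # (e.g. an upstream gauge influencing a downstream one).
    # Assumes every row has at least one permitted edge, such as a
    # self-loop, so the softmax is well defined.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~adjacency.bool(), float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```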
arXiv Detail & Related papers (2023-02-28T04:37:26Z) - Mechanistic Mode Connectivity [11.772935238948662]
We study neural network loss landscapes through the lens of mode connectivity.
We ask the question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss?
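The simplest probe of this question is the loss along the straight line between two minimizers; a minimal sketch, noting that the paper considers more general low-loss paths and that the loader/loss interfaces here are assumptions:

```python
import copy
import torch

def loss_along_line(model_a, model_b, loss_fn, loader, steps=11):
    # Evaluate the loss at evenly spaced points on the linear path
    # between two trained networks' parameters; a low-loss segment
    # suggests the two modes are (linearly) connected.
    probe = copy.deepcopy(model_a)
    sa, sb = model_a.state_dict(), model_b.state_dict()
    losses = []
    for t in torch.linspace(0.0, 1.0, steps).tolist():
        interp = {
            k: torch.lerp(sa[k], sb[k], t)
            if sa[k].is_floating_point() else sa[k]
            for k in sa
        }
        probe.load_state_dict(interp)
        probe.eval()
        total, n = 0.0, 0
        with torch.no_grad():
            for x, y in loader:
                total += loss_fn(probe(x), y).item() * len(x)
                n += len(x)
        losses.append(total / n)
    return losses
```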
arXiv Detail & Related papers (2022-11-15T18:58:28Z) - Variational Model Perturbation for Source-Free Domain Adaptation [64.98560348412518]
We introduce perturbations into the model parameters by variational Bayesian inference in a probabilistic framework.
We demonstrate the theoretical connection to learning Bayesian neural networks, which proves the generalizability of the perturbed model to target domains.
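A minimal sketch of parameter perturbation in this spirit: draw perturbed copies of the trained weights from a Gaussian whose per-parameter scale would be learned variationally on unlabeled target data; the scales here are placeholders for that learned posterior, which this sketch does not fit:

```python
import torch

def sample_perturbed_state(model, log_sigma, generator=None):
    # One Monte Carlo draw theta' = theta + sigma * eps, eps ~ N(0, I).
    # `log_sigma` maps parameter names to learned log-scales; here
    # it is treated as given (in the paper it is obtained by
    # variational Bayesian inference).
    state = {}
    for name, param in model.state_dict().items():
        if param.is_floating_point():
            eps = torch.randn(param.shape, generator=generator)
            state[name] = param + log_sigma[name].exp() * eps
        else:
            state[name] = param
    return state
```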
arXiv Detail & Related papers (2022-10-19T08:41:19Z) - When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL).
The derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - De-Biasing Generative Models using Counterfactual Methods [0.0]
We propose a new decoder-based framework named the Causal Counterfactual Generative Model (CCGM).
Our proposed method combines a causal latent space VAE model with specific modifications to emphasize causal fidelity.
We explore how better disentanglement of causal learning and encoding/decoding generates higher causal intervention quality.
arXiv Detail & Related papers (2022-07-04T16:53:20Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
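One way to read the two-stage pipeline in code: freeze the penalty-trained stage-one model and train a second generator that adds back the correlated detail the independence penalty removed; the residual architecture below is an illustrative assumption:

```python
import torch
import torch.nn as nn

class StageTwoRefiner(nn.Module):
    # Stage two: a generator that conditions on the frozen stage-one
    # reconstruction and models the residual, correlated structure
    # that the disentanglement penalty discarded.
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, coarse):
        return coarse + self.net(coarse)  # residual refinement

def two_stage_reconstruct(stage_one, refiner, x):
    with torch.no_grad():          # stage one stays frozen
        coarse = stage_one(x)      # disentangled but lower quality
    return refiner(coarse)         # stage two improves fidelity
```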
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Bidirectional Model-based Policy Optimization [30.732572976324516]
Model-based reinforcement learning approaches leverage a forward dynamics model to support planning and decision making.
In this paper, we propose to additionally construct a backward dynamics model to reduce the reliance on accuracy in forward model predictions.
We develop a novel method, called Bidirectional Model-based Policy Optimization (BMPO), to utilize both the forward model and the backward model to generate short branched rollouts for policy optimization.
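A compact sketch of a bidirectional branched rollout, assuming single-step forward and backward dynamics models and a backward policy; all interfaces here are assumptions for illustration:

```python
def branched_rollout(f_model, b_model, policy, b_policy, state, horizon=3):
    # Roll the forward model ahead of and the backward model behind
    # the same start state; each direction only needs horizon-length
    # predictions, limiting how far either model's errors compound.
    forward, backward = [], []
    s = state
    for _ in range(horizon):
        a = policy(s)        # a_t ~ pi(s_t)
        s = f_model(s, a)    # s_{t+1} ~ f(s_t, a_t)
        forward.append((s, a))
    s = state
    for _ in range(horizon):
        a = b_policy(s)      # propose the action that led here
        s = b_model(s, a)    # s_{t-1} ~ b(s_t, a_{t-1})
        backward.append((s, a))
    return list(reversed(backward)) + forward
```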
arXiv Detail & Related papers (2020-07-04T03:34:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.