Parameter Interpolation Adversarial Training for Robust Image Classification
- URL: http://arxiv.org/abs/2511.00836v1
- Date: Sun, 02 Nov 2025 07:37:06 GMT
- Title: Parameter Interpolation Adversarial Training for Robust Image Classification
- Authors: Xin Liu, Yichen Yang, Kun He, John E. Hopcroft
- Abstract summary: We propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). PIAT tunes the model parameters between each epoch by interpolating the parameters of the previous and current epochs. This makes the model's decision boundary change more moderately and alleviates the overfitting issue.
- Score: 14.913267679379308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though deep neural networks exhibit superior performance on various tasks, they are still plagued by adversarial examples. Adversarial training has been demonstrated to be the most effective method to defend against adversarial attacks. However, existing adversarial training methods show that the model robustness has apparent oscillations and overfitting issues in the training process, degrading the defense efficacy. To address these issues, we propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). PIAT tunes the model parameters between each epoch by interpolating the parameters of the previous and current epochs. It makes the decision boundary of model change more moderate and alleviates the overfitting issue, helping the model converge better and achieving higher model robustness. In addition, we suggest using the Normalized Mean Square Error (NMSE) to further improve the robustness by aligning the relative magnitude of logits between clean and adversarial examples rather than the absolute magnitude. Extensive experiments conducted on several benchmark datasets demonstrate that our framework could prominently improve the robustness of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
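The abstract names two concrete ingredients: interpolating model parameters between consecutive epochs, and a Normalized Mean Square Error (NMSE) loss that aligns the relative (rather than absolute) magnitude of clean and adversarial logits. A minimal sketch of both, assuming a plain dict of floats as a stand-in for network weights and a hypothetical interpolation factor `alpha` (the paper's actual schedule and weighting may differ):

```python
def interpolate_params(prev_epoch, curr_epoch, alpha=0.5):
    """After each epoch, blend previous- and current-epoch parameters.

    A smaller effective step in parameter space makes the decision
    boundary evolve more smoothly between epochs.
    """
    return {k: alpha * prev_epoch[k] + (1.0 - alpha) * curr_epoch[k]
            for k in curr_epoch}

def nmse(clean_logits, adv_logits, eps=1e-12):
    """Normalized MSE between logit vectors.

    Each vector is L2-normalized first, so the loss compares the
    relative magnitude (direction) of the logits rather than their
    absolute scale.
    """
    def normalize(v):
        norm = sum(x * x for x in v) ** 0.5
        return [x / (norm + eps) for x in v]
    c = normalize(clean_logits)
    a = normalize(adv_logits)
    return sum((x - y) ** 2 for x, y in zip(c, a)) / len(c)
```

For example, logits `[1.0, 0.0]` and `[2.0, 0.0]` yield an NMSE near zero (same relative pattern, different scale), whereas a plain MSE would penalize them.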
Related papers
- A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification [9.945272787814941]
We present a deep ensemble model that combines discriminative features with generative models to achieve both high accuracy and adversarial robustness. Our approach integrates a bottom-level pre-trained discriminative network for feature extraction with a top-level generative classification network that models adversarial input distributions.
arXiv Detail & Related papers (2024-12-28T05:06:20Z) - Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network). After training this network on a small base model using demonstrations, it can be seamlessly integrated with other pre-trained models during inference. We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z) - Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to small adversarial perturbations. We show that robust feature learning during training can significantly enhance robustness. We propose MOREL, a multi-objective approach that aligns natural and adversarial features.
arXiv Detail & Related papers (2024-10-02T16:05:03Z) - Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates [13.247291916609118]
We show that a newly updated model may commit mistakes the previous model did not make. These negative flips are experienced by users as a regression of performance. In particular, when updating a model to improve its adversarial robustness, previously ineffective adversarial attacks on some inputs may become successful. We propose a novel technique, named robustness-congruent adversarial training, to address this issue.
arXiv Detail & Related papers (2024-02-27T10:37:13Z) - Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, relying on direct iterative updates for target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z) - Robust Feature Inference: A Test-time Defense Strategy using Spectral Projections [12.807619042576018]
We propose a novel test-time defense strategy called Robust Feature Inference (RFI)
RFI is easy to integrate with any existing (robust) training procedure without additional test-time computation.
We show that RFI improves robustness across adaptive and transfer attacks consistently.
arXiv Detail & Related papers (2023-07-21T16:18:58Z) - Improving Transferability of Adversarial Examples via Bayesian Attacks [68.90574788107442]
The transferability of adversarial examples allows attacks on unknown deep neural networks (DNNs). In this paper, we improve the transferability of adversarial examples by incorporating the Bayesian formulation into both the model parameters and the model input. Experiments demonstrate that our method achieves a new state of the art in transfer-based attacks.
arXiv Detail & Related papers (2023-07-21T03:43:07Z) - PIAT: Parameter Interpolation based Adversarial Training for Image Classification [19.276850361815953]
We propose a novel framework, termed Parameter Interpolation based Adversarial Training (PIAT), that makes full use of historical information during training.
Our framework is general and could further boost the robust accuracy when combined with other adversarial training methods.
arXiv Detail & Related papers (2023-03-24T12:22:34Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective defense strategy, injecting adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z) - A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.