Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective
- URL: http://arxiv.org/abs/2407.12443v1
- Date: Wed, 17 Jul 2024 09:53:20 GMT
- Title: Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective
- Authors: Zhaoxin Wang, Handing Wang, Cong Tian, Yaochu Jin
- Abstract summary: Adversarial training (AT) has become an effective defense method against adversarial examples (AEs).
Fast AT (FAT) employs a single-step attack strategy to guide the training process.
FAT methods suffer from the catastrophic overfitting problem.
- Score: 20.99874786089634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) has become an effective defense method against adversarial examples (AEs) and it is typically framed as a bi-level optimization problem. Among various AT methods, fast AT (FAT), which employs a single-step attack strategy to guide the training process, can achieve good robustness against adversarial attacks at a low cost. However, FAT methods suffer from the catastrophic overfitting problem, especially on complex tasks or with large-parameter models. In this work, we propose a FAT method termed FGSM-PCO, which mitigates catastrophic overfitting by averting the collapse of the inner optimization problem in the bi-level optimization process. FGSM-PCO generates current-stage AEs from the historical AEs and incorporates them into the training process using an adaptive mechanism. This mechanism determines an appropriate fusion ratio according to the performance of the AEs on the training model. Coupled with a loss function tailored to the training framework, FGSM-PCO can alleviate catastrophic overfitting and help the recovery of an overfitted model to effective training. We evaluate our algorithm across three models and three datasets to validate its effectiveness. Comparative empirical studies against other FAT algorithms demonstrate that our proposed method effectively addresses unresolved overfitting issues in existing algorithms.
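The core mechanism the abstract describes, fusing historical adversarial examples with freshly generated single-step AEs under an adaptive ratio, can be illustrated with a minimal NumPy sketch. All function names are illustrative, and the adaptive computation of the fusion ratio (which the paper derives from the AEs' performance on the training model) is omitted here and passed in directly; this is not the authors' implementation.

```python
import numpy as np

def fgsm_step(x_clean, grad_sign, epsilon):
    # Single-step FGSM: perturb the input along the sign of the loss
    # gradient, then clip back into the epsilon-ball around the clean input.
    return np.clip(x_clean + epsilon * grad_sign,
                   x_clean - epsilon, x_clean + epsilon)

def fuse_adversarial_examples(x_clean, x_hist, grad_sign, epsilon, ratio):
    """Blend a historical AE with a fresh single-step AE.

    `ratio` stands in for the paper's adaptive fusion coefficient; in the
    actual method it is chosen based on how the AEs perform on the current
    model, a step omitted in this sketch.
    """
    x_new = fgsm_step(x_clean, grad_sign, epsilon)
    x_fused = ratio * x_hist + (1.0 - ratio) * x_new
    # Project the fused example back into the valid perturbation ball.
    return np.clip(x_fused, x_clean - epsilon, x_clean + epsilon)
```

The final projection guarantees the fused example stays a valid AE within the epsilon-ball, regardless of how the two components are mixed.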
Related papers
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Reducing Adversarial Training Cost with Gradient Approximation [0.3916094706589679]
We propose a new and efficient adversarial training method, adversarial training with gradient approximation (GAAT) to reduce the cost of building up robust models.
Our proposed method saves up to 60% of the training time with comparable model test accuracy on datasets.
arXiv Detail & Related papers (2023-09-18T03:55:41Z) - Improving Fast Adversarial Training with Prior-Guided Knowledge [80.52575209189365]
We investigate the relationship between adversarial example quality and catastrophic overfitting by comparing the training processes of standard adversarial training and Fast adversarial training.
We find that catastrophic overfitting occurs when the attack success rate of adversarial examples degrades.
arXiv Detail & Related papers (2023-04-01T02:18:12Z) - Prior-Guided Adversarial Initialization for Fast Adversarial Training [84.56377396106447]
We investigate the difference between the training processes of adversarial examples (AEs) in fast adversarial training (FAT) and standard adversarial training (SAT).
We observe that the attack success rate of adversarial examples (AEs) of FAT gets worse gradually in the late training stage, resulting in overfitting.
Based on the observation, we propose a prior-guided FGSM initialization method to avoid overfitting.
The proposed method can prevent catastrophic overfitting and outperform state-of-the-art FAT methods.
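The prior-guided initialization described above can be illustrated with a minimal sketch (the function name and NumPy formulation are illustrative, not the authors' implementation): instead of starting FGSM from random noise, reuse the perturbation that produced the previous AE as the initialization.

```python
import numpy as np

def prior_guided_fgsm(x_clean, prior_delta, grad_sign, epsilon, alpha):
    # Initialize from the prior perturbation (e.g. the one carried over
    # from the previous epoch) rather than uniform random noise, then
    # take one FGSM step of size alpha and clip to the epsilon-ball.
    delta = np.clip(prior_delta + alpha * grad_sign, -epsilon, epsilon)
    return x_clean + delta
```

The intuition is that the prior perturbation encodes attack directions that already worked, so a single step from it yields stronger AEs than a single step from a random start, which helps delay the degradation in attack success rate that precedes catastrophic overfitting.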
arXiv Detail & Related papers (2022-07-18T18:13:10Z) - Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization [60.72410937614299]
We propose a new tractable bi-level optimization problem, and design and analyze a new set of algorithms termed Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending against sign-based projected gradient descent (PGD) attacks without calling any gradient sign method or explicit robust regularization.
arXiv Detail & Related papers (2021-12-23T06:25:36Z) - Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.