Flatness-aware Curriculum Learning via Adversarial Difficulty
- URL: http://arxiv.org/abs/2508.18726v1
- Date: Tue, 26 Aug 2025 06:51:07 GMT
- Title: Flatness-aware Curriculum Learning via Adversarial Difficulty
- Authors: Hiroaki Aizawa, Yoshikazu Hayashi,
- Abstract summary: We propose Adversarial Difficulty Measure (ADM) to measure adversarial vulnerability.<n>ADM measures the normalized loss gap between original and adversarial examples.<n>We evaluate our approach on image classification tasks, fine-grained recognition, and domain generalization.
- Score: 0.42970700836450487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks trained by empirical risk minimization often suffer from overfitting, especially to specific samples or domains, which leads to poor generalization. Curriculum Learning (CL) addresses this issue by selecting training samples based on the difficulty. From the optimization perspective, methods such as Sharpness-Aware Minimization (SAM) improve robustness and generalization by seeking flat minima. However, combining CL with SAM is not straightforward. In flat regions, both the loss values and the gradient norms tend to become uniformly small, which makes it difficult to evaluate sample difficulty and design an effective curriculum. To overcome this problem, we propose the Adversarial Difficulty Measure (ADM), which quantifies adversarial vulnerability by leveraging the robustness properties of models trained toward flat minima. Unlike loss- or gradient-based measures, which become ineffective as training progresses into flatter regions, ADM remains informative by measuring the normalized loss gap between original and adversarial examples. We incorporate ADM into CL-based training with SAM to dynamically assess sample difficulty. We evaluated our approach on image classification tasks, fine-grained recognition, and domain generalization. The results demonstrate that our method preserves the strengths of both CL and SAM while outperforming existing curriculum-based and flatness-aware training strategies.
Related papers
- Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks [16.064233621959538]
We propose a query-efficient and computation-efficient MIA that directly textbfRe-levertextbfAges the original membershitextbfP scores to mtextbfItigate the errors in textbfDifficulty calibration.
arXiv Detail & Related papers (2024-08-31T11:59:42Z) - Making Robust Generalizers Less Rigid with Loss Concentration [6.527016551650139]
Methods applying sharpness-aware minimization have proven successful for image classification tasks.<n>We show how such a strategy can dramatically break down under simpler models where the difficulty gap becomes more extreme.<n>We propose and evaluate a training criterion which penalizes poor loss concentration.
arXiv Detail & Related papers (2024-08-07T08:23:42Z) - Towards Robust Federated Learning via Logits Calibration on Non-IID Data [49.286558007937856]
Federated learning (FL) is a privacy-preserving distributed management framework based on collaborative model training of distributed devices in edge networks.
Recent studies have shown that FL is vulnerable to adversarial examples, leading to a significant drop in its performance.
In this work, we adopt the adversarial training (AT) framework to improve the robustness of FL models against adversarial example (AE) attacks.
arXiv Detail & Related papers (2024-03-05T09:18:29Z) - Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy [12.050160495730381]
sharpness-aware generalization (SAM) has attracted much attention because of its surprising effectiveness in improving performance.
We propose a simple renormalization strategy, dubbed Stable SAM (SSAM), so that the gradient norm of the descent step maintains the same as that of the ascent step.
Our strategy is easy to implement and flexible enough to integrate with SAM and its variants, almost at no computational cost.
arXiv Detail & Related papers (2024-01-14T10:53:36Z) - How the level sampling process impacts zero-shot generalisation in deep
reinforcement learning [12.79149059358717]
Key limitation preventing wider adoption of autonomous agents trained via deep reinforcement learning is their limited ability to generalise to new environments.
We investigate how a non-uniform sampling strategy of individual environment instances affects the zero-shot generalisation ability of RL agents.
arXiv Detail & Related papers (2023-10-05T12:08:12Z) - Beyond Empirical Risk Minimization: Local Structure Preserving
Regularization for Improving Adversarial Robustness [28.853413482357634]
Local Structure Preserving (LSP) regularization aims to preserve the local structure of the input space in the learned embedding space.
In this work, we propose a novel Local Structure Preserving (LSP) regularization, which aims to preserve the local structure of the input space in the learned embedding space.
arXiv Detail & Related papers (2023-03-29T17:18:58Z) - Mixed-order self-paced curriculum learning for universal lesion
detection [36.198165949330566]
Self-paced curriculum learning (SCL) has demonstrated its great potential in computer vision, natural language processing, etc.
It implements easy-to-hard sampling based on online estimation of data difficulty.
Most SCL methods adopt a loss-based strategy of estimating data difficulty and deweighting the hard' samples in the early training stage.
arXiv Detail & Related papers (2023-02-09T14:52:44Z) - Sharpness-Aware Training for Free [163.1248341911413]
SharpnessAware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error.
Sharpness-Aware Training Free (SAF) mitigates the sharp landscape at almost zero computational cost over the base.
SAF ensures the convergence to a flat minimum with improved capabilities.
arXiv Detail & Related papers (2022-05-27T16:32:43Z) - Unbiased Risk Estimators Can Mislead: A Case Study of Learning with
Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss(SCL) framework that trades zero bias with reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z) - On the Loss Landscape of Adversarial Training: Identifying Challenges
and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z) - Adaptive Adversarial Logits Pairing [65.51670200266913]
An adversarial training solution Adversarial Logits Pairing (ALP) tends to rely on fewer high-contribution features compared with vulnerable ones.
Motivated by these observations, we design an Adaptive Adversarial Logits Pairing (AALP) solution by modifying the training process and training target of ALP.
AALP consists of an adaptive feature optimization module with Guided Dropout to systematically pursue fewer high-contribution features.
arXiv Detail & Related papers (2020-05-25T03:12:20Z) - CurricularFace: Adaptive Curriculum Learning Loss for Deep Face
Recognition [79.92240030758575]
We propose a novel Adaptive Curriculum Learning loss (CurricularFace) that embeds the idea of curriculum learning into the loss function.
Our CurricularFace adaptively adjusts the relative importance of easy and hard samples during different training stages.
arXiv Detail & Related papers (2020-04-01T08:43:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.