Title: δ-SAM: Sharpness-Aware Minimization with Dynamic Reweighting
Authors: Wenxuan Zhou, Muhao Chen
Abstract summary: Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss on top of adversarially chosen perturbations.
The recently proposed sharpness-aware minimization (SAM) algorithm adopts adversarial weight perturbation, encouraging the model to converge to a flat minimum.
We propose that dynamically reweighted perturbation within each batch, where unguarded instances are up-weighted, can serve as a better approximation to per-instance perturbation.
Abstract: Deep neural networks are often overparameterized and may not easily achieve
model generalization. Adversarial training has shown effectiveness in improving
generalization by regularizing the change of loss on top of adversarially
chosen perturbations. The recently proposed sharpness-aware minimization (SAM)
algorithm adopts adversarial weight perturbation, encouraging the model to
converge to a flat minimum. Unfortunately, due to increased computational
cost, adversarial weight perturbation can only be efficiently approximated
per-batch instead of per-instance, leading to degraded performance. In this
paper, we propose that dynamically reweighted perturbation within each batch,
where unguarded instances are up-weighted, can serve as a better approximation
to per-instance perturbation. We propose sharpness-aware minimization with
dynamic reweighting (δ-SAM), which realizes the idea with efficient
guardedness estimation. Experiments on the GLUE benchmark demonstrate the
effectiveness of δ-SAM.
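As background, the SAM update perturbs the weights along the normalized gradient of the batch loss and then descends from the perturbed point; δ-SAM computes that perturbation from a reweighted batch loss. Below is a minimal PyTorch-style sketch under that reading. The `reweight` hook is a hypothetical placeholder standing in for the paper's guardedness-based weights (uniform weights recover plain SAM), and `loss_fn` is assumed to return per-instance losses (reduction='none').

```python
# Minimal sketch of a SAM-style update with an optional per-instance
# reweighting of the batch loss before the perturbation is computed.
# The `reweight` hook is a hypothetical stand-in for delta-SAM's
# guardedness-based weights; plain SAM corresponds to uniform weights.
import torch

def sam_step(model, loss_fn, x, y, optimizer, rho=0.05, reweight=None):
    # 1) Per-instance losses on the clean weights (loss_fn uses reduction='none').
    losses = loss_fn(model(x), y)                      # shape: [batch]
    weights = reweight(losses) if reweight is not None else torch.ones_like(losses)
    (weights.detach() * losses).mean().backward()      # gradient of the (re)weighted batch loss

    # 2) Ascend to the adversarial perturbation epsilon = rho * g / ||g||.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)                                  # w <- w + epsilon
            eps.append(e)
    optimizer.zero_grad()

    # 3) Gradient at the perturbed point, then restore the weights and update.
    loss_fn(model(x), y).mean().backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)                              # w <- w - epsilon
    optimizer.step()
    optimizer.zero_grad()
```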
Related papers
A Universal Class of Sharpness-Aware Minimization Algorithms [57.29207151446387] We introduce a new class of sharpness measures, leading to new sharpness-aware objective functions.
We prove that these measures are universally expressive, allowing any function of the training loss Hessian matrix to be represented with appropriate hyperparameters. arXiv (2024-06-06)
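For illustration only, two standard functions of the training loss Hessian that are commonly used as sharpness measures; the paper's parameterized class is broader and is not reproduced here.

```latex
% Illustrative Hessian-based sharpness measures at a minimum w^*
% (examples only; not the paper's exact parameterization).
\[
S_{\mathrm{trace}}(w^*) = \operatorname{tr}\!\bigl(\nabla^2 L(w^*)\bigr),
\qquad
S_{\max}(w^*) = \lambda_{\max}\!\bigl(\nabla^2 L(w^*)\bigr).
\]
```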
Systematic Investigation of Sparse Perturbed Sharpness-Aware
Minimization Optimizer [158.2634766682187] Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes.
Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the change of loss when a perturbation is added to the weights.
In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves sparse perturbation via a binary mask. arXiv (2023-06-30)
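The binary-mask idea can be sketched as follows. This is a rough illustration in which the mask keeps the largest-magnitude gradient coordinates; that particular choice of mask is an assumption here, not SSAM's exact mask-learning strategy.

```python
# Illustrative sketch of a sparsified SAM perturbation: the adversarial step
# is applied only where a binary mask is 1. The top-k-by-magnitude mask is
# one plausible choice, not necessarily the strategy used by SSAM.
import torch

def sparse_perturbation(grad, rho=0.05, sparsity=0.5):
    flat = grad.flatten()
    k = max(1, int((1.0 - sparsity) * flat.numel()))   # number of coordinates kept
    mask = torch.zeros_like(flat)
    mask[flat.abs().topk(k).indices] = 1.0              # keep largest-magnitude coordinates
    eps = rho * flat / (flat.norm() + 1e-12)            # dense SAM-style perturbation
    return (eps * mask).view_as(grad)                   # sparse epsilon
```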
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation
Approach [132.37966970098645] One of the popular solutions is Sharpness-Aware Minimization (SAM), which minimizes the change of loss when a perturbation is added to the weights.
In this paper, we propose an efficient and effective training scheme coined as Sparse SAM (SSAM), which achieves sparse perturbation via a binary mask, reducing the double overhead incurred by the common (dense) perturbation.
In addition, we theoretically prove that SSAM can converge at the same rate as SAM, i.e., $O(\log T/\sqrt{T})$. arXiv (2022-10-11)
Sharpness-Aware Training for Free [163.1248341911413] Sharpness-Aware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error.
Sharpness-Aware Training for Free (SAF) mitigates sharp landscapes at almost zero additional computational cost over the base optimizer.
SAF ensures convergence to a flat minimum with improved generalization capabilities. arXiv (2022-05-27)
Efficient Sharpness-aware Minimization for Improved Training of Neural
Networks [146.2011175973769] This paper proposes the Efficient Sharpness Aware Minimizer (ESAM), which boosts SAM's efficiency at no cost to its generalization performance.
ESAM includes two novel and efficient training strategies: Stochastic Weight Perturbation and Sharpness-Sensitive Data Selection.
We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM improves efficiency over SAM, reducing the extra computation required from 100% to 40% vis-a-vis base optimizers. arXiv (2021-10-07)
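A rough sketch of the data-selection idea: after perturbing the weights, keep only the instances whose loss increased the most and compute the descent gradient on that subset. The keep-the-largest-increase rule below is an assumption for illustration, not necessarily ESAM's exact criterion.

```python
# Rough sketch of sharpness-sensitive data selection: rank instances by how
# much their loss rose under the weight perturbation and keep the top fraction.
# The exact selection rule is an assumption, not necessarily ESAM's.
import torch

def select_sharpness_sensitive(clean_losses, perturbed_losses, keep_ratio=0.5):
    increase = perturbed_losses - clean_losses          # per-instance sharpness proxy
    k = max(1, int(keep_ratio * increase.numel()))
    return increase.topk(k).indices                     # indices of the kept instances
```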
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning
of Deep Neural Networks [2.8292841621378844] We introduce the concept of adaptive sharpness which is scale-invariant and propose the corresponding generalization bound.
We suggest a novel learning method, adaptive sharpness-aware minimization (ASAM), utilizing the proposed generalization bound.
Experimental results on various benchmark datasets show that ASAM contributes to significant improvements in model generalization performance. arXiv (2021-02-23)
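Under the usual first-order approximation, the adaptive (scale-invariant) perturbation can be sketched as below, writing $T_w$ for a per-parameter normalization operator (e.g. the element-wise $|w|$); the exact operator and norm may differ from the paper.

```latex
% Sketch of a scale-invariant (adaptive) perturbation under the usual
% first-order approximation; T_w denotes a per-parameter normalization
% operator (e.g. the element-wise |w|). Details may differ from ASAM's paper.
\[
\max_{\|T_w^{-1}\epsilon\|_2 \le \rho} L(w+\epsilon)
\quad\Longrightarrow\quad
\hat{\epsilon} \approx \rho \,\frac{T_w^{2}\,\nabla L(w)}{\bigl\|T_w\,\nabla L(w)\bigr\|_2}.
\]
```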
Sharpness-Aware Minimization for Efficiently Improving Generalization [36.87818971067698] We introduce a novel, effective procedure for simultaneously minimizing loss value and loss sharpness.
Sharpness-Aware Minimization (SAM) seeks parameters that lie in neighborhoods having uniformly low loss.
We present empirical results showing that SAM improves model generalization across a variety of benchmark datasets. arXiv (2020-10-03)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214] We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer. arXiv (2020-06-10)