Model Generalization: A Sharpness Aware Optimization Perspective
- URL: http://arxiv.org/abs/2208.06915v1
- Date: Sun, 14 Aug 2022 20:50:17 GMT
- Title: Model Generalization: A Sharpness Aware Optimization Perspective
- Authors: Jozef Marus Coldenhoff, Chengkun Li, Yurui Zhu
- Abstract summary: Sharpness-Aware Minimization (SAM) and adaptive sharpness-aware minimization (ASAM) aim to improve model generalization.
Our experiments show that sharpness-aware optimization techniques can help provide models with strong generalization ability.
- Score: 4.017760528208121
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sharpness-Aware Minimization (SAM) and adaptive sharpness-aware minimization
(ASAM) aim to improve model generalization. In this project, we propose three
experiments to validate their generalization benefits from the sharpness-aware
perspective. Our experiments show that sharpness-aware optimization techniques
can help produce models with strong generalization ability. They also suggest
that ASAM can improve generalization performance on un-normalized data, but
further research is needed to confirm this.
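A minimal sketch of the two-step SAM update the abstract refers to may help: the weights are first perturbed in the direction that locally increases the loss the most (scaled to a radius rho), and the descent step then uses the gradient taken at that perturbed point. The function and variable names, the loss_grad placeholder, and the hyperparameter values below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sam_step(w, loss_grad, rho=0.05, lr=0.1):
    """One sharpness-aware update on a flat weight vector w (illustrative sketch only)."""
    g = loss_grad(w)                                 # gradient at the current weights
    eps = rho * g / (np.linalg.norm(g) + 1e-12)      # worst-case perturbation of radius rho
    g_sharp = loss_grad(w + eps)                     # gradient at the perturbed point
    return w - lr * g_sharp                          # descend with the sharpness-aware gradient

# Toy usage on L(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, loss_grad=lambda v: v)
```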
Related papers
- 1st-Order Magic: Analysis of Sharpness-Aware Minimization [0.0]
Sharpness-Aware Minimization (SAM) is an optimization technique designed to improve generalization by favoring flatter loss minima.
We find that more precise approximations of the proposed SAM objective degrade generalization performance.
This highlights a gap in our understanding of SAM's effectiveness and calls for further investigation into the role of approximations in optimization.
arXiv Detail & Related papers (2024-11-03T23:50:34Z)
- Do Sharpness-based Optimizers Improve Generalization in Medical Image Analysis? [47.346907372319706]
Sharpness-Aware Minimization (SAM) has shown potential in enhancing generalization performance on general domain image datasets.
This work reviews recent sharpness-based methods for improving the generalization of deep learning networks and evaluates them on medical breast ultrasound images.
arXiv Detail & Related papers (2024-08-07T20:07:25Z)
- Normalization Layers Are All That Sharpness-Aware Minimization Needs [53.799769473526275]
Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima.
We show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of SAM can outperform perturbing all of the parameters.
arXiv Detail & Related papers (2023-06-07T08:05:46Z)
- Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models [93.85178920914721]
Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.
We propose a novel optimization procedure, namely FSAM, which introduces a Fisher mask to improve the efficiency and performance of SAM.
We show that FSAM consistently outperforms vanilla SAM by 0.67 to 1.98 average score among four different pretrained models.
arXiv Detail & Related papers (2022-10-11T14:53:58Z)
- Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
arXiv Detail & Related papers (2022-10-07T02:12:34Z)
- Improving Generalization in Federated Learning by Seeking Flat Minima [23.937135834522145]
Models trained in federated settings often suffer from degraded performance and fail to generalize.
In this work, we investigate such behavior through the lens of geometry of the loss and Hessian eigenspectrum.
Motivated by prior studies connecting the sharpness of the loss surface and the generalization gap, we show that training clients locally with Sharpness-Aware Minimization (SAM) or its adaptive version (ASAM), together with averaging stochastic weights on the server side, can substantially improve generalization.
arXiv Detail & Related papers (2022-03-22T16:01:04Z)
- Sharpness-Aware Minimization Improves Language Model Generalization [46.83888240127077]
We show that Sharpness-Aware Minimization (SAM) can substantially improve the generalization of language models without much computational overhead.
We show that SAM is able to boost performance on SuperGLUE, GLUE, Web Questions, Natural Questions, Trivia QA, and TyDiQA, with particularly large gains when training data for these tasks is limited.
arXiv Detail & Related papers (2021-10-16T09:44:06Z)
- Efficient Sharpness-aware Minimization for Improved Training of Neural Networks [146.2011175973769]
This paper proposes the Efficient Sharpness Aware Minimizer (ESAM), which boosts SAM's efficiency at no cost to its generalization performance.
ESAM includes two novel and efficient training strategies: Stochastic Weight Perturbation and Sharpness-Sensitive Data Selection.
We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM reduces SAM's overhead from 100% extra computation to 40% relative to base optimizers.
arXiv Detail & Related papers (2021-10-07T02:20:37Z)
- ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks [2.8292841621378844]
We introduce the concept of adaptive sharpness which is scale-invariant and propose the corresponding generalization bound.
We suggest a novel learning method, adaptive sharpness-aware minimization (ASAM), utilizing the proposed generalization bound.
Experimental results on various benchmark datasets show that ASAM contributes to significant improvements in model generalization performance (a minimal sketch of the scale-invariant perturbation follows this list).
arXiv Detail & Related papers (2021-02-23T10:26:54Z)
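To make the ASAM entry above concrete, here is a hedged sketch of the scale-invariant (adaptive) perturbation: the ascent step is rescaled element-wise by the weight magnitudes, so rescaling a layer's weights does not change the measured sharpness. The names, the eta offset, and the default values are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def asam_step(w, loss_grad, rho=0.5, lr=0.1, eta=0.01):
    """One adaptive sharpness-aware update on a flat weight vector w (illustrative sketch only)."""
    g = loss_grad(w)
    t = np.abs(w) + eta                                   # element-wise scaling operator T_w
    tg = t * g
    eps = rho * t * tg / (np.linalg.norm(tg) + 1e-12)     # scale-invariant perturbation
    g_sharp = loss_grad(w + eps)                          # gradient at the perturbed point
    return w - lr * g_sharp
```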