Related papers: Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm

URL: http://arxiv.org/abs/2501.06603v1
Date: Sat, 11 Jan 2025 18:05:33 GMT
Title: Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm
Authors: Yilang Zhang, Bingcong Li, Georgios B. Giannakis
Abstract summary: Sharpness-aware minimization (SAM) has emerged as a powerful tool to improve generalizability of deep neural network based learning. This contribution leverages preconditioning (pre) to unify SAM variants and provide not only unifying convergence analysis, but also valuable insights. A novel algorithm termed infoSAM is introduced to address the so-called adversarial model degradation issue in SAM by adjusting gradients depending on noise estimates.
Score: 39.656014609027494
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Targeting solutions over 'flat' regions of the loss landscape, sharpness-aware minimization (SAM) has emerged as a powerful tool to improve generalizability of deep neural network based learning. While several SAM variants have been developed to this end, a unifying approach that also guides principled algorithm design has been elusive. This contribution leverages preconditioning (pre) to unify SAM variants and provide not only unifying convergence analysis, but also valuable insights. Building upon preSAM, a novel algorithm termed infoSAM is introduced to address the so-called adversarial model degradation issue in SAM by adjusting gradients depending on noise estimates. Extensive numerical tests demonstrate the superiority of infoSAM across various benchmarks.
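
For orientation, the baseline SAM update that the abstract builds on can be sketched as follows. This is a minimal illustration of vanilla SAM only, not of preSAM or infoSAM, whose update rules are not given on this page. The toy quadratic loss, the matrix A, the step size lr, the radius rho, and all function names are assumptions chosen for the example.

```python
import numpy as np

def loss(w, A):
    # Toy quadratic loss 0.5 * w^T A w (an assumption for illustration).
    return 0.5 * w @ A @ w

def grad(w, A):
    # Gradient of the toy quadratic loss (A is symmetric).
    return A @ w

def sam_perturbation(g, rho):
    # Ascent direction: the gradient rescaled to length rho.
    # The small constant guards against division by zero.
    return rho * g / (np.linalg.norm(g) + 1e-12)

def sam_step(w, A, lr=0.1, rho=0.05):
    # 1) Perturb the weights toward higher loss within a ball of radius rho.
    eps = sam_perturbation(grad(w, A), rho)
    # 2) Evaluate the gradient at the perturbed point.
    g_adv = grad(w + eps, A)
    # 3) Apply that gradient to the original (unperturbed) weights.
    return w - lr * g_adv

A = np.diag([10.0, 0.1])   # one sharp and one flat direction
w0 = np.array([1.0, 1.0])
w_final = w0.copy()
for _ in range(50):
    w_final = sam_step(w_final, A)
```

Each step costs two gradient evaluations, which is the overhead that several of the efficiency-oriented variants listed below aim to reduce.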

Related papers

Sparse Layer Sharpness-Aware Minimization for Efficient Fine-Tuning [52.63618112418439]
Sharpness-aware minimization (SAM) seeks minima with a flat loss landscape to improve generalization performance in machine learning tasks, including fine-tuning. We propose SL-SAM, an approach that breaks this bottleneck by applying the sparse technique at the layer level.
arXiv Detail & Related papers (2026-02-10T04:05:43Z)
Sharpness-Aware Minimization: General Analysis and Improved Rates [10.11126899274029]
Sharpness-Aware Minimization (SAM) has emerged as a powerful method for improving generalization in machine learning models. We provide an analysis of SAM and its unnormalized variant (USAM) under one update rule. We present results under a relaxed, more natural assumption.
arXiv Detail & Related papers (2025-03-04T03:04:06Z)
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization [17.670203551488218]
We propose Asymptotic Unbiased Sampling to accelerate Sharpness-Aware Minimization (AUSAM). AUSAM maintains the model's generalization capacity while significantly enhancing computational efficiency. As a plug-and-play, architecture-agnostic method, our approach consistently accelerates SAM across a range of tasks and networks.
arXiv Detail & Related papers (2024-06-12T08:47:44Z)
Friendly Sharpness-Aware Minimization [62.57515991835801]
Sharpness-Aware Minimization (SAM) has been instrumental in improving deep neural network training by minimizing both training loss and loss sharpness. We investigate the key role of batch-specific gradient noise within the adversarial perturbation, i.e., the current minibatch gradient. By decomposing the adversarial gradient noise components, we discover that relying solely on the full gradient degrades generalization while excluding it leads to improved performance.
arXiv Detail & Related papers (2024-03-19T01:39:33Z)
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy [12.050160495730381]
Sharpness-aware minimization (SAM) has attracted much attention because of its surprising effectiveness in improving performance. We propose a simple renormalization strategy, dubbed Stable SAM (SSAM), so that the gradient norm of the descent step remains the same as that of the ascent step. Our strategy is easy to implement and flexible enough to integrate with SAM and its variants, at almost no computational cost.
arXiv Detail & Related papers (2024-01-14T10:53:36Z)
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss by minimizing the change of landscape when adding a perturbation. In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves perturbation by a binary mask.
arXiv Detail & Related papers (2023-06-30T09:33:41Z)
AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks [76.90477930208982]
Sharpness-aware minimization (SAM) has been extensively explored as it can generalize better for training deep neural networks. Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored. We conduct several experiments on several NLP tasks, which show that AdaSAM could achieve superior performance compared with SGD, AMSGrad, and SAM.
arXiv Detail & Related papers (2023-03-01T15:12:42Z)
On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees [5.91402820967386]
We present a new theoretical explanation of why Sharpness-Aware Minimization (SAM) generalizes well. SAM is particularly well-suited for both sharp and non-sharp problems. Our findings are validated using numerical experiments on deep neural networks.
arXiv Detail & Related papers (2023-02-23T07:52:31Z)
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization [20.560184120992094]
The Sharpness-Aware Minimization (SAM) technique modifies the fundamental loss function to steer gradient descent methods toward flatter minima. We extend a recently developed and well-studied general framework for flatness analysis to theoretically show that SAM achieves flatter minima than SGD, and mSAM achieves even flatter minima than SAM.
arXiv Detail & Related papers (2023-02-19T23:27:12Z)
Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization [14.40189851070842]
Sharpness-Aware Minimization (SAM) modifies the underlying loss function to guide descent methods towards flatter minima. Recent work suggests that mSAM can outperform SAM in terms of test accuracy. This paper presents a comprehensive empirical evaluation of mSAM on various tasks and datasets.
arXiv Detail & Related papers (2022-12-07T00:37:55Z)
Sharpness-Aware Training for Free [163.1248341911413]
Sharpness-Aware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error. Sharpness-Aware Training for Free (SAF) mitigates the sharp landscape at almost zero computational cost over the base optimizer. SAF ensures the convergence to a flat minimum with improved capabilities.
arXiv Detail & Related papers (2022-05-27T16:32:43Z)
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks [146.2011175973769]
This paper proposes Efficient Sharpness Aware Minimizer (ESAM), which boosts SAM's efficiency at no cost to its generalization performance. ESAM includes two novel and efficient training strategies: Stochastic Weight Perturbation and Sharpness-Sensitive Data Selection. We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM enhances the efficiency over SAM from requiring 100% extra computations to 40% vis-a-vis base optimizers.
arXiv Detail & Related papers (2021-10-07T02:20:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.