Related papers: Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization

URL: http://arxiv.org/abs/2406.08001v1
Date: Wed, 12 Jun 2024 08:47:44 GMT
Title: Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Authors: Jiaxin Deng, Junbiao Pang, Baochang Zhang
Abstract summary: We propose Asymptotic Unbiased Sampling to accelerate Sharpness-Aware Minimization (AUSAM). AUSAM maintains the model's generalization capacity while significantly enhancing computational efficiency. As a plug-and-play, architecture-agnostic method, our approach consistently accelerates SAM across a range of tasks and networks.
Score: 17.670203551488218
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sharpness-Aware Minimization (SAM) has emerged as a promising approach for effectively reducing the generalization error. However, SAM incurs twice the computational cost of the base optimizer (e.g., SGD). We propose Asymptotic Unbiased Sampling with respect to iterations to accelerate SAM (AUSAM), which maintains the model's generalization capacity while significantly enhancing computational efficiency. Concretely, we probabilistically sample a subset of data points beneficial for SAM optimization based on a theoretically guaranteed criterion, i.e., the Gradient Norm of each Sample (GNS). We further approximate the GNS by the difference in loss values before and after perturbation in SAM. As a plug-and-play, architecture-agnostic method, our approach consistently accelerates SAM across a range of tasks and networks, namely classification, human pose estimation, and network quantization. On CIFAR10/100 and Tiny-ImageNet, AUSAM achieves results comparable to SAM while providing a speedup of over 70%. Compared to recent dynamic data pruning methods, AUSAM is better suited for SAM and excels in maintaining performance. Additionally, AUSAM accelerates optimization in human pose estimation and model quantization without sacrificing performance, demonstrating its broad practicality.
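
The sampling step described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the helper name `sample_by_loss_difference`, the `floor` parameter, and the toy loss values are all assumptions made for illustration, and the asymptotic-unbiasedness schedule over iterations is not modeled. The sketch only shows the general idea of using the per-sample loss difference before and after the SAM perturbation as a proxy for the Gradient Norm of each Sample (GNS), and drawing a subset with probability proportional to that proxy.

```python
import random

def sample_by_loss_difference(loss_before, loss_after, keep_ratio, rng, floor=1e-3):
    """Hypothetical helper: choose a subset of a mini-batch, favoring samples
    whose loss changes most under the SAM perturbation (a proxy for GNS)."""
    # Proxy score per sample; the floor (an assumption) keeps every sample selectable.
    scores = [max(abs(a - b), floor) for b, a in zip(loss_before, loss_after)]
    total = sum(scores)
    probs = [s / total for s in scores]
    n_keep = max(1, round(keep_ratio * len(scores)))

    # Weighted sampling without replacement: draw one index at a time.
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(n_keep):
        weights = [scores[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen), probs

# Toy per-sample losses before and after perturbation (made-up numbers).
before = [0.9, 0.2, 1.5, 0.4, 0.7, 0.1, 1.1, 0.3]
after = [1.4, 0.2, 1.6, 0.9, 0.7, 0.2, 2.0, 0.3]
idx, probs = sample_by_loss_difference(before, after, keep_ratio=0.5, rng=random.Random(0))
print(idx, [round(p, 3) for p in probs])
```

In a training loop, only the selected indices would be passed through the second (descent) gradient computation, which is where a saving over full-batch SAM would come from under these assumptions.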

Related papers

Sparse Layer Sharpness-Aware Minimization for Efficient Fine-Tuning [52.63618112418439]
Sharpness-aware minimization (SAM) seeks the minima with a flat loss landscape to improve the generalization performance in machine learning tasks, including fine-tuning. We propose an approach, SL-SAM, to break this bottleneck by introducing the sparse technique to layers.
arXiv Detail & Related papers (2026-02-10T04:05:43Z)
LightSAM: Parameter-Agnostic Sharpness-Aware Minimization [92.17866492331524]
Sharpness-Aware Minimization (SAM) enhances the ability of the machine learning model by exploring the flat minima landscape through weight perturbations. SAM introduces an additional hyperparameter, the perturbation radius, to which SAM is sensitive. In this paper, we propose the algorithm LightSAM, which sets the perturbation radius and learning rate of SAM adaptively.
arXiv Detail & Related papers (2025-05-30T09:28:38Z)
Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning [5.77502465665279]
Sharpness-Aware Minimization (SAM) is an optimization method that improves generalization performance of machine learning models. Despite its superior generalization, SAM has not been actively used in real-world applications due to its expensive computational cost. We propose a novel asynchronous-parallel SAM which achieves nearly the same gradient norm penalizing effect as the original SAM while breaking the data dependency between the model perturbation and the model update.
arXiv Detail & Related papers (2025-03-14T07:34:39Z)
SAMPa: Sharpness-aware Minimization Parallelized [51.668052890249726]
Sharpness-aware minimization (SAM) has been shown to improve the generalization of neural networks. Each SAM update requires sequentially computing two gradients, effectively doubling the per-iteration cost. We propose a simple modification of SAM, termed SAMPa, which allows us to fully parallelize the two gradient computations.
arXiv Detail & Related papers (2024-10-14T16:21:23Z)
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy [12.050160495730381]
Sharpness-aware minimization (SAM) has attracted much attention because of its surprising effectiveness in improving performance. We propose a simple renormalization strategy, dubbed Stable SAM (SSAM), so that the gradient norm of the descent step remains the same as that of the ascent step. Our strategy is easy to implement and flexible enough to integrate with SAM and its variants, at almost no computational cost.
arXiv Detail & Related papers (2024-01-14T10:53:36Z)
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the change in loss when adding a perturbation. In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves sparse perturbation by a binary mask.
arXiv Detail & Related papers (2023-06-30T09:33:41Z)
An Adaptive Policy to Employ Sharpness-Aware Minimization [5.5347134457499845]
Sharpness-aware minimization (SAM) searches for flat minima by min-max optimization. Recent state-of-the-art methods reduce the fraction of SAM updates. Two efficient algorithms, AE-SAM and AE-LookSAM, are proposed.
arXiv Detail & Related papers (2023-04-28T06:23:32Z)
AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks [76.90477930208982]
Sharpness-aware minimization (SAM) has been extensively explored as it can generalize better for training deep neural networks. Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored. We conduct several experiments on several NLP tasks, which show that AdaSAM could achieve superior performance compared with SGD, AMSGrad, and SAM.
arXiv Detail & Related papers (2023-03-01T15:12:42Z)
Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization [14.40189851070842]
Sharpness-Aware Minimization (SAM) modifies the underlying loss function to guide descent methods towards flatter minima. Recent work suggests that mSAM can outperform SAM in terms of test accuracy. This paper presents a comprehensive empirical evaluation of mSAM on various tasks and datasets.
arXiv Detail & Related papers (2022-12-07T00:37:55Z)
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach [132.37966970098645]
One of the popular solutions is Sharpness-Aware Minimization (SAM), which minimizes the change in training loss when a perturbation is added to the weights. In this paper, we propose an efficient and effective training scheme coined as Sparse SAM (SSAM), which achieves sparse perturbation by a binary mask. In addition, we theoretically prove that SSAM can converge at the same rate as SAM, i.e., $O(\log T/\sqrt{T})$.
arXiv Detail & Related papers (2022-10-11T06:30:10Z)
Towards Efficient and Scalable Sharpness-Aware Minimization [81.22779501753695]
We propose a novel algorithm, LookSAM, that only periodically calculates the inner gradient ascent. LookSAM achieves similar accuracy gains to SAM while being tremendously faster. We are the first to successfully scale up the batch size when training Vision Transformers (ViTs).
arXiv Detail & Related papers (2022-03-05T11:53:37Z)
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks [146.2011175973769]
This paper proposes Efficient Sharpness Aware Minimizer (ESAM), which boosts SAM's efficiency at no cost to its generalization performance. ESAM includes two novel and efficient training strategies: Stochastic Weight Perturbation and Sharpness-Sensitive Data Selection. We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM enhances the efficiency over SAM from requiring 100% extra computations to 40% vis-a-vis base optimizers.
arXiv Detail & Related papers (2021-10-07T02:20:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.