Related papers: Agnostic Sharpness-Aware Minimization

Agnostic Sharpness-Aware Minimization

URL: http://arxiv.org/abs/2406.07107v3
Date: Wed, 02 Oct 2024 13:38:28 GMT
Title: Agnostic Sharpness-Aware Minimization
Authors: Van-Anh Nguyen, Quyen Tran, Tuan Truong, Thanh-Toan Do, Dinh Phung, Trung Le,
Abstract summary: Sharpness-aware (SAM) has been instrumental in improving deep neural network training by minimizing both the training loss and the sharpness of the loss landscape. Model-Agnostic Meta-Learning (MAML) is a framework designed to improve the adaptability of models. We introduce Agnostic-SAM, a novel approach that combines the principles of both SAM and MAML.
Score: 29.641227264358704
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sharpness-aware minimization (SAM) has been instrumental in improving deep neural network training by minimizing both the training loss and the sharpness of the loss landscape, leading the model into flatter minima that are associated with better generalization properties. In another aspect, Model-Agnostic Meta-Learning (MAML) is a framework designed to improve the adaptability of models. MAML optimizes a set of meta-models that are specifically tailored for quick adaptation to multiple tasks with minimal fine-tuning steps and can generalize well with limited data. In this work, we explore the connection between SAM and MAML in enhancing model generalization. We introduce Agnostic-SAM, a novel approach that combines the principles of both SAM and MAML. Agnostic-SAM adapts the core idea of SAM by optimizing the model toward wider local minima using training data, while concurrently maintaining low loss values on validation data. By doing so, it seeks flatter minima that are not only robust to small perturbations but also less vulnerable to data distributional shift problems. Our experimental results demonstrate that Agnostic-SAM significantly improves generalization over baselines across a range of datasets and under challenging conditions such as noisy labels or data limitation.

Related papers

Meta Curvature-Aware Minimization for Domain Generalization [22.824033201965648]
We propose an improved model training process aimed at encouraging the model to converge to a flat minima. We derive a novel algorithm called Meta Curvature-Aware Minimization (MeCAM) to minimize the curvature around the local minima. We provide theoretical analysis on MeCAM's generalization error and convergence rate, and demonstrate its superiority over existing DG methods.
arXiv Detail & Related papers (2024-12-16T08:22:23Z)
DiM: $f$-Divergence Minimization Guided Sharpness-Aware Optimization for Semi-supervised Medical Image Segmentation [8.70112307145508]
We propose a sharpness-aware optimization method based on $f$-divergence minimization. This method enhances the model's stability by fine-tuning the sensitivity of model parameters. It also improves the model's adaptability to different datasets through the introduction of $f$-divergence.
arXiv Detail & Related papers (2024-11-19T09:07:26Z)
Reweighting Local Mimina with Tilted SAM [24.689230137012174]
Sharpness-Aware Minimization (SAM) has been demonstrated to improve the generalization performance of over infinity by seeking flat minima on flatter loss. In this work, we propose TSAM (TSAM) that effectively assigns higher priority to local solutions that are flatter and that incur losses.
arXiv Detail & Related papers (2024-10-30T02:49:48Z)
Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification [53.727688136434345]
Graph Neural Networks (GNNs) have shown superior performance in node classification. We present Fast Graph Sharpness-Aware Minimization (FGSAM) that integrates the rapid training of Multi-Layer Perceptrons with the superior performance of GNNs. Our proposed algorithm outperforms the standard SAM with lower computational costs in FSNC tasks.
arXiv Detail & Related papers (2024-10-22T09:33:29Z)
CR-SAM: Curvature Regularized Sharpness-Aware Minimization [8.248964912483912]
Sharpness-Aware Minimization (SAM) aims to enhance the generalizability by minimizing worst-case loss using one-step gradient ascent as an approximation. In this paper, we introduce a normalized Hessian trace to accurately measure the curvature of loss landscape on em both training and test sets. In particular, to counter excessive non-linearity of loss landscape, we propose Curvature Regularized SAM (CR-SAM)
arXiv Detail & Related papers (2023-12-21T03:46:29Z)
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex and non- unstructured loss landscapes. SharpnessAware Minimization (SAM) is a popular solution that smooths the loss by minimizing the change of landscape when adding a perturbation. In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves perturbation by a binary mask.
arXiv Detail & Related papers (2023-06-30T09:33:41Z)
Sharpness-Aware Gradient Matching for Domain Generalization [84.14789746460197]
The goal of domain generalization (DG) is to enhance the generalization capability of the model learned from a source domain to other unseen domains. The recently developed Sharpness-Aware Minimization (SAM) method aims to achieve this goal by minimizing the sharpness measure of the loss landscape. We present two conditions to ensure that the model could converge to a flat minimum with a small loss, and present an algorithm, named Sharpness-Aware Gradient Matching (SAGM) Our proposed SAGM method consistently outperforms the state-of-the-art methods on five DG benchmarks.
arXiv Detail & Related papers (2023-03-18T07:25:12Z)
Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization [14.40189851070842]
Sharpness-Aware Minimization (SAM) modifies the underlying loss function to guide descent methods towards flatter minima. Recent work suggests that mSAM can outperform SAM in terms of test accuracy. This paper presents a comprehensive empirical evaluation of mSAM on various tasks and datasets.
arXiv Detail & Related papers (2022-12-07T00:37:55Z)
Sharpness-Aware Training for Free [163.1248341911413]
SharpnessAware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error. Sharpness-Aware Training Free (SAF) mitigates the sharp landscape at almost zero computational cost over the base. SAF ensures the convergence to a flat minimum with improved capabilities.
arXiv Detail & Related papers (2022-05-27T16:32:43Z)
Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks. Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients. We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks [146.2011175973769]
This paper proposes Efficient Sharpness Aware Minimizer (M) which boosts SAM s efficiency at no cost to its generalization performance. M includes two novel and efficient training strategies-StochasticWeight Perturbation and Sharpness-Sensitive Data Selection. We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM enhances the efficiency over SAM from requiring 100% extra computations to 40% vis-a-vis bases.
arXiv Detail & Related papers (2021-10-07T02:20:37Z)
Sharpness-Aware Minimization for Efficiently Improving Generalization [36.87818971067698]
We introduce a novel, effective procedure for simultaneously minimizing loss value and loss sharpness. Sharpness-Aware Minimization (SAM) seeks parameters that lie in neighborhoods having uniformly low loss. We present empirical results showing that SAM improves model generalization across a variety of benchmark datasets.
arXiv Detail & Related papers (2020-10-03T19:02:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.