Update in Unit Gradient
- URL: http://arxiv.org/abs/2110.00199v1
- Date: Fri, 1 Oct 2021 04:00:51 GMT
- Title: Update in Unit Gradient
- Authors: Ching-Hsun Tseng, Liu-Hsueh Cheng, Shin-Jye Lee, Xiaojun Zeng
- Abstract summary: In Machine Learning, optimization is done by using a descent method to find the minimum value of the loss.
We propose a unit vector space in SAM, which not only follows the mathematical intuition of linear algebra but also keeps the advantages of adaptive gradient algorithms.
- Score: 8.143750358586074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Machine Learning, optimization has mostly been done by using a gradient
descent method to find the minimum value of the loss. However, especially in
deep learning, finding a global minimum of a nonconvex loss function across a
high-dimensional space is an extraordinarily difficult task. Recently, a
generalization learning algorithm, Sharpness-Aware Minimization (SAM), has
achieved great success in image classification tasks. Despite its great
performance in creating convex space, the proper direction led by SAM remains
unclear. We therefore propose creating a unit vector space in SAM, which not
only follows the mathematical intuition of linear algebra but also keeps the
advantages of adaptive gradient algorithms. Moreover, applying SAM with a unit
gradient brings models competitive performance on image classification
datasets such as CIFAR-{10, 100}. Experiments showed that it performs even
better and is more robust than SAM.
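The abstract gives only the idea, not pseudocode, so the following is a minimal PyTorch-style sketch of one plausible reading: run SAM's usual two-step update, but normalize the gradient to unit length before using it. The helper names (unit_normalize, sam_unit_gradient_step) and the per-tensor normalization are our assumptions, not the authors' exact algorithm.

import torch

def unit_normalize(g, eps=1e-12):
    # Scale a gradient tensor to unit L2 norm (assumed per-tensor).
    return g / (g.norm() + eps)

def sam_unit_gradient_step(params, loss_fn, lr=0.1, rho=0.05):
    # 1) Gradient at the current weights w.
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    # 2) Standard SAM already perturbs along the normalized gradient:
    #    e = rho * g / ||g||.
    with torch.no_grad():
        eps_list = [rho * unit_normalize(g) for g in grads]
        for p, e in zip(params, eps_list):
            p.add_(e)
    # 3) Gradient at the perturbed weights w + e.
    loss_adv = loss_fn()
    grads_adv = torch.autograd.grad(loss_adv, params)
    # 4) Undo the perturbation; descend along the *unit* adversarial
    #    gradient -- the assumed "unit gradient" twist on plain SAM.
    with torch.no_grad():
        for p, e, g in zip(params, eps_list, grads_adv):
            p.sub_(e)
            p.sub_(lr * unit_normalize(g))
    return loss.item()

Whether the normalization is per-tensor or global, and whether it applies to the descent step or only the perturbation, are exactly the details the abstract leaves open; treat this as a reading aid rather than a reproduction.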
Related papers
- Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks.
Approximate MU is a practical method for large-scale models.
We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z)
- Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models [42.59948316941217]
Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate sharp local minima from the training trajectory and alleviate generalization degradation.
We propose a new algorithm named GraphSAM, which reduces the training cost of SAM and improves the generalization performance of graph transformer models.
arXiv Detail & Related papers (2024-06-19T01:03:23Z)
- Friendly Sharpness-Aware Minimization [62.57515991835801]
Sharpness-Aware Minimization (SAM) has been instrumental in improving deep neural network training by minimizing both training loss and loss sharpness.
We investigate the key role of batch-specific gradient noise within the adversarial perturbation, i.e., the current minibatch gradient.
By decomposing the perturbation into full-gradient and batch-specific noise components, we discover that relying solely on the full gradient component degrades generalization while excluding it leads to improved performance (see the sketch after this entry).
arXiv Detail & Related papers (2024-03-19T01:39:33Z)
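The decomposition above suggests a perturbation built from the batch-specific noise alone. A minimal sketch, assuming an exponential moving average (EMA) of past minibatch gradients as the full-gradient estimate; the name noise_perturbation and the EMA construction are illustrative, not necessarily the paper's exact estimator.

import torch

def noise_perturbation(grads, ema, rho=0.05, beta=0.9, eps=1e-12):
    # ema: list of tensors shaped like grads, initialized to zeros;
    # it is updated in place as the running full-gradient estimate.
    for m, g in zip(ema, grads):
        m.mul_(beta).add_(g, alpha=1 - beta)
    # Keep only the batch-specific noise component g - m, then rescale
    # it to the SAM perturbation radius rho.
    noise = [g - m for g, m in zip(grads, ema)]
    total = torch.sqrt(sum(n.pow(2).sum() for n in noise)) + eps
    return [rho * n / total for n in noise]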
- TinySAM: Pushing the Envelope for Efficient Segment Anything Model [76.21007576954035]
We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining the strong zero-shot performance.
We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategy to distill a lightweight student model.
We also adapt the post-training quantization to the promptable segmentation task and further reduce the computational cost.
arXiv Detail & Related papers (2023-12-21T12:26:11Z)
- Class Gradient Projection For Continual Learning [99.105266615448]
Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks (a minimal sketch follows this entry).
arXiv Detail & Related papers (2023-11-25T02:45:56Z)
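The blurb only says that CGP builds gradient subspaces per class and projects updates accordingly; the following is a generic gradient-projection sketch in that spirit. The basis construction via SVD of stored per-class gradients and the function names are our assumptions, not the paper's exact procedure.

import torch

def class_basis(class_grads, k=5):
    # class_grads: (d, n) matrix of gradients collected for one class.
    # Top-k orthonormal directions; SVD is an assumed construction.
    U, _, _ = torch.linalg.svd(class_grads, full_matrices=False)
    return U[:, :k]

def project_out(g, bases):
    # Make the new gradient orthogonal to each protected class
    # subspace: g <- g - B (B^T g).
    for B in bases:
        g = g - B @ (B.T @ g)
    return g

Updating with the projected gradient leaves directions important to old classes approximately untouched, which is the standard rationale for gradient-projection methods in continual learning.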
- Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes.
Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when a perturbation is added to the weights.
In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves sparse perturbation by a binary mask.
arXiv Detail & Related papers (2023-06-30T09:33:41Z)
- Efficient Generalization Improvement Guided by Random Weight Perturbation [24.027159739234524]
Sharpness-aware minimization (SAM) establishes a generic scheme for generalization improvement.
We resort to filter-wise random weight perturbations (RWP) to decouple the nested gradients in SAM.
We achieve very competitive performance on CIFAR and remarkably better performance on ImageNet.
arXiv Detail & Related papers (2022-11-21T14:24:34Z)
- Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach [132.37966970098645]
One of the popular solutions is Sharpness-Aware Minimization (SAM), which minimizes the maximized change of training loss when a perturbation is added to the weights.
In this paper, we propose an efficient and effective training scheme coined Sparse SAM (SSAM), which achieves sparse perturbation by a binary mask (a minimal sketch follows this entry).
In addition, we theoretically prove that SSAM can converge at the same rate as SAM, i.e., $O(\log T/\sqrt{T})$.
arXiv Detail & Related papers (2022-10-11T06:30:10Z)
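A minimal sketch of the binary-mask idea from the SSAM entry above: compute SAM's perturbation as usual, but keep it only on a sparse set of coordinates. The magnitude-based mask rule here is a placeholder; the paper selects its masks by more principled criteria.

import torch

def sparse_perturbation(grads, sparsity=0.5, rho=0.05, eps=1e-12):
    # Flatten all gradients into one vector for a global mask.
    flat = torch.cat([g.reshape(-1) for g in grads])
    # Binary mask: keep only the largest-magnitude coordinates
    # (placeholder rule, not the paper's mask derivation).
    k = max(1, int((1 - sparsity) * flat.numel()))
    idx = flat.abs().topk(k).indices
    mask = torch.zeros_like(flat)
    mask[idx] = 1.0
    # Masked SAM perturbation: e = rho * (m * g) / ||m * g||.
    e = mask * flat
    return rho * e / (e.norm() + eps)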
- Towards Efficient and Scalable Sharpness-Aware Minimization [81.22779501753695]
We propose a novel algorithm LookSAM that only periodically calculates the inner gradient ascent (see the sketch after this entry).
LookSAM achieves similar accuracy gains to SAM while being tremendously faster.
We are the first to successfully scale up the batch size when training Vision Transformers (ViTs).
arXiv Detail & Related papers (2022-03-05T11:53:37Z)
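A simplified sketch of the periodic schedule described above: recompute the inner ascent only every k steps and reuse the cached perturbation in between. LookSAM's actual reuse rule is finer-grained (it reuses a decomposed component of the SAM update), so treat this as an assumption-laden approximation.

import torch

class PeriodicAscent:
    # Recompute the SAM perturbation only every k steps and reuse the
    # cached one in between (simplified; not LookSAM's exact rule).
    def __init__(self, k=5, rho=0.05, eps=1e-12):
        self.k, self.rho, self.eps = k, rho, eps
        self.step_count, self.cached = 0, None

    def perturbation(self, grads):
        if self.step_count % self.k == 0 or self.cached is None:
            norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + self.eps
            self.cached = [self.rho * g / norm for g in grads]
        self.step_count += 1
        return self.cached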
- Sharpness-Aware Minimization for Efficiently Improving Generalization [36.87818971067698]
We introduce a novel, effective procedure for simultaneously minimizing loss value and loss sharpness.
Sharpness-Aware Minimization (SAM) seeks parameters that lie in neighborhoods having uniformly low loss.
We present empirical results showing that SAM improves model generalization across a variety of benchmark datasets.
arXiv Detail & Related papers (2020-10-03T19:02:10Z)
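For reference, all of the SAM variants above modify the min-max objective introduced in this last paper: $\min_w \max_{\|\epsilon\|_2 \le \rho} L(w+\epsilon)$, whose inner maximization is approximated by a single normalized ascent step $\hat{\epsilon}(w) = \rho \, \nabla_w L(w) / \|\nabla_w L(w)\|_2$; the weights are then updated with the gradient evaluated at $w + \hat{\epsilon}$. The main paper above additionally normalizes that update gradient to unit length.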
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.