Stochastic batch size for adaptive regularization in deep network
optimization
- URL: http://arxiv.org/abs/2004.06341v1
- Date: Tue, 14 Apr 2020 07:54:53 GMT
- Title: Stochastic batch size for adaptive regularization in deep network
optimization
- Authors: Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong
- Abstract summary: We propose a first-order optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
- Score: 63.68104397173262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a first-order stochastic optimization algorithm incorporating
adaptive regularization applicable to machine learning problems in deep
learning framework. The adaptive regularization is imposed by stochastic
process in determining batch size for each model parameter at each optimization
iteration. The stochastic batch size is determined by the update probability of
each parameter following a distribution of gradient norms in consideration of
their local and global properties in the neural network architecture where the
range of gradient norms may vary within and across layers. We empirically
demonstrate the effectiveness of our algorithm using an image classification
task based on conventional network models applied to commonly used benchmark
datasets. The quantitative evaluation indicates that our algorithm outperforms
the state-of-the-art optimization algorithms in generalization while providing
less sensitivity to the selection of batch size which often plays a critical
role in optimization, thus achieving more robustness to the selection of
regularity.
Related papers
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found in a wide range of machine learning tasks, such as in-variant learning tasks, AUPRC, andAML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Learning Regions of Interest for Bayesian Optimization with Adaptive
Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z) - Faster Margin Maximization Rates for Generic and Adversarially Robust Optimization Methods [20.118513136686452]
First-order optimization methods tend to inherently favor certain solutions over others when minimizing an underdetermined training objective.
We present a series of state-of-the-art implicit bias rates for mirror descent and steepest descent algorithms.
Our accelerated rates are derived by leveraging the regret bounds of online learning algorithms within this game framework.
arXiv Detail & Related papers (2023-05-27T18:16:56Z) - Adaptive Experimentation at Scale: A Computational Framework for
Flexible Batches [7.390918770007728]
Motivated by practical instances involving a handful of reallocations in which outcomes are measured in batches, we develop an adaptive-driven experimentation framework.
Our main observation is that normal approximations, which are universal in statistical inference, can also guide the design of adaptive algorithms.
arXiv Detail & Related papers (2023-03-21T04:17:03Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Evolutionary Variational Optimization of Generative Models [0.0]
We combine two popular optimization approaches to derive learning algorithms for generative models: variational optimization and evolutionary algorithms.
We show that evolutionary algorithms can effectively and efficiently optimize the variational bound.
In the category of "zero-shot" learning, we observed the evolutionary variational algorithm to significantly improve the state-of-the-art in many benchmark settings.
arXiv Detail & Related papers (2020-12-22T19:06:33Z) - On the implementation of a global optimization method for mixed-variable
problems [0.30458514384586394]
The algorithm is based on the radial basis function of Gutmann and the metric response surface method of Regis and Shoemaker.
We propose several modifications aimed at generalizing and improving these two algorithms.
arXiv Detail & Related papers (2020-09-04T13:36:56Z) - Obtaining Adjustable Regularization for Free via Iterate Averaging [43.75491612671571]
Regularization for optimization is a crucial technique to avoid overfitting in machine learning.
We establish an averaging scheme that converts the iterates of SGD on an arbitrary strongly convex and smooth objective function to its regularized counterpart.
Our approaches can be used for accelerated and preconditioned optimization methods as well.
arXiv Detail & Related papers (2020-08-15T15:28:05Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.