Application of Monte Carlo Stochastic Optimization (MOST) to Deep
Learning
- URL: http://arxiv.org/abs/2109.02441v1
- Date: Thu, 2 Sep 2021 05:52:26 GMT
- Title: Application of Monte Carlo Stochastic Optimization (MOST) to Deep
Learning
- Authors: Sin-ichi Inage, Hana Hebishima
- Abstract summary: In this paper, we apply the Monte Carlo stochastic optimization (MOST) method proposed by the authors to deep learning of an XOR gate.
As a result, it was confirmed that MOST converged faster than the existing methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we apply the Monte Carlo stochastic optimization
(MOST) method proposed by the authors to deep learning of an XOR gate and
verify its effectiveness. Deep machine learning based on neural networks is
one of the key technologies driving innovation in today's highly advanced
information society, and there has been active research on large-scale,
high-speed, and high-precision systems. To search efficiently for the optimum
of an objective function, the method splits the search region of each
parameter of the objective function into two halves, numerically evaluates
the integral of the objective function over each half by the Monte Carlo
method, compares the magnitudes of the two integrals, and judges which half
contains the optimum point. In a previous paper, we examined this method on
benchmark optimization problems. Here, the method is applied to a neural
network for the XOR gate and compared with weight optimization by Adam and by
a genetic algorithm. As a result, it was confirmed that MOST converged faster
than the existing methods.
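
For illustration only, below is a minimal Python sketch of the region-halving
idea described in the abstract: each parameter's interval is split in two, the
objective is Monte Carlo-integrated over both halves, and the half judged to
contain the optimum is kept. The function name most_like_search, the toy test
function, and the assumption that minimization keeps the half with the smaller
integral are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of the region-halving idea from the abstract
# (NOT the authors' reference implementation). Assumptions:
#   - minimization, so the half with the smaller Monte Carlo integral
#     (equivalently, smaller sample mean) is kept;
#   - each parameter's interval is halved in turn, then the loop repeats.
import numpy as np

def most_like_search(f, lower, upper, n_samples=2000, n_iters=40, seed=0):
    """Shrink the box [lower, upper] toward a minimizer of f by comparing
    Monte Carlo estimates of the integral of f over the two halves of the
    box along each coordinate."""
    rng = np.random.default_rng(seed)
    lower = np.array(lower, dtype=float)
    upper = np.array(upper, dtype=float)
    dim = lower.size
    for _ in range(n_iters):
        for j in range(dim):
            mid = 0.5 * (lower[j] + upper[j])
            means = []
            for lo_j, hi_j in ((lower[j], mid), (mid, upper[j])):
                lo, hi = lower.copy(), upper.copy()
                lo[j], hi[j] = lo_j, hi_j
                x = rng.uniform(lo, hi, size=(n_samples, dim))
                # Sample mean ~ integral / volume; the two halves have equal
                # volume, so comparing means compares the integrals.
                means.append(f(x).mean())
            if means[0] <= means[1]:   # assumed: smaller integral half
                upper[j] = mid         # contains the minimum, so keep it
            else:
                lower[j] = mid
    return 0.5 * (lower + upper)

# Toy usage on a smooth test function (the paper itself applies the idea to
# the weights of an XOR neural network and compares against Adam and a GA).
f = lambda x: np.sum((x - 0.3) ** 2, axis=1)
print(most_like_search(f, lower=[-2, -2, -2], upper=[2, 2, 2]))
```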
Related papers
- Enhancing CNN Classification with Lamarckian Memetic Algorithms and Local Search [0.0]
We propose a novel approach integrating a two-stage training technique with population-based optimization algorithms incorporating local search capabilities.
Our experiments demonstrate that the proposed method outperforms state-of-the-art gradient-based techniques.
arXiv Detail & Related papers (2024-10-26T17:31:15Z)
- Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks [2.0072624123275533]
The generalized Gauss-Newton (GGN) optimization method incorporates curvature estimates into its solution steps.
This work studies a GGN method for optimizing a two-layer neural network with explicit regularization.
arXiv Detail & Related papers (2024-04-23T10:02:22Z)
- Localized Zeroth-Order Prompt Optimization [54.964765668688806]
We propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO)
ZOPO incorporates a Neural Tangent Kernel-based derived Gaussian process into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization.
Remarkably, ZOPO outperforms existing baselines in terms of both the optimization performance and the query efficiency.
arXiv Detail & Related papers (2024-03-05T14:18:15Z)
- Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE Adaptive and non-adaptive momentum) framework.
We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
arXiv Detail & Related papers (2023-07-02T18:16:06Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics based on minimizing the population loss, which are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models [0.974672460306765]
We present two novel strategies for optimization of deep learning models with a modular architecture constructed of multiple subnetworks.
Our approaches show an increased optimization efficiency of up to 23.62x, and a final performance boost of up to 3.5% accuracy for classification and 4.4 MSE for regression.
arXiv Detail & Related papers (2022-02-24T00:29:05Z)
- Acceleration techniques for optimization over trained neural network ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
arXiv Detail & Related papers (2021-12-13T20:50:54Z)
- Meta Learning Black-Box Population-Based Optimizers [0.0]
We propose the use of meta-learning to infer population-based black-box optimizers.
We show that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context.
arXiv Detail & Related papers (2021-03-05T08:13:25Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a powerful tool for many machine learning problems.
We propose a novel algorithm named stocBiO, which features a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
In theory, our algorithm requires a much smaller number of communication rounds.
Our experiments on several benchmark datasets demonstrate the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.