Mixed Newton Method for Optimization in Complex Spaces
- URL: http://arxiv.org/abs/2407.20367v2
- Date: Wed, 13 Nov 2024 20:29:35 GMT
- Title: Mixed Newton Method for Optimization in Complex Spaces
- Authors: Nikita Yudin, Roland Hildebrand, Sergey Bakhurin, Alexander Degtyarev, Anna Lisachenko, Ilya Kuruzov, Andrei Semenov, Mohammad Alkousa
- Abstract summary: We show that arbitrary regularizations preserve the favorable local convergence properties of the Mixed Newton Method.
We compare several variants of the method applied to training neural networks with real and complex parameters.
- Score: 32.73124984242397
- License:
- Abstract: In this paper, we modify and apply the recently introduced Mixed Newton Method, which is originally designed for minimizing real-valued functions of complex variables, to the minimization of real-valued functions of real variables by extending the functions to complex space. We show that arbitrary regularizations preserve the favorable local convergence properties of the method, and construct a special type of regularization used to prevent convergence to complex minima. We compare several variants of the method applied to training neural networks with real and complex parameters.
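For orientation, the display below sketches the update underlying the Mixed Newton Method in Wirtinger calculus. It is a minimal sketch assuming the standard formulation of the original method; the generic shift $\lambda_t I$ only stands in for the kind of regularization discussed in the abstract, not the paper's specific construction for avoiding complex minima.

```latex
% Minimal sketch of the Mixed Newton step for a real-valued f(z,\bar z), z \in \mathbb{C}^n.
% The shift \lambda_t I is a generic regularization, not the paper's special construction.
\[
  \frac{\partial f}{\partial \bar z} \;=\; \tfrac{1}{2}\bigl(\nabla_x f + i\,\nabla_y f\bigr),
  \qquad
  \bigl(H_{\bar z z}\bigr)_{jl} \;=\; \frac{\partial^2 f}{\partial \bar z_j\,\partial z_l},
\]
\[
  z^{(t+1)} \;=\; z^{(t)} \;-\; \bigl(H_{\bar z z}(z^{(t)}) + \lambda_t I\bigr)^{-1}
  \frac{\partial f}{\partial \bar z}(z^{(t)}), \qquad \lambda_t \ge 0.
\]
```
With $\lambda_t = 0$ this reduces to the unregularized Mixed Newton step; for real-valued functions of real variables the paper applies the method after extending the function to complex arguments.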
Related papers
- Mapping-to-Parameter Nonlinear Functional Regression with Novel B-spline Free Knot Placement Algorithm [12.491024918270824]
We propose a novel approach to nonlinear functional regression.
The model is based on the mapping of function data from an infinite-dimensional function space to a finite-dimensional parameter space.
The performance of our knot placement algorithms is shown to be robust in both single-function approximation and multiple-function approximation contexts.
arXiv Detail & Related papers (2024-01-26T16:35:48Z)
- Minimizing Quasi-Self-Concordant Functions by Gradient Regularization of Newton Method [4.62316736194615]
We study the composite convex optimization problems with a Quasi-Self-Concordant smooth component.
This problem class naturally interpolates between classic Self-Concordant functions and functions with Lipschitz continuous Hessian.
We show that for minimizing Quasi-Self-Concordant functions we can use instead the basic Newton Method with Gradient Regularization.
arXiv Detail & Related papers (2023-08-28T17:43:04Z)
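As a rough illustration of what a Newton method with gradient regularization can look like for the entry above, here is a damped iteration whose shift is tied to the current gradient norm; the test function, the constant `c`, and the coupling `lam = c * norm(grad)` are assumptions for illustration, not the paper's exact rule or constants.

```python
import numpy as np

def gradient_regularized_newton(grad, hess, x0, c=1.0, tol=1e-8, max_iter=100):
    """Newton iteration damped by a multiple of the current gradient norm.

    Illustrative sketch only: lam = c * ||grad|| stands in for the paper's
    regularization rule, which may differ in form and constants.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        lam = c * np.linalg.norm(g)          # gradient-norm-proportional damping
        H = hess(x) + lam * np.eye(x.size)   # regularized Hessian
        x = x - np.linalg.solve(H, g)        # damped Newton step
    return x

# Toy usage on a smooth test function f(x) = log(1 + ||x||^2) + 0.5 * ||x||^2.
f_grad = lambda x: 2 * x / (1 + x @ x) + x
f_hess = lambda x: (2 * np.eye(x.size) / (1 + x @ x)
                    - 4 * np.outer(x, x) / (1 + x @ x) ** 2 + np.eye(x.size))
print(gradient_regularized_newton(f_grad, f_hess, np.ones(3)))  # converges to 0
```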
- Resource-Adaptive Newton's Method for Distributed Learning [16.588456212160928]
This paper introduces a novel and efficient algorithm called RANL, which overcomes the limitations of Newton's method.
Unlike traditional first-order methods, RANL exhibits remarkable independence from the condition number of the problem.
arXiv Detail & Related papers (2023-08-20T04:01:30Z)
- Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization [50.10952609321302]
Compositional minimax optimization is a pivotal challenge across various machine learning domains.
Current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes.
This paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(\kappa^3/\epsilon^3)$.
arXiv Detail & Related papers (2023-08-18T14:57:21Z)
- Rational kernel-based interpolation for complex-valued frequency response functions [0.0]
This work is concerned with the kernel-based approximation of a complex-valued function from data.
We introduce new reproducing kernel Hilbert spaces of complex-valued functions, and formulate the problem of complex-valued interpolation with a kernel pair as minimum norm interpolation in these spaces.
Numerical results on examples from different fields, including electromagnetics and acoustics, illustrate the performance of the method.
arXiv Detail & Related papers (2023-07-25T13:21:07Z)
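As generic context for the entry above, the snippet below performs plain minimum-norm kernel interpolation of complex-valued samples with a single real Gaussian kernel; the kernel choice, nodes, and toy frequency response are illustrative assumptions, and the paper's rational kernel-pair construction is not reproduced here.

```python
import numpy as np

def gaussian_kernel(x, y, length_scale=0.5):
    """Real Gaussian kernel matrix between two 1-D arrays of points."""
    return np.exp(-np.subtract.outer(x, y) ** 2 / (2 * length_scale ** 2))

# Complex-valued samples of a toy frequency response H(w) = 1 / (1 - w^2 + 0.1j*w).
freqs = np.linspace(0.0, 2.0, 12)
values = 1.0 / (1.0 - freqs**2 + 0.1j * freqs)

# Minimum-norm interpolant in the RKHS of the kernel: solve K c = y.
K = gaussian_kernel(freqs, freqs)
coef = np.linalg.solve(K + 1e-10 * np.eye(len(freqs)), values)  # tiny jitter for stability

# Evaluate the interpolant on a finer grid and check it against the true response.
grid = np.linspace(0.0, 2.0, 200)
approx = gaussian_kernel(grid, freqs) @ coef
exact = 1.0 / (1.0 - grid**2 + 0.1j * grid)
print(np.max(np.abs(approx[::20] - exact[::20])))
```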
- Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z)
- Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation [88.14365009076907]
Iterative refinement is a useful paradigm for representation learning.
We develop an implicit differentiation approach that improves the stability and tractability of training.
arXiv Detail & Related papers (2022-07-02T10:00:35Z)
- Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z)
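For reference, this is the basic Random Reshuffling loop that the entry above contrasts with with-replacement sampling: each epoch permutes the component gradients once and sweeps through them without replacement. The quadratic finite-sum objective and the step size are illustrative assumptions.

```python
import numpy as np

def random_reshuffling_sgd(grads, x0, lr=0.05, epochs=50, seed=0):
    """SGD with Random Reshuffling: one random permutation per epoch,
    component gradients visited without replacement (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(grads)):  # indices drawn without replacement
            x = x - lr * grads[i](x)
    return x

# Toy finite-sum objective f(x) = (1/n) * sum_i 0.5 * ||x - a_i||^2,
# whose minimizer is the mean of the anchors a_i.
anchors = [np.array([1.0, 0.0]), np.array([0.0, 3.0]), np.array([2.0, 1.0])]
component_grads = [lambda x, a=a: x - a for a in anchors]
print(random_reshuffling_sgd(component_grads, np.zeros(2)))
print(np.mean(anchors, axis=0))  # target minimizer for comparison
```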
- Tractable structured natural gradient descent using local parameterizations [43.51581051770027]
Natural-gradient descent on structured parameter spaces is computationally challenging due to complicated inverse Fisher-matrix computations.
We address this issue by using local-parameter coordinates.
We show results on a range of applications on deep learning, variational inference, and evolution strategies.
arXiv Detail & Related papers (2021-02-15T09:09:20Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
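As a concrete instance of the method class mentioned in the entry above, here is TD(0) for policy evaluation with linear value-function approximation on a small synthetic Markov reward process; the chain, features, and step size are illustrative assumptions.

```python
import numpy as np

def td0_linear(P, r, phi, gamma=0.9, lr=0.05, steps=20000, seed=0):
    """TD(0) policy evaluation with linear approximation V(s) = phi[s] @ theta."""
    rng = np.random.default_rng(seed)
    n_states, d = phi.shape
    theta = np.zeros(d)
    s = 0
    for _ in range(steps):
        s_next = rng.choice(n_states, p=P[s])                       # sample transition
        td_error = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
        theta += lr * td_error * phi[s]                             # semi-gradient update
        s = s_next
    return theta

# Tiny 3-state Markov reward process with random features (illustrative).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, -1.0])
phi = np.random.default_rng(1).normal(size=(3, 2))
print(td0_linear(P, r, phi))
```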
- Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to Euclidean spaces, but also to general $p$-normed finite-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
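For context on the entry above, the classical Euclidean extragradient step being generalized looks as follows; the bilinear saddle objective and step size are illustrative assumptions, not the paper's structured nonconvex-nonconcave setting.

```python
import numpy as np

def extragradient(F, z0, step=0.2, iters=500):
    """Classical extragradient iteration for an operator F, e.g. the field
    (grad_x f, -grad_y f) of a min-max problem (illustrative sketch)."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z_half = z - step * F(z)      # extrapolation (look-ahead) step
        z = z - step * F(z_half)      # update using the look-ahead operator value
    return z

# Toy bilinear saddle problem f(x, y) = x * y, whose saddle point is (0, 0).
F = lambda z: np.array([z[1], -z[0]])  # (df/dx, -df/dy)
print(extragradient(F, np.array([1.0, 1.0])))  # approaches the saddle point (0, 0)
```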