Dual Riemannian Newton Method on Statistical Manifolds
- URL: http://arxiv.org/abs/2511.11318v1
- Date: Fri, 14 Nov 2025 13:58:34 GMT
- Title: Dual Riemannian Newton Method on Statistical Manifolds
- Authors: Derun Zhou, Keisuke Yano, Mahito Sugiyama
- Abstract summary: We propose a Newton-type optimization algorithm on a manifold endowed with a metric and a pair of dual affine connections. We establish local quadratic convergence and validate the theory with experiments on representative statistical models.
- Score: 9.966217183746961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In probabilistic modeling, parameter estimation is commonly formulated as a minimization problem on a parameter manifold. Optimization in such spaces requires geometry-aware methods that respect the underlying information structure. While the natural gradient leverages the Fisher information metric as a form of Riemannian gradient descent, it remains a first-order method and often exhibits slow convergence near optimal solutions. Existing second-order manifold algorithms typically rely on the Levi-Civita connection, thus overlooking the dual-connection structure that is central to information geometry. We propose the dual Riemannian Newton method, a Newton-type optimization algorithm on manifolds endowed with a metric and a pair of dual affine connections. The dual Riemannian Newton method explicates how duality shapes second-order updates: when the retraction (a local surrogate of the exponential map) is defined by one connection, the associated Newton equation is posed with its dual. We establish local quadratic convergence and validate the theory with experiments on representative statistical models. The dual Riemannian Newton method thus delivers second-order efficiency while remaining compatible with the dual structures that underlie modern information-geometric learning and inference.
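A short worked identity may help make the duality in the abstract concrete. The following is only a sketch in standard information-geometry notation, not the paper's own formulation: $g$ is the metric, $\nabla$ and $\nabla^*$ are the pair of dual connections, and $f$ is the objective. The symbols $\eta$ (Newton direction) and $R^{\nabla}$ (a retraction built from $\nabla$), as well as the exact form of the last equation, are assumptions introduced here for illustration.

```latex
% Duality of \nabla and \nabla^* with respect to the metric g:
\[
  X\, g(Y, Z) \;=\; g(\nabla_X Y,\, Z) \;+\; g(Y,\, \nabla^*_X Z).
\]

% Setting Z = \operatorname{grad} f and using g(Y, \operatorname{grad} f) = Y f gives
\[
  g(\nabla^*_X \operatorname{grad} f,\, Y) \;=\; X(Y f) \;-\; (\nabla_X Y) f .
\]

% Schematic Newton step (assumed form): solve with one connection, retract with its dual.
\[
  \nabla^*_{\eta} \operatorname{grad} f(x) \;=\; -\operatorname{grad} f(x),
  \qquad
  x^{+} \;=\; R^{\nabla}_{x}(\eta).
\]
```

The second identity says that differentiating the gradient with $\nabla^*$ reproduces the Hessian bilinear form built from $\nabla$, which is one way to see why the retraction and the Newton equation end up paired with opposite connections.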
Related papers
- Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods [50.070182958880146]
We propose a unified framework generalizing steepest descent, quasi-Newton methods, and adaptive methods through the novel notion of preconditioned matrix norms. Within this framework, we provide the first systematic treatment of affine and scale invariance in the matrix-parameterized setting. We introduce two new methods, $\texttt{MuAdam}$ and $\texttt{MuAdam-SANIA}$, which combine the spectral geometry of Muon with Adam-style preconditioning.
arXiv Detail & Related papers (2025-10-12T19:39:41Z) - Riemannian Consistency Model [57.933800575074535]
We propose the Riemannian Consistency Model (RCM), which, for the first time, enables few-step consistency modeling. We derive the closed-form solutions for both discrete- and continuous-time training objectives for RCM. We provide a unique kinematics perspective for interpreting the RCM objective, offering new theoretical angles.
arXiv Detail & Related papers (2025-10-01T14:57:25Z) - Machine learning and optimization-based approaches to duality in statistical physics [2.3727769223905515]
Duality is the idea that a given physical system can have two different mathematical descriptions.
We numerically solve the problem and show that our framework can rediscover the celebrated Kramers-Wannier duality for the 2d Ising model.
We also discuss an alternative approach which uses known features of the mapping of topological lines to reduce the problem to optimizing the couplings in a dual Hamiltonian.
arXiv Detail & Related papers (2024-11-07T16:29:03Z) - Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence [20.766358513158206]
We prove a global convergence rate of $O(\min\{1/k, \sqrt{d}/k^{1.25}\})$ in terms of the duality gap.
These results are the first global convergence results to demonstrate a provable advantage of a quasi-Newton method over the extragradient method.
arXiv Detail & Related papers (2024-10-03T16:08:16Z) - Symplectic Stiefel manifold: tractable metrics, second-order geometry and Newton's methods [1.190653833745802]
We develop explicit second-order geometry and Newton's methods on the symplectic Stiefel manifold.
We then solve the resulting Newton equation, as the central step of Newton's methods.
Various numerical experiments are presented to validate the proposed methods.
arXiv Detail & Related papers (2024-06-20T13:26:06Z) - FORML: A Riemannian Hessian-free Method for Meta-learning on Stiefel Manifolds [4.757859522106933]
This paper introduces a Hessian-free approach that uses a first-order approximation of derivatives on the Stiefel manifold.
Our method significantly reduces the computational load and memory footprint.
arXiv Detail & Related papers (2024-02-28T10:57:30Z) - Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
arXiv Detail & Related papers (2023-08-21T08:02:16Z) - Decentralized Riemannian natural gradient methods with Kronecker-product approximations [11.263837420265594]
We present an efficient decentralized natural gradient descent (DRNGD) method for solving decentralized manifold optimization problems.
By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained at a low cost.
arXiv Detail & Related papers (2023-03-16T19:36:31Z) - Explicit Second-Order Min-Max Optimization: Practical Algorithms and Complexity Analysis [71.05708939639537]
We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained problems. Our method improves the existing line-search-based min-max optimization by shaving off an $O(\log\log(1/\epsilon))$ factor in the required number of Schur decompositions.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Bayesian Quadrature on Riemannian Data Manifolds [79.71142807798284]
A principled way to model nonlinear geometric structure inherent in data is provided.
However, these operations are typically computationally demanding.
In particular, we focus on Bayesian quadrature (BQ) to numerically compute integrals over normal laws.
We show that by leveraging both prior knowledge and an active exploration scheme, BQ significantly reduces the number of required evaluations.
arXiv Detail & Related papers (2021-02-12T17:38:04Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods [26.712594117460817]
We consider second-order computation methods for minimizing a finite summation of non-linear functions.
Since the true Hessian matrix is often a combination of a cheap part and an expensive part, we propose a structured quasi-Newton convergence method.
Our proposed method is quite competitive with the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T02:16:51Z)