Momentum-based minimization of the Ginzburg-Landau functional on Euclidean spaces and graphs
- URL: http://arxiv.org/abs/2501.00389v1
- Date: Tue, 31 Dec 2024 11:05:49 GMT
- Title: Momentum-based minimization of the Ginzburg-Landau functional on Euclidean spaces and graphs
- Authors: Oluwatosin Akande, Patrick Dondl, Kanan Gupta, Akwum Onwunta, Stephan Wojtowytsch
- Abstract summary: We study the momentum-based minimization of a diffuse perimeter functional on Euclidean spaces and on graphs with applications to semi-supervised classification tasks in machine learning.
We demonstrate empirically that momentum can lead to faster convergence if the time step size is large but not too large.
- Abstract: We study the momentum-based minimization of a diffuse perimeter functional on Euclidean spaces and on graphs with applications to semi-supervised classification tasks in machine learning. While the gradient flow in the task at hand is a parabolic partial differential equation, the momentum method corresponds to a damped hyperbolic PDE, leading to qualitatively and quantitatively different trajectories. Using a convex-concave splitting-based FISTA-type time discretization, we demonstrate empirically that momentum can lead to faster convergence if the time step size is large but not too large. With large time steps, the PDE analysis offers only limited insight into the geometric behavior of solutions, and typical hyperbolic phenomena like loss of regularity are not observed in sample simulations.
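To make the phrase "convex-concave splitting-based FISTA-type time discretization" concrete, the following is a minimal sketch and not the authors' scheme. It assumes a 1D periodic grid, the double-well potential W(u) = (u^2 - 1)^2 / 4, a linear stabilization constant `c = 2` for the splitting, and a constant extrapolation weight `beta`; the function names `gl_energy` and `minimize_gl` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def gl_energy(u, eps, h):
    """Discrete Ginzburg-Landau energy on a periodic 1D grid with spacing h."""
    du = (np.roll(u, -1) - u) / h               # periodic forward difference
    return h * np.sum(0.5 * eps * du**2 + (u**2 - 1.0)**2 / (4.0 * eps))

def minimize_gl(u0, eps, h, tau, beta, steps, c=2.0):
    """Stabilized semi-implicit steps with constant extrapolation weight beta.

    beta = 0 gives the plain splitting scheme; beta > 0 adds momentum.
    The splitting constant c is an assumption of this sketch.
    """
    n = u0.size
    # Eigenvalues of the periodic second-difference Laplacian (all <= 0).
    lap = -(4.0 / h**2) * np.sin(np.pi * np.arange(n) / n)**2
    denom = 1.0 + tau * (c / eps - eps * lap)
    u_prev, u = u0.copy(), u0.copy()
    for _ in range(steps):
        v = u + beta * (u - u_prev)             # FISTA-style extrapolation
        # Explicit part: stabilizer c*v minus W'(v) = v**3 - v.
        rhs = v + tau * (c * v - (v**3 - v)) / eps
        # Implicit part: solve (1 + tau*c/eps - tau*eps*Laplacian) u_new = rhs.
        u_new = np.real(np.fft.ifft(np.fft.fft(rhs) / denom))
        u_prev, u = u, u_new
    return u

# Illustrative parameters (assumptions, not taken from the paper).
n, eps, tau = 128, 0.05, 0.05
h = 1.0 / n
x = h * np.arange(n)
u0 = 0.5 * np.sin(2.0 * np.pi * x)
u_gd = minimize_gl(u0, eps, h, tau, beta=0.0, steps=200)
u_mom = minimize_gl(u0, eps, h, tau, beta=0.5, steps=200)
print(gl_energy(u0, eps, h), gl_energy(u_gd, eps, h), gl_energy(u_mom, eps, h))
```

Running both variants with the same `tau` and comparing the printed energies gives a rough feel for how the extrapolation step changes the trajectory; the sketch makes no claim about which variant converges faster in any given regime.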
Related papers
- MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation [48.41289705783405]
We propose a PDE-embedded network with multiscale time stepping (MultiPDENet).
In particular, we design a convolutional filter based on the structure of finite difference with a small number of parameters to optimize.
A Physics Block with a 4th-order Runge-Kutta integrator at the fine time scale is established that embeds the structure of PDEs to guide the prediction.
arXiv Detail & Related papers (2025-01-27T12:15:51Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
- Self-Consistent Velocity Matching of Probability Flows [22.2542921090435]
We present a discretization-free scalable framework for solving a class of partial differential equations (PDEs).
The main observation is that the time-varying velocity field of the PDE solution needs to be self-consistent.
We use an iterative formulation with a biased gradient estimator that bypasses significant computational obstacles with strong empirical performance.
arXiv Detail & Related papers (2023-01-31T16:17:18Z)
- Semi-supervised Learning of Partial Differential Operators and Dynamical Flows [68.77595310155365]
We present a novel method that combines a hyper-network solver with a Fourier Neural Operator architecture.
We test our method on various time evolution PDEs, including nonlinear fluid flows in one, two, and three spatial dimensions.
The results show that the new method improves the learning accuracy at the time point of supervision and is able to interpolate the solutions to any intermediate time.
arXiv Detail & Related papers (2022-07-28T19:59:14Z)
- On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective [0.0]
We study the statistical properties of the dynamic trajectory of stochastic gradient descent (SGD).
We exploit the continuous formulation of SDE and the theory of Fokker-Planck equations to develop new results on escaping phenomenon and relationship with large batch and sharp minima.
arXiv Detail & Related papers (2021-12-02T05:24:05Z)
- The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion [29.489737359897312]
We study the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
We show that the key ingredient driving these dynamics is not the original training loss, but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents, which cause oscillations in phase space.
arXiv Detail & Related papers (2021-07-19T20:18:57Z)
- Solving PDEs on Unknown Manifolds with Machine Learning [8.220217498103315]
This paper presents a mesh-free computational framework and machine learning theory for solving elliptic PDEs on unknown manifolds.
We show that the proposed NN solver can robustly generalize the PDE solution on new data points with generalization errors that are almost identical to the training errors.
arXiv Detail & Related papers (2021-06-12T03:55:15Z)
- DiffPD: Differentiable Projective Dynamics with Contact [65.88720481593118]
We present DiffPD, an efficient differentiable soft-body simulator with implicit time integration.
We evaluate the performance of DiffPD and observe a speedup of 4-19 times compared to the standard Newton's method in various applications.
arXiv Detail & Related papers (2021-01-15T00:13:33Z)
- Fast approximations in the homogeneous Ising model for use in scene analysis [61.0951285821105]
We provide accurate approximations that make it possible to numerically calculate quantities needed in inference.
We show that our approximation formulae are scalable and unfazed by the size of the Markov Random Field.
The practical import of our approximation formulae is illustrated in performing Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment, and also in likelihood ratio testing for anisotropy in the spatial patterns of yearly increases in pistachio tree yields.
arXiv Detail & Related papers (2017-12-06T14:24:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.