Momentum-based minimization of the Ginzburg-Landau functional on Euclidean spaces and graphs
- URL: http://arxiv.org/abs/2501.00389v1
- Date: Tue, 31 Dec 2024 11:05:49 GMT
- Title: Momentum-based minimization of the Ginzburg-Landau functional on Euclidean spaces and graphs
- Authors: Oluwatosin Akande, Patrick Dondl, Kanan Gupta, Akwum Onwunta, Stephan Wojtowytsch
- Abstract summary: We study the momentum-based minimization of a diffuse perimeter functional on Euclidean spaces and on graphs with applications to semi-supervised classification tasks in machine learning. We demonstrate empirically that momentum can lead to faster convergence if the time step size is large but not too large.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the momentum-based minimization of a diffuse perimeter functional on Euclidean spaces and on graphs with applications to semi-supervised classification tasks in machine learning. While the gradient flow in the task at hand is a parabolic partial differential equation, the momentum method corresponds to a damped hyperbolic PDE, leading to qualitatively and quantitatively different trajectories. Using a convex-concave splitting-based FISTA-type time discretization, we demonstrate empirically that momentum can lead to faster convergence if the time step size is large but not too large. With large time steps, the PDE analysis offers only limited insight into the geometric behavior of solutions, and typical hyperbolic phenomena like loss of regularity are not observed in sample simulations.
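
The abstract's comparison of gradient flow and momentum can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a 1D periodic grid, the standard double-well potential W(u) = (u^2 - 1)^2 / 4, a linearly stabilized convex-concave splitting with an assumed constant c = 2, and a fixed FISTA-type extrapolation weight beta. The names gl_energy and minimize_gl are hypothetical.

```python
import numpy as np

def gl_energy(u, eps, h):
    """Discrete Ginzburg-Landau energy on a periodic 1D grid (assumed form)."""
    du = (np.roll(u, -1) - u) / h                      # forward difference
    return h * np.sum(0.5 * eps * du**2 + 0.25 * (u**2 - 1.0)**2 / eps)

def minimize_gl(u0, eps, h, tau, beta, steps, c=2.0):
    """Convex-concave splitting step with optional momentum extrapolation.

    beta = 0 gives a plain semi-implicit gradient descent step;
    beta > 0 extrapolates before each step (FISTA-type, fixed weight).
    """
    n = u0.size
    # Fourier symbol of the negative periodic finite-difference Laplacian.
    lap_symbol = (2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)) / h**2
    # Implicit (convex) part: eps/2 |u'|^2 + c/(2 eps) u^2.
    denom = 1.0 + tau * (c / eps + eps * lap_symbol)
    u_prev, u = u0.copy(), u0.copy()
    energies = [gl_energy(u, eps, h)]
    for _ in range(steps):
        v = u + beta * (u - u_prev)                    # momentum extrapolation
        # Explicit (concave) part: W(v)/eps - c/(2 eps) v^2, with W'(v) = v^3 - v.
        rhs = v - tau * ((v**3 - v) - c * v) / eps
        u_prev, u = u, np.real(np.fft.ifft(np.fft.fft(rhs) / denom))
        energies.append(gl_energy(u, eps, h))
    return u, np.array(energies)

# Illustrative parameters (assumptions, not taken from the paper).
n, eps, tau = 128, 0.05, 0.01
h = 1.0 / n
x = np.arange(n) * h
u0 = 0.1 * np.sin(2.0 * np.pi * x)
u_gd, e_gd = minimize_gl(u0, eps, h, tau, beta=0.0, steps=500)
u_mom, e_mom = minimize_gl(u0, eps, h, tau, beta=0.5, steps=500)
```

Comparing e_gd and e_mom for several values of tau is one way to explore the step-size dependence the abstract describes; this sketch makes no claim about which variant converges faster.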
 
      
Related papers
- MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation [48.41289705783405]
 We propose a PDE-embedded network with multiscale time stepping (MultiPDENet).
In particular, we design a convolutional filter based on the structure of finite difference with a small number of parameters to optimize.
A Physics Block with a 4th-order Runge-Kutta integrator at the fine time scale is established that embeds the structure of PDEs to guide the prediction.
 arXiv  Detail & Related papers  (2025-01-27T12:15:51Z)
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
 This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
 arXiv  Detail & Related papers  (2024-04-18T16:46:08Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
 The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
 arXiv  Detail & Related papers  (2023-06-12T16:28:11Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
 In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
 arXiv  Detail & Related papers  (2023-05-19T16:24:47Z)
- High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance [16.652085114513273]
 We derive high-dimensional scaling limits and fluctuations for the online least-squares Stochastic Gradient Descent (SGD) algorithm.
Our results have several applications, including characterization of the limiting mean-square estimation or prediction errors and their fluctuations.
 arXiv  Detail & Related papers  (2023-04-03T03:50:00Z)
- Self-Consistent Velocity Matching of Probability Flows [22.2542921090435]
 We present a discretization-free scalable framework for solving a class of partial differential equations (PDEs).
The main observation is that the time-varying velocity field of the PDE solution needs to be self-consistent.
We use an iterative formulation with a biased gradient estimator that bypasses significant computational obstacles with strong empirical performance.
 arXiv  Detail & Related papers  (2023-01-31T16:17:18Z)
- A Self-supervised Riemannian GNN with Time Varying Curvature for Temporal Graph Learning [79.20249985327007]
 We present a novel self-supervised Riemannian graph neural network (SelfRGNN).
Specifically, we design a curvature-varying GNN with a theoretically grounded time encoding, and formulate a functional curvature over time to model the evolvement shifting among the positive, zero and negative curvature spaces.
Extensive experiments show the superiority of SelfRGNN, and moreover, the case study shows the time-varying curvature of temporal graph in reality.
 arXiv  Detail & Related papers  (2022-08-30T08:43:06Z)
- Semi-supervised Learning of Partial Differential Operators and Dynamical Flows [68.77595310155365]
 We present a novel method that combines a hyper-network solver with a Fourier Neural Operator architecture.
We test our method on various time evolution PDEs, including nonlinear fluid flows in one, two, and three spatial dimensions.
The results show that the new method improves the learning accuracy at the supervised time points and is able to interpolate the solutions to any intermediate time.
 arXiv  Detail & Related papers  (2022-07-28T19:59:14Z)
- High-dimensional limit theorems for SGD: Effective dynamics and critical scaling [6.950316788263433]
 We prove limit theorems for the trajectories of summary statistics of stochastic gradient descent (SGD).
We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss.
About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate.
 arXiv  Detail & Related papers  (2022-06-08T17:42:18Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
 Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
 arXiv  Detail & Related papers  (2021-12-29T18:46:52Z)
- On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective [0.0]
 We study the statistical properties of the dynamic trajectory of stochastic gradient descent (SGD).
We exploit the continuous formulation of SDE and the theory of Fokker-Planck equations to develop new results on escaping phenomenon and relationship with large batch and sharp minima.
 arXiv  Detail & Related papers  (2021-12-02T05:24:05Z)
- Model reduction for the material point method via learning the deformation map and its spatial-temporal gradients [9.509644638212773]
 The technique approximates the kinematics by approximating the deformation map in a manner that restricts deformation trajectories to reside on a low-dimensional manifold.
The ability to generate material points also allows for adaptive quadrature rules for stress update.
 arXiv  Detail & Related papers  (2021-09-25T15:45:14Z)
- The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion [29.489737359897312]
 We study the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
We show that the key ingredient driving these dynamics is not the original training loss, but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents, which cause oscillations in phase space.
 arXiv  Detail & Related papers  (2021-07-19T20:18:57Z)
- Solving PDEs on Unknown Manifolds with Machine Learning [8.220217498103315]
 This paper presents a mesh-free computational framework and machine learning theory for solving elliptic PDEs on unknown manifolds.
We show that the proposed NN solver can robustly generalize the PDE on new data points with errors that are almost identical to generalizations on new data points.
 arXiv  Detail & Related papers  (2021-06-12T03:55:15Z)
- DiffPD: Differentiable Projective Dynamics with Contact [65.88720481593118]
 We present DiffPD, an efficient differentiable soft-body simulator with implicit time integration.
We evaluate the performance of DiffPD and observe a speedup of 4-19 times compared to the standard Newton's method in various applications.
 arXiv  Detail & Related papers  (2021-01-15T00:13:33Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
 We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
 arXiv  Detail & Related papers  (2019-10-31T02:26:20Z)
- Fast approximations in the homogeneous Ising model for use in scene analysis [61.0951285821105]
 We provide accurate approximations that make it possible to numerically calculate quantities needed in inference.
We show that our approximation formulae are scalable and unfazed by the size of the Markov Random Field.
The practical import of our approximation formulae is illustrated in performing Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment, and also in likelihood ratio testing for anisotropy in the spatial patterns of yearly increases in pistachio tree yields.
 arXiv  Detail & Related papers  (2017-12-06T14:24:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     