Related papers: Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows

URL: http://arxiv.org/abs/2405.12888v1
Date: Tue, 21 May 2024 15:59:55 GMT
Title: Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
Authors: Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré
Abstract summary: We show that conservation laws for momentum-based dynamics exhibit temporal dependence. We also observe a "conservation loss" when transitioning from gradient flow to momentum dynamics. This phenomenon also manifests in non-Euclidean metrics, used e.g. for Nonnegative Matrix Factorization (NMF).
Score: 28.287184613608435
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conservation laws are well-established in the context of Euclidean gradient flow dynamics, notably for linear or ReLU neural network training. Yet, their existence and principles for non-Euclidean geometries and momentum-based dynamics remain largely unknown. In this paper, we characterize "all" conservation laws in this general setting. In stark contrast to the case of gradient flows, we prove that the conservation laws for momentum-based dynamics exhibit temporal dependence. Additionally, we often observe a "conservation loss" when transitioning from gradient flow to momentum dynamics. Specifically, for linear networks, our framework allows us to identify all momentum conservation laws, which are less numerous than in the gradient flow case except in sufficiently over-parameterized regimes. With ReLU networks, no conservation law remains. This phenomenon also manifests in non-Euclidean metrics, used e.g. for Nonnegative Matrix Factorization (NMF): all conservation laws can be determined in the gradient flow context, yet none persists in the momentum case.
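As a minimal illustration (not taken from the paper), the simplest Euclidean gradient flow conservation law for linear networks can be checked on a depth-two scalar network f = a * b with squared loss, where a^2 - b^2 stays constant along the flow. The sketch below approximates the flow with small-step gradient descent; the function names, initial values, and step size are illustrative assumptions, and the momentum case discussed in the abstract is not modeled here.

```python
def balancedness(a, b):
    """Quantity conserved by gradient flow on the loss 0.5 * (a * b - y) ** 2."""
    return a * a - b * b


def gradient_descent(a, b, y, lr=1e-3, steps=5000):
    """Small-step gradient descent, used as a discrete stand-in for gradient flow."""
    for _ in range(steps):
        r = a * b - y  # residual of the scalar two-layer linear network
        # dL/da = b * r and dL/db = a * r; update both parameters simultaneously.
        a, b = a - lr * b * r, b - lr * a * r
    return a, b


a0, b0, y = 2.0, 0.5, 3.0  # hypothetical initialization and target
a1, b1 = gradient_descent(a0, b0, y)
drift = abs(balancedness(a1, b1) - balancedness(a0, b0))
print("final product a*b:", a1 * b1)
print("drift of a^2 - b^2:", drift)
```

For this update rule each step multiplies a^2 - b^2 by (1 - lr^2 * r^2), so the drift is second order in the step size and vanishes in the continuous-time limit.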

Related papers

Transformative or Conservative? Conservation laws for ResNets and Transformers [28.287184613608435]
This paper bridges the gap by deriving and analyzing conservation laws for modern architectures. We first show that basic building blocks such as ReLU (or linear) shallow networks, with or without convolution, have easily expressed conservation laws. We then introduce the notion of conservation laws that depend only on a subset of parameters.
arXiv Detail & Related papers (2025-06-06T15:53:35Z)
Generative System Dynamics in Recurrent Neural Networks [56.958984970518564]
We investigate the continuous-time dynamics of Recurrent Neural Networks (RNNs). We show that skew-symmetric weight matrices are fundamental to enable stable limit cycles in both linear and nonlinear configurations. Numerical simulations showcase how nonlinear activation functions not only maintain limit cycles, but also enhance the numerical stability of the system integration process.
arXiv Detail & Related papers (2025-04-16T10:39:43Z)
Dynamical freezing in the thermodynamic limit: the strongly driven ensemble [37.31317754926534]
A periodically driven (Floquet) system in the absence of any conservation law heats to a featureless 'infinite temperature' state. Here, we find, for a clean and interacting generic spin chain, that this can be prevented by the emergence of approximate but stable conservation laws not present in the undriven system. We show numerically, in the thermodynamic limit, that when required by these emergent conservation laws, the entanglement-entropy density of an infinite subsystem remains zero.
arXiv Detail & Related papers (2024-10-14T19:57:43Z)
On the Convergence of Gradient Descent for Large Learning Rates [55.33626480243135]
We show that convergence is impossible when a fixed step size is used. We provide a proof of this in the case of linear neural networks with a squared loss. We also prove the impossibility of convergence for more general losses without requiring strong assumptions such as Lipschitz continuity for the gradient.
arXiv Detail & Related papers (2024-02-20T16:01:42Z)
Tipping Point Forecasting in Non-Stationary Dynamics on Function Spaces [78.08947381962658]
Tipping points are abrupt, drastic, and often irreversible changes in the evolution of non-stationary dynamical systems. We learn the evolution of such non-stationary systems using a novel recurrent neural operator (RNO), which learns mappings between function spaces. We propose a conformal prediction framework to forecast tipping points by monitoring deviations from physics constraints.
arXiv Detail & Related papers (2023-08-17T05:42:27Z)
Reaction-diffusive dynamics of number-conserving dissipative quantum state preparation [0.0]
We show the emergence of a diffusive regime for the particle and hole density modes at intermediate length- and time-scales. We also identify processes that limit the diffusive behavior of this mode at the longest length- and time-scales. Strikingly, we find that these processes lead to a reaction-diffusion dynamics governed by the Fisher-Kolmogorov-Petrovsky-Piskunov equation.
arXiv Detail & Related papers (2023-01-12T19:11:04Z)
Limitations of Quantum Measurements and Operations of Scattering Type under the Energy Conservation Law [0.0]
We show that the achievable accuracy of measurements and unitary operations is generally limited by conservation laws. We present a lower bound for the error of a quantum measurement using a scattering process satisfying the energy conservation law. We also show the quantitative relationship between the upper bound of the gate fidelity of a controlled unitary gate and the energy fluctuation of systems.
arXiv Detail & Related papers (2022-11-24T06:28:57Z)
Support Vectors and Gradient Dynamics for Implicit Bias in ReLU Networks [45.886537625951256]
We study gradient flow dynamics in the parameter space when training single-neuron ReLU networks. Specifically, we discover implicit bias in terms of support vectors in ReLU networks, which play a key role in why and how ReLU networks generalize well.
arXiv Detail & Related papers (2022-02-11T08:55:58Z)
On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes. We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
Statistical Mechanics of Floquet Quantum Matter: Exact and Emergent Conservation Laws [0.0]
More recently, it has been shown that the statistical mechanics has a much richer structure due to the existence of emergent conservation laws. This review intends to give a theoretical overview of these developments.
arXiv Detail & Related papers (2021-05-21T16:50:49Z)
Value Iteration in Continuous Actions, States and Time [99.00362538261972]
We propose a continuous fitted value iteration (cFVI) algorithm for continuous states and actions. The optimal policy can be derived for non-linear control-affine dynamics. Videos of the physical system are available at https://sites.google.com/view/value-iteration.
arXiv Detail & Related papers (2021-05-10T21:40:56Z)
On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically. We show that a generalization of symplectic integrators to nonconservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
arXiv Detail & Related papers (2020-04-15T00:36:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.