Analysis of Fourier Neural Operators via Effective Field Theory
- URL: http://arxiv.org/abs/2507.21833v1
- Date: Tue, 29 Jul 2025 14:10:46 GMT
- Title: Analysis of Fourier Neural Operators via Effective Field Theory
- Authors: Taeyoung Kim
- Abstract summary: We present the first systematic effective-field-theory analysis of FNOs in an infinite-dimensional function space. We show that nonlinear activations inevitably couple low-frequency inputs to high-frequency modes that are otherwise discarded by spectral truncation.
- Score: 1.7102697561186413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fourier Neural Operators (FNOs) have emerged as leading surrogates for high-dimensional partial differential equations, yet their stability, generalization and frequency behavior lack a principled explanation. We present the first systematic effective-field-theory analysis of FNOs in an infinite-dimensional function space, deriving closed recursion relations for the layer kernel and four-point vertex and then examining three practically important settings: analytic activations, scale-invariant cases, and architectures with residual connections. The theory shows that nonlinear activations inevitably couple low-frequency inputs to high-frequency modes that are otherwise discarded by spectral truncation, and experiments confirm this frequency transfer. For wide networks we obtain explicit criticality conditions on the weight-initialization ensemble that keep small input perturbations at a uniform scale across depth, and empirical tests validate these predictions. Taken together, our results quantify how nonlinearity enables neural operators to capture non-trivial features, supply criteria for hyper-parameter selection via criticality analysis, and explain why scale-invariant activations and residual connections enhance feature learning in FNOs.
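To make the frequency-transfer claim concrete, here is a minimal NumPy sketch (mine, not the authors' code; `spectral_layer`, the mode cutoff `k_max`, and the tanh activation are illustrative assumptions): a spectral layer truncates everything above `k_max`, yet a pointwise nonlinearity applied afterwards re-populates the discarded high-frequency band.

```python
import numpy as np

def spectral_layer(u, weights, k_max):
    """Hypothetical 1-D FNO-style layer: scale the lowest k_max Fourier
    modes by learned weights and zero out the rest (spectral truncation)."""
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    out_hat[:k_max] = weights * u_hat[:k_max]
    return np.fft.irfft(out_hat, n=len(u))

rng = np.random.default_rng(0)
n, k_max = 256, 8
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
u = np.sin(3 * x) + 0.5 * np.cos(5 * x)           # band-limited input (modes <= 5)
w = rng.normal(size=k_max) + 1j * rng.normal(size=k_max)

v = spectral_layer(u, w, k_max)                   # band-limited by construction
hi = lambda s: np.abs(np.fft.rfft(s)[k_max:]).sum()
print(f"high-freq energy, linear output: {hi(v):.2e}")           # ~ 0
print(f"high-freq energy, after tanh   : {hi(np.tanh(v)):.2e}")  # > 0: frequency transfer
```

The nonlinearity generates harmonics above the truncation cutoff, which is exactly the coupling mechanism the abstract describes.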
Related papers
- Contraction, Criticality, and Capacity: A Dynamical-Systems Perspective on Echo-State Networks [13.857230672081489]
We present a unified, dynamical-systems treatment that weaves together functional analysis, random attractor theory and recent neuroscientific findings. First, we prove that the Echo-State Property (wash-out of initial conditions) together with global Lipschitz dynamics necessarily yields the Fading-Memory Property. Second, employing a Stone-Weierstrass strategy we give a streamlined proof that ESNs with nonlinear reservoirs and linear read-outs are dense in the Banach space of causal, time-invariant fading-memory filters. Third, we quantify computational resources via the memory-capacity spectrum, show how ...
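As a side illustration of the Echo-State Property mentioned above, a minimal sketch (not from the paper; the reservoir size, spectral-radius rescaling, and input signal are my assumptions) showing two different initial states being washed out under a shared input:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
W = rng.normal(size=(n, n)) / np.sqrt(n)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1, a common ESP heuristic
W_in = rng.normal(size=n)

def step(x, u):
    # standard ESN reservoir update
    return np.tanh(W @ x + W_in * u)

x_a, x_b = rng.normal(size=n), rng.normal(size=n)  # two distinct initial states
for t in range(200):
    u = np.sin(0.1 * t)                            # same input sequence for both
    x_a, x_b = step(x_a, u), step(x_b, u)
print(np.linalg.norm(x_a - x_b))                   # ~ 0: initial condition forgotten
```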
arXiv Detail & Related papers (2025-07-24T14:41:18Z) - Neural Contraction Metrics with Formal Guarantees for Discrete-Time Nonlinear Dynamical Systems [17.905596843865705]
Contraction metrics provide a powerful framework for analyzing the stability, robustness, and convergence of various dynamical systems. However, identifying these metrics for complex nonlinear systems remains an open challenge due to the lack of effective tools. This paper develops verifiable contraction metrics for discrete-time nonlinear dynamical systems.
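For orientation, a standard discrete-time contraction condition (my paraphrase; the paper's exact formulation and metric parametrization may differ): with Jacobian $A(x) = \partial f/\partial x$ for the system $x_{k+1} = f(x_k)$,

```latex
\[
  A(x)^{\top} M\!\bigl(f(x)\bigr)\, A(x) \;\preceq\; \beta^{2} M(x),
  \qquad 0 < \beta < 1,\; M(x) \succ 0,
\]
```

so that distances measured in the metric $M$ shrink by at least a factor $\beta$ per step, giving geometric convergence of nearby trajectories.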
arXiv Detail & Related papers (2025-04-23T21:27:32Z) - The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity [0.7499722271664144]
We study how non-linear activation functions contribute to shaping neural networks' implicit bias. We show that local dynamical attractors facilitate the formation of clusters of hyperplanes where the input to a neuron's activation function is zero.
arXiv Detail & Related papers (2025-03-13T17:36:46Z) - Multi-Fidelity Prediction and Uncertainty Quantification with Laplace Neural Operators for Parametric Partial Differential Equations [6.03891813540831]
Laplace Neural Operators (LNOs) have emerged as a promising approach in scientific machine learning. We propose multi-fidelity Laplace Neural Operators (MF-LNOs), which combine a low-fidelity base model with parallel linear/nonlinear HF correctors and dynamic inter-fidelity weighting. This allows us to exploit correlations between LF and HF datasets and achieve accurate inference of quantities of interest.
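One plausible reading of that composition, sketched as hypothetical Python (the names `lf_model`, `linear_corr`, `nonlin_corr`, and the weighting `rho` are my assumptions, not the authors' API):

```python
def mf_predict(x, lf_model, linear_corr, nonlin_corr, rho):
    """Schematic multi-fidelity prediction: a low-fidelity base output is
    refined by parallel linear and nonlinear correctors, blended by a
    learned, input-dependent inter-fidelity weight rho(x) in [0, 1]."""
    y_lf = lf_model(x)                    # low-fidelity base prediction
    w = rho(x)                            # dynamic inter-fidelity weighting
    return w * linear_corr(y_lf) + (1.0 - w) * nonlin_corr(y_lf, x)
```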
arXiv Detail & Related papers (2025-02-01T20:38:50Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Learning Discretized Neural Networks under Ricci Flow [48.47315844022283]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations. DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
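For context, the empirical neural tangent kernel such analyses rely on is the standard object

```latex
\[
  \Theta(x, x') \;=\; \bigl\langle \nabla_{\theta} f(x;\theta),\;
                                   \nabla_{\theta} f(x';\theta) \bigr\rangle ,
\]
```

whose eigendecomposition exposes spectral bias: target components aligned with large eigenvalues are learned fastest, so high-frequency features (small eigenvalues) converge slowly.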
arXiv Detail & Related papers (2022-06-29T19:03:10Z) - Phenomenology of Double Descent in Finite-Width Neural Networks [29.119232922018732]
Double descent delineates the behaviour of models depending on the regime they belong to.
We use influence functions to derive suitable expressions of the population loss and its lower bound.
Building on our analysis, we investigate how the loss function affects double descent.
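For reference, the classical influence-function approximation (the standard form due to Koh and Liang; not necessarily the exact expression used in this paper) for the effect of upweighting a training point $z$ on the loss at a test point $z_{\text{test}}$:

```latex
\[
  \mathcal{I}(z, z_{\text{test}})
  = -\,\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top}
     H_{\hat{\theta}}^{-1}\,
     \nabla_{\theta} L(z, \hat{\theta}),
  \qquad
  H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta}).
\]
```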
arXiv Detail & Related papers (2022-03-14T17:39:49Z) - Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
A convergence rate analysis of the mean field Langevin dynamics is presented.
The proximal Gibbs distribution $p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
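For context, the mean-field Langevin dynamics and its proximal Gibbs distribution in a standard formulation (notation may differ from the paper): for an entropy-regularized objective $F(q) + \lambda\,\mathrm{Ent}(q)$,

```latex
\[
  dX_t = -\nabla \frac{\delta F(q_t)}{\delta q}(X_t)\, dt + \sqrt{2\lambda}\, dB_t,
  \qquad
  p_{q}(x) \;\propto\; \exp\!\Bigl(-\frac{1}{\lambda}\,
           \frac{\delta F(q)}{\delta q}(x)\Bigr),
\]
```

where $q_t = \mathrm{Law}(X_t)$; convexity of $F$ then yields convergence guarantees parallel to classical convex optimization.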
arXiv Detail & Related papers (2022-01-25T17:13:56Z) - On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z) - Convolutional Filtering and Neural Networks with Non Commutative Algebras [153.20329791008095]
We study the generalization of non-commutative convolutional neural networks.
We show that non-commutative convolutional architectures can be stable to deformations on the space of operators.
arXiv Detail & Related papers (2021-08-23T04:22:58Z) - Designing Kerr Interactions for Quantum Information Processing via Counterrotating Terms of Asymmetric Josephson-Junction Loops [68.8204255655161]
Static cavity nonlinearities typically limit the performance of bosonic quantum error-correcting codes.
Treating the nonlinearity as a perturbation, we derive effective Hamiltonians using the Schrieffer-Wolff transformation.
Results show that a cubic interaction makes it possible to increase the effective rates of both linear and nonlinear operations.
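For orientation, the generic second-order Schrieffer-Wolff form (included for context; the paper's generator is specific to the Josephson-junction loop): for $H = H_0 + V$ with $V$ block-off-diagonal and an anti-Hermitian generator $S$ solving $[S, H_0] = -V$,

```latex
\[
  H_{\mathrm{eff}} = e^{S} H e^{-S}
  \;\approx\; H_0 + \tfrac{1}{2}\,[S, V] + \mathcal{O}(V^{3}),
\]
```

whose block-diagonal part contains the induced effective (e.g. Kerr-type) interactions.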
arXiv Detail & Related papers (2021-07-14T15:11:05Z)