Understanding the Spectral Bias of Coordinate Based MLPs Via Training
Dynamics
- URL: http://arxiv.org/abs/2301.05816v4
- Date: Thu, 4 May 2023 01:46:24 GMT
- Title: Understanding the Spectral Bias of Coordinate Based MLPs Via Training
Dynamics
- Authors: John Lazzari, Xiuwen Liu
- Abstract summary: We study the connection between the computations of ReLU networks and the speed of gradient descent convergence.
We then use this formulation to study the severity of spectral bias in low dimensional settings, and how positional encoding overcomes this.
- Score: 2.9443230571766854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spectral bias is an important observation of neural network training, stating
that the network will learn a low frequency representation of the target
function before converging to higher frequency components. This property is
interesting due to its link to good generalization in over-parameterized
networks. However, in low dimensional settings, a severe spectral bias occurs
that obstructs convergence to high frequency components entirely. In order to
overcome this limitation, one can encode the inputs using a high frequency
sinusoidal encoding. Previous works attempted to explain this phenomenon using the
Neural Tangent Kernel (NTK) and Fourier analysis. However, the NTK does not capture
real network dynamics, and Fourier analysis only offers a global perspective on
the network properties that induce this bias. In this paper, we provide a novel
approach towards understanding spectral bias by directly studying ReLU MLP
training dynamics. Specifically, we focus on the connection between the
computations of ReLU networks (activation regions), and the speed of gradient
descent convergence. We study these dynamics in relation to the spatial
information of the signal to understand how they influence spectral bias. We
then use this formulation to study the severity of spectral bias in low
dimensional settings, and how positional encoding overcomes this.
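To make the sinusoidal encoding mentioned above concrete, here is a minimal sketch of a Fourier-feature-style positional encoding for low-dimensional coordinates. The power-of-two frequency schedule, the feature dimensions, and the high-frequency target in the example are illustrative assumptions, not the exact configuration studied in the paper.

```python
import numpy as np

def positional_encoding(x, num_frequencies=8):
    """Map coordinates in [0, 1]^d to sine/cosine features.

    Illustrative NeRF-style encoding: for each input dimension and each
    frequency 2^k * pi, emit one sine and one cosine feature. The frequency
    schedule is an assumption, not necessarily the paper's choice.
    """
    x = np.atleast_2d(x)                                  # (n, d)
    freqs = (2.0 ** np.arange(num_frequencies)) * np.pi   # (F,)
    angles = x[..., None] * freqs                         # (n, d, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(x.shape[0], -1)                  # (n, 2 * d * F)

# Example: encode 1D coordinates before fitting a high-frequency signal.
coords = np.linspace(0.0, 1.0, 256)[:, None]         # (256, 1)
target = np.sin(2 * np.pi * 32 * coords[:, 0])       # high-frequency target
encoded = positional_encoding(coords)                 # (256, 16)
```

Feeding `encoded` rather than raw `coords` into a ReLU MLP is the standard recipe; in the abstract's terms, the encoding changes how activation regions are distributed over the input domain and thereby how quickly high-frequency components can be fit.

The abstract's central object, the activation regions of a ReLU MLP (the pieces of the input domain on which the network computes a single affine function), can also be illustrated directly. The sketch below counts how many distinct activation patterns a small, randomly initialized ReLU MLP realizes along a dense 1D grid; the architecture and Gaussian initialization are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random ReLU MLP parameters (illustrative Gaussian initialization)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def activation_patterns(params, x):
    """Forward pass recording which hidden units are active (pre-activation > 0)."""
    h, pattern = x, []
    for W, b in params[:-1]:          # all layers except the linear output layer
        pre = h @ W + b
        pattern.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(pattern, axis=-1)   # (n_points, total_hidden_units)

params = init_mlp([1, 64, 64, 1])              # 1D coordinate input
grid = np.linspace(0.0, 1.0, 10_000)[:, None]  # dense grid over [0, 1]
patterns = activation_patterns(params, grid)

# Each maximal run of grid points sharing one activation pattern lies in a
# single activation region, so pattern changes count region boundaries.
boundaries = np.any(patterns[1:] != patterns[:-1], axis=1)
print("activation regions crossed along [0, 1]:", boundaries.sum() + 1)
```

Running this with and without the positional encoding above (i.e., widening the input from 1 to 16 features) is one way to probe how the encoding changes the number of regions the network carves the coordinate line into.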
Related papers
- A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree
Spectral Bias of Neural Networks [79.28094304325116]
Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions.
We show how this spectral bias towards low-degree frequencies can in fact hurt the neural network's generalization on real-world datasets.
We propose a new scalable functional regularization scheme that aids the neural network to learn higher degree frequencies.
arXiv Detail & Related papers (2023-05-16T20:06:01Z) - Neural networks trained with SGD learn distributions of increasing
complexity [78.30235086565388]
We show that neural networks trained using stochastic gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z) - Momentum Diminishes the Effect of Spectral Bias in Physics-Informed
Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z) - Overcoming the Spectral Bias of Neural Value Approximation [17.546011419043644]
Value approximation using deep neural networks is often the primary module that provides learning signals to the rest of the algorithm.
Recent works in neural kernel regression suggest the presence of a spectral bias, where fitting high-frequency components of the value function requires exponentially more gradient update steps than the low-frequency ones.
We re-examine off-policy reinforcement learning through the lens of kernel regression and propose to overcome such bias via a composite neural kernel.
arXiv Detail & Related papers (2022-06-09T17:59:57Z) - The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a spectral bias towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the $\Pi$-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of the higher frequencies.
arXiv Detail & Related papers (2022-02-27T23:12:43Z) - Spectral Complexity-scaled Generalization Bound of Complex-valued Neural
Networks [78.64167379726163]
This paper is the first work that proves a generalization bound for complex-valued neural networks.
We conduct experiments by training complex-valued convolutional neural networks on different datasets.
arXiv Detail & Related papers (2021-12-07T03:25:25Z) - Understanding Layer-wise Contributions in Deep Neural Networks through
Spectral Analysis [6.0158981171030685]
We analyze the layer-wise spectral bias of Deep Neural Networks and relate it to the contributions of different layers in the reduction of error for a given target function.
We provide empirical results on high-dimensional datasets validating our theory for Deep Neural Networks.
arXiv Detail & Related papers (2021-11-06T22:49:46Z) - Spectral Bias in Practice: The Role of Function Frequency in
Generalization [10.7218588164913]
We propose methodologies for measuring spectral bias in modern image classification networks.
We find that networks that generalize well strike a balance between having enough complexity to fit the data while being simple enough to avoid overfitting.
Our work enables measuring and ultimately controlling the spectral behavior of neural networks used for image classification.
arXiv Detail & Related papers (2021-10-06T00:16:10Z) - Fourier Features Let Networks Learn High Frequency Functions in Low
Dimensional Domains [69.62456877209304]
We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron to learn high-frequency functions.
These results shed light on advances in computer vision and graphics that achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-06-18T17:59:11Z) - Frequency Bias in Neural Networks for Input of Non-Uniform Density [27.75835200173761]
We use the Neural Tangent Kernel (NTK) model to explore the effect of variable density on training dynamics.
Our results show convergence at a point $x \in \mathbb{S}^{d-1}$ occurs in time $O(\kappa^d / p(x))$, where $p(x)$ denotes the local density at $x$.
arXiv Detail & Related papers (2020-03-10T07:20:14Z)
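Several of the related papers above, and the main paper's discussion of prior work, rely on the Neural Tangent Kernel. As a rough, self-contained illustration of that object, the sketch below computes the empirical NTK $\Theta(x, x') = \nabla_\theta f(x) \cdot \nabla_\theta f(x')$ of a one-hidden-layer ReLU network on a 1D grid; the architecture, initialization, and grid are assumptions for illustration, not the setup of any specific paper listed here.

```python
import numpy as np

rng = np.random.default_rng(0)
width = 256

# One-hidden-layer ReLU network f(x) = sum_j v_j * relu(w_j * x + b_j), scalar input.
w = rng.normal(0.0, 1.0, width)
b = rng.normal(0.0, 1.0, width)
v = rng.normal(0.0, 1.0 / np.sqrt(width), width)

def param_gradient(x):
    """Gradient of f(x) with respect to all parameters (w, b, v)."""
    pre = w * x + b
    act = np.maximum(pre, 0.0)
    ind = (pre > 0).astype(float)
    return np.concatenate([v * ind * x,   # df/dw_j
                           v * ind,       # df/db_j
                           act])          # df/dv_j

# Empirical NTK on a 1D grid: Theta[i, j] = grad(x_i) . grad(x_j).
xs = np.linspace(-1.0, 1.0, 50)
J = np.stack([param_gradient(x) for x in xs])   # (50, 3 * width)
ntk = J @ J.T

eigvals = np.linalg.eigvalsh(ntk)               # ascending order
print("smallest / largest NTK eigenvalues:", eigvals[0], eigvals[-1])
```

In the kernel-regression picture, directions of the target aligned with large NTK eigenvalues are fit quickly while those aligned with small eigenvalues converge slowly, which is how spectral bias is usually phrased in these NTK-based analyses.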
This list is automatically generated from the titles and abstracts of the papers in this site.