Overview frequency principle/spectral bias in deep learning
- URL: http://arxiv.org/abs/2201.07395v1
- Date: Wed, 19 Jan 2022 03:08:33 GMT
- Title: Overview frequency principle/spectral bias in deep learning
- Authors: Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo
- Abstract summary: We show a Frequency Principle (F-Principle) in the training behavior of deep neural networks (DNNs): they often fit functions from low to high frequency during training.
The F-Principle is first demonstrated on one-dimensional synthetic data and then verified on high-dimensional real datasets.
This low-frequency implicit bias reveals the strength of neural networks in learning low-frequency functions as well as their deficiency in learning high-frequency functions.
- Score: 3.957124094805574
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Understanding deep learning has become increasingly important as it penetrates
further into industry and science. In recent years, a line of research based on
Fourier analysis has shed light on this magical "black box" by showing a
Frequency Principle (F-Principle or spectral bias) of the training behavior of
deep neural networks (DNNs) -- DNNs often fit functions from low to high
frequency during training. The F-Principle was first demonstrated on
one-dimensional synthetic data and then verified on high-dimensional
real datasets. A series of subsequent works further established the validity of the
F-Principle. This low-frequency implicit bias reveals the strength of neural
networks in learning low-frequency functions as well as their deficiency in
learning high-frequency functions. Such understanding inspires the design of
DNN-based algorithms in practical problems, explains experimental phenomena
emerging in various scenarios, and further advances the study of deep learning
from the frequency perspective. We provide an overview of the F-Principle,
necessarily incomplete, and propose some open problems for future research.
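To make the low-to-high-frequency claim concrete, below is a minimal sketch (an assumed PyTorch setup, not the authors' original experiment) of the one-dimensional demonstration: a small network is trained on a target with one low- and one high-frequency component, and the corresponding Fourier peaks of its output are tracked during training.

```python
# Minimal sketch (assumed setup, not the authors' original code): watch which
# Fourier peak of the network output converges first while fitting a 1D target.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
# 2 cycles (low frequency) and 10 cycles (high frequency) over the length-2 window.
y = torch.sin(2 * np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

model = nn.Sequential(
    nn.Linear(1, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()


def spectrum(v: torch.Tensor) -> np.ndarray:
    """Magnitude of the discrete Fourier transform of a 1D signal."""
    return np.abs(np.fft.rfft(v.detach().numpy().ravel()))


target_spec = spectrum(y)
low_k, high_k = 2, 10  # DFT bins of the two target peaks

for step in range(5001):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        pred_spec = spectrum(model(x))
        err = lambda k: abs(pred_spec[k] - target_spec[k]) / target_spec[k]
        print(f"step {step:5d}  low-freq err {err(low_k):.3f}  high-freq err {err(high_k):.3f}")
# The low-frequency error typically drops well before the high-frequency one,
# which is the qualitative signature of the F-Principle.
```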
Related papers
- A rationale from frequency perspective for grokking in training neural network [7.264378254137811]
Grokking is the phenomenon where neural networks (NNs) initially fit the training data and only later generalize to the test data during training.
In this paper, we empirically provide a frequency perspective to explain the emergence of this phenomenon in NNs.
arXiv Detail & Related papers (2024-05-24T06:57:23Z)
- A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks [79.28094304325116]
Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions.
We show how this spectral bias towards low-degree frequencies can in fact hurt the neural network's generalization on real-world datasets.
We propose a new scalable functional regularization scheme that aids the neural network to learn higher degree frequencies.
arXiv Detail & Related papers (2023-05-16T20:06:01Z)
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structure.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS (broad learning system) network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and only exploit higher-order statistics later in training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z)
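The NTK analysis mentioned in the entry above rests on the empirical tangent kernel of the network. Below is a generic sketch of that standard construction (an illustration in PyTorch, not the paper's implementation, which further accounts for momentum): the Gram matrix of output gradients with respect to parameters. Under plain gradient flow, the residual along each eigenvector decays at a rate set by the corresponding eigenvalue, and the slowly decaying directions are typically the high-frequency ones.

```python
# Generic sketch of the empirical NTK Gram matrix J J^T for a tiny network.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
params = list(net.parameters())
x = torch.linspace(-1.0, 1.0, 50).unsqueeze(1)

# One Jacobian row per input point: d f(x_i) / d theta, flattened over all parameters.
rows = []
for i in range(x.shape[0]):
    out = net(x[i : i + 1]).squeeze()
    grads = torch.autograd.grad(out, params)
    rows.append(torch.cat([g.reshape(-1) for g in grads]))
jac = torch.stack(rows)                  # shape (n_points, n_params)

ntk = jac @ jac.T                        # empirical NTK Gram matrix
eigvals = torch.linalg.eigvalsh(ntk)     # ascending order
print(eigvals.flip(0)[:5])               # dominant modes, learned fastest under gradient flow
```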
- Deep Frequency Filtering for Domain Generalization [55.66498461438285]
Deep Neural Networks (DNNs) have preferences for some frequency components in the learning process.
We propose Deep Frequency Filtering (DFF) for learning domain-generalizable features.
We show that applying our proposed DFF on a plain baseline outperforms the state-of-the-art methods on different domain generalization tasks.
arXiv Detail & Related papers (2022-03-23T05:19:06Z)
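The core operation behind the Deep Frequency Filtering entry above can be sketched as follows (a simplified illustration with an assumed shared learnable mask; DFF itself learns an instance-adaptive attention over the spectrum): transform a feature map with a 2D FFT, reweight each frequency bin, and transform back.

```python
# Simplified frequency-domain filtering of intermediate features (not the paper's exact design).
import torch
import torch.nn as nn


class FrequencyFilter(nn.Module):
    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # One learnable logit per channel and rFFT frequency bin.
        self.mask_logits = nn.Parameter(torch.zeros(channels, height, width // 2 + 1))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) real-valued feature map.
        spec = torch.fft.rfft2(feat, norm="ortho")      # complex spectrum, (N, C, H, W//2+1)
        mask = torch.sigmoid(self.mask_logits)          # per-frequency weights in (0, 1)
        filtered = spec * mask                          # suppress or keep each frequency component
        return torch.fft.irfft2(filtered, s=feat.shape[-2:], norm="ortho")


# Usage: drop the filter between two convolutional stages of a backbone.
feat = torch.randn(8, 64, 32, 32)
ff = FrequencyFilter(channels=64, height=32, width=32)
out = ff(feat)
print(out.shape)  # torch.Size([8, 64, 32, 32])
```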
- The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a spectral bias towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the $\Pi$-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of higher frequencies.
arXiv Detail & Related papers (2022-02-27T23:12:43Z)
- Frequency Principle in Deep Learning Beyond Gradient-descent-based Training [0.0]
The frequency perspective has recently made progress in understanding deep learning.
It has been widely verified that deep neural networks (DNNs) often fit the target function from low to high frequency, a behavior known as the Frequency Principle (F-Principle).
Previous works examine the F-Principle in gradient-descent-based training.
It remains unclear whether gradient-descent-based training is a necessary condition for the F-Principle.
arXiv Detail & Related papers (2021-01-04T03:11:03Z)
- On the exact computation of linear frequency principle dynamics and its generalization [6.380166265263755]
Recent works show an intriguing Frequency Principle (F-Principle) phenomenon: neural networks fit the target function from low to high frequency during training.
In this paper, we derive the exact differential equation, namely the Linear Frequency-Principle (LFP) model, governing the evolution of the NN output function in the frequency domain.
arXiv Detail & Related papers (2020-10-15T15:17:21Z)
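A rough, diagonal caricature of such linear frequency-domain dynamics (an assumption for illustration, not the exact LFP equation derived in the paper above, which couples frequencies through a kernel) is

$$\partial_t\big(\hat h(\xi,t)-\hat f(\xi)\big)\;\approx\;-K(\xi)\,\big(\hat h(\xi,t)-\hat f(\xi)\big),\qquad K(\xi)\ \text{decreasing in}\ |\xi|,$$

where $\hat h(\xi,t)$ is the Fourier transform of the network output at training time $t$ and $\hat f$ that of the target; the residual at frequency $\xi$ then decays roughly like $e^{-K(\xi)t}$, so low-frequency error vanishes first.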
- Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks [9.23835409289015]
We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.
We demonstrate a very universal Frequency Principle (F-Principle) -- DNNs often fit target functions from low to high frequencies.
arXiv Detail & Related papers (2019-01-19T13:37:39Z)