Comparing Spectral Bias and Robustness For Two-Layer Neural Networks:
SGD vs Adaptive Random Fourier Features
- URL: http://arxiv.org/abs/2402.00332v1
- Date: Thu, 1 Feb 2024 04:35:37 GMT
- Title: Comparing Spectral Bias and Robustness For Two-Layer Neural Networks:
SGD vs Adaptive Random Fourier Features
- Authors: Aku Kammonen and Lisi Liang and Anamika Pandey and Raúl Tempone
- Abstract summary: We present experimental results highlighting two key differences resulting from the choice of training algorithm for two-layer neural networks.
An adaptive random Fourier features algorithm (ARFF) can yield a spectral bias closer to zero than the stochastic gradient descent optimizer (SGD)
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present experimental results highlighting two key differences resulting
from the choice of training algorithm for two-layer neural networks. The
spectral bias of neural networks is well known, while the spectral bias
dependence on the choice of training algorithm is less studied. Our experiments
demonstrate that an adaptive random Fourier features algorithm (ARFF) can yield
a spectral bias closer to zero compared to the stochastic gradient descent
optimizer (SGD). Additionally, we train two identically structured classifiers,
employing SGD and ARFF, to the same accuracy levels and empirically assess
their robustness against adversarial noise attacks.
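As a rough, hedged illustration of the comparison described in the abstract (not the authors' code), the sketch below fits a toy one-dimensional target with a two-layer Fourier-features model u(x) = sum_k beta_k exp(i omega_k x), once with plain mini-batch SGD on both the frequencies and the amplitudes, and once with an ARFF-style loop that proposes new frequencies, accepts them with a probability driven by the amplitude ratio, and refits the amplitudes by least squares. The target function, the number of features K, the proposal scale, the learning rate, and the acceptance exponent are illustrative assumptions, and the resampling rule is a simplified stand-in for the published ARFF update.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 512)
y = np.sin(x) + 0.3 * np.sin(8.0 * x)   # toy target with a high-frequency component
K = 64                                   # number of Fourier features ("neurons")


def features(omega, xs):
    """Complex Fourier-feature matrix exp(i * omega_k * x_j), shape (len(xs), K)."""
    return np.exp(1j * np.outer(xs, omega))


def fit_beta(omega, xs, ys, lam=1e-6):
    """Least-squares amplitudes for fixed frequencies (Tikhonov-regularised)."""
    S = features(omega, xs)
    A = S.conj().T @ S + lam * len(xs) * np.eye(len(omega))
    return np.linalg.solve(A, S.conj().T @ ys)


def arff_style_train(xs, ys, iters=200, proposal_scale=1.0):
    """ARFF-style loop: propose frequencies, accept by amplitude ratio, refit amplitudes."""
    omega = rng.normal(size=K)
    beta = fit_beta(omega, xs, ys)
    for _ in range(iters):
        omega_prop = omega + proposal_scale * rng.normal(size=K)
        beta_prop = fit_beta(omega_prop, xs, ys)
        ratio = (np.abs(beta_prop) / (np.abs(beta) + 1e-12)) ** 2
        accept = rng.random(K) < np.minimum(1.0, ratio)
        omega = np.where(accept, omega_prop, omega)
        beta = fit_beta(omega, xs, ys)
    return omega, beta


def sgd_train(xs, ys, iters=5000, lr=1e-3, batch=64):
    """Plain mini-batch SGD on both the frequencies and the complex amplitudes."""
    omega = rng.normal(size=K)
    beta = np.zeros(K, dtype=complex)
    for _ in range(iters):
        idx = rng.integers(0, len(xs), size=batch)
        S = features(omega, xs[idx])
        r = S @ beta - ys[idx]                    # mini-batch residual
        grad_beta = S.conj().T @ r / batch        # gradient w.r.t. the complex amplitudes
        grad_omega = np.real(
            ((1j * xs[idx] * np.conj(r))[:, None] * S * beta[None, :]).sum(axis=0)
        ) / batch                                 # gradient w.r.t. the frequencies
        beta -= lr * grad_beta
        omega -= lr * grad_omega
    return omega, beta


for name, train in [("ARFF-style", arff_style_train), ("SGD", sgd_train)]:
    omega, beta = train(x, y)
    err = np.abs(features(omega, x) @ beta - y)
    print(f"{name:10s}  max error {err.max():.3e}  mean error {err.mean():.3e}")
```

One simple way to probe spectral bias in this setup is to compare the Fourier transform of the residual y - u(x) for the two trained models and observe which frequency bands each training algorithm resolves first.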
Related papers
- Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source
Separation [26.6020148790775]
This paper describes an efficient unsupervised learning method for a neural source separation model.
We propose neural FastFCA based on the jointly-diagonalizable yet full-rank spatial model.
Experiments using mixture signals of two to four sound sources show that neural FastFCA outperforms conventional BSS methods.
arXiv Detail & Related papers (2023-06-17T02:50:17Z) - A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree
Spectral Bias of Neural Networks [79.28094304325116]
Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions.
We show how this spectral bias towards low-degree frequencies can in fact hurt the neural network's generalization on real-world datasets.
We propose a new scalable functional regularization scheme that aids the neural network to learn higher degree frequencies.
arXiv Detail & Related papers (2023-05-16T20:06:01Z) - Frequency and Scale Perspectives of Feature Extraction [5.081561820537235]
We analyze the sensitivity of neural networks to frequencies and scales.
We find that neural networks have low- and medium-frequency biases but also prefer different frequency bands for different classes.
These observations lead to the hypothesis that neural networks must learn the ability to extract features at various scales and frequencies.
arXiv Detail & Related papers (2023-02-24T06:37:36Z) - Momentum Diminishes the Effect of Spectral Bias in Physics-Informed
Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z) - SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with
Adaptive Noise Spectral Shaping [51.698273019061645]
SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram.
The shaping is performed in the time-frequency domain, keeping the computational cost almost the same as that of conventional DDPM-based neural vocoders.
arXiv Detail & Related papers (2022-03-31T02:08:27Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - An Insect-Inspired Randomly, Weighted Neural Network with Random Fourier
Features For Neuro-Symbolic Relational Learning [2.28438857884398]
We propose a Randomly Weighted Feature Network (RWFN) whose encoder uses randomly drawn, untrained weights and whose decoder is an adapted linear model.
Because of this special representation, RWFNs can effectively learn the degree of relationship among inputs by training only a linear decoder model.
We demonstrate that, compared to logic tensor networks (LTNs), RWFNs achieve better or similar performance for both object classification and detection of part-of relations between objects in semantic image interpretation (SII) tasks.
arXiv Detail & Related papers (2021-09-11T22:45:08Z) - Fast Approximate Spectral Normalization for Robust Deep Neural Networks [3.5027291542274357]
We introduce an approximate algorithm for spectral normalization based on Fourier transform and layer separation.
Our framework is able to significantly improve both time efficiency (up to 60%) and model robustness (61% on average) compared with the state-of-the-art spectral normalization.
arXiv Detail & Related papers (2021-03-22T15:35:45Z) - Second-Order Component Analysis for Fault Detection [0.0]
High-order neural networks risk overfitting by learning not only the key information in the original data but also noise and anomalies.
This paper proposes a novel fault detection method called second-order component analysis (SCA).
arXiv Detail & Related papers (2021-03-12T14:25:37Z) - Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs (a minimal sketch of the underlying sine-series identity follows this list).
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
arXiv Detail & Related papers (2021-03-01T08:25:26Z) - LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to regularizing neural networks based on the local Rademacher complexity.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
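As referenced in the binary-neural-network entry above, the frequency-domain estimate of the sign gradient rests on a standard identity: on (-pi, pi), sign(x) has the Fourier sine series (4/pi) * sum over odd n of sin(n x)/n, and the derivative of a truncated version of this series is a smooth surrogate for the zero-almost-everywhere derivative of sign when backpropagating through a binarisation step. The sketch below only illustrates that identity; it is not that paper's implementation, and the truncation length n_terms is an arbitrary choice.

```python
import numpy as np


def sign_series(x, n_terms=20):
    """Truncated Fourier sine series of sign(x) on (-pi, pi):
    sign(x) ~ (4/pi) * sum_{m=0}^{n_terms-1} sin((2m+1) x) / (2m+1)."""
    k = 2 * np.arange(n_terms) + 1
    return (4.0 / np.pi) * np.sum(np.sin(np.outer(x, k)) / k, axis=1)


def sign_series_grad(x, n_terms=20):
    """Derivative of the truncated series: a smooth surrogate for d/dx sign(x)."""
    k = 2 * np.arange(n_terms) + 1
    return (4.0 / np.pi) * np.sum(np.cos(np.outer(x, k)), axis=1)


xs = np.linspace(-np.pi, np.pi, 7)
print(np.round(sign_series(xs), 3))        # close to sign(x) away from 0 and the endpoints
print(np.round(sign_series_grad(xs), 3))   # large near x = 0; also spikes at the endpoints, where the periodic square wave jumps
```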
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.