Periodic Activation Functions Induce Stationarity
- URL: http://arxiv.org/abs/2110.13572v1
- Date: Tue, 26 Oct 2021 11:10:37 GMT
- Title: Periodic Activation Functions Induce Stationarity
- Authors: Lassi Meronen, Martin Trapp, Arno Solin
- Abstract summary: We show that periodic activation functions in Bayesian neural networks establish a connection between the prior on the network weights and translation-invariant, stationary Gaussian process priors.
In a series of experiments, we show that periodic activation functions obtain comparable performance for in-domain data and capture sensitivity to perturbed inputs in deep neural networks for out-of-domain detection.
- Score: 19.689175123261613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network models are known to reinforce hidden data biases, making them
unreliable and difficult to interpret. We seek to build models that 'know what
they do not know' by introducing inductive biases in the function space. We
show that periodic activation functions in Bayesian neural networks establish a
connection between the prior on the network weights and translation-invariant,
stationary Gaussian process priors. Furthermore, we show that this link goes
beyond sinusoidal (Fourier) activations by also covering triangular wave and
periodic ReLU activation functions. In a series of experiments, we show that
periodic activation functions obtain comparable performance for in-domain data
and capture sensitivity to perturbed inputs in deep neural networks for
out-of-domain detection.
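To make the weight-space-to-kernel link concrete, below is a minimal NumPy sketch of the random-feature view: a single hidden layer with a cosine activation, Gaussian input weights, and uniform phases induces a prior covariance that depends only on x - x' (here it approaches the stationary RBF kernel). This is an illustrative construction in the spirit of the abstract, not the paper's exact model.

```python
import numpy as np

# Sketch: one hidden layer with a periodic (cosine) activation.
# With w ~ N(0, sigma^2) and b ~ U(0, 2*pi), the induced prior covariance
#   k(x, x') = E[phi(x) . phi(x')]
# depends only on x - x', i.e. it is a stationary (RBF) kernel.
rng = np.random.default_rng(0)
m = 200_000          # hidden width; large so the average approximates E[.]
sigma = 1.0          # prior standard deviation of the input weights

w = rng.normal(0.0, sigma, size=m)       # input weights
b = rng.uniform(0.0, 2 * np.pi, size=m)  # phases

def phi(x):
    """Hidden-layer feature map for a scalar input x."""
    return np.sqrt(2.0 / m) * np.cos(w * x + b)

def k(x, xp):
    """Monte Carlo estimate of the induced prior covariance."""
    return phi(x) @ phi(xp)

# Translation invariance: shifting both inputs leaves k unchanged,
# and it matches exp(-sigma^2 * (x - x')^2 / 2) in closed form.
print(k(0.0, 1.0), k(5.0, 6.0))   # approximately equal
print(np.exp(-sigma**2 / 2))      # ~0.6065
```

Choosing a different weight prior (or, per the paper, a triangular-wave or periodic-ReLU activation) changes which stationary kernel is induced.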
Related papers
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT)
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, providing an accuracy gap of more than 40% in some scenarios.
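As a rough illustration of the idea (the interval, coefficients, and normalization here are hypothetical, not ENN's exact parametrization), the non-linearity can be modeled as a truncated cosine series whose few coefficients are the trainable parameters:

```python
import numpy as np

# Hypothetical DCT-style activation: a truncated cosine series with a
# small number of trainable coefficients c_k.
K = 8
c = np.random.default_rng(1).normal(0.0, 0.3, size=K)  # trained in practice

def dct_activation(x, c, lo=-1.0, hi=1.0):
    """g(x) = sum_k c_k * cos(pi * k * t), with t the input mapped to [0, 1]."""
    t = (np.clip(x, lo, hi) - lo) / (hi - lo)
    k = np.arange(len(c))
    return np.cos(np.pi * np.outer(t, k)) @ c

x = np.linspace(-2.0, 2.0, 5)
print(dct_activation(x, c))   # a smooth, learnable non-linearity
```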
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- On the Activation Function Dependence of the Spectral Bias of Neural Networks [0.0]
We study the phenomenon from the point of view of the spectral bias of neural networks.
We provide a theoretical explanation for the spectral bias of ReLU neural networks by leveraging connections with the theory of finite element methods.
We show that neural networks with the Hat activation function are trained significantly faster using gradient descent and ADAM.
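For reference, the hat function is the standard piecewise-linear tent basis function from finite element methods, expressible as a fixed combination of ReLUs; a minimal sketch follows (the paper's exact scaling and support may differ):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    """Tent function: zero outside [0, 2], rising linearly to 1 at x = 1."""
    return relu(x) - 2.0 * relu(x - 1.0) + relu(x - 2.0)

x = np.array([-1.0, 0.5, 1.0, 1.5, 3.0])
print(hat(x))   # [0.  0.5 1.  0.5 0. ]
```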
arXiv Detail & Related papers (2022-08-09T17:40:57Z)
- Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
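For intuition, the object under study is the NTK Gram matrix Theta(x, x') = J(x) J(x')^T, where J is the Jacobian of the network outputs with respect to the parameters. Below is a small finite-difference sketch for a toy MLP (illustrative only, not the paper's PINN setup):

```python
import numpy as np

rng = np.random.default_rng(4)
params = [rng.normal(size=(1, 16)), np.zeros(16),       # W1, b1
          rng.normal(size=(16, 1)) / 4.0, np.zeros(1)]  # W2, b2

def forward(x, ps):
    W1, b1, W2, b2 = ps
    return np.tanh(x @ W1 + b1) @ W2 + b2

def jacobian(x, ps, eps=1e-6):
    """Finite-difference Jacobian of outputs w.r.t. all flattened parameters."""
    flat = np.concatenate([p.ravel() for p in ps])
    base = forward(x, ps)
    rows = []
    for i in range(flat.size):
        bumped = flat.copy()
        bumped[i] += eps
        new, off = [], 0
        for p in ps:                      # unflatten into parameter shapes
            new.append(bumped[off:off + p.size].reshape(p.shape))
            off += p.size
        rows.append((forward(x, new) - base) / eps)
    return np.array(rows)[:, :, 0].T      # (n_points, n_params)

X = np.array([[0.0], [0.5], [1.0]])
J = jacobian(X, params)
print(J @ J.T)   # 3x3 empirical NTK Gram matrix
```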
arXiv Detail & Related papers (2022-06-29T19:03:10Z)
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristics of samples, the objective of these activation functions is to project samples from their original feature space to a linearly separable feature space.
This motivates us to explore whether all features need to be transformed by all of the non-linear functions in current typical NNs.
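A hypothetical sketch of such a split (the routing and ratio are illustrative, not the paper's method): apply the non-linearity to only part of a layer's outputs and leave the rest on an identity path.

```python
import numpy as np

def partial_relu_layer(x, W, b, linear_frac=0.5):
    """Affine map, then ReLU on only the later outputs; earlier ones stay linear."""
    z = x @ W + b
    n_lin = int(z.shape[-1] * linear_frac)   # units kept linear (hypothetical split)
    z[..., n_lin:] = np.maximum(z[..., n_lin:], 0.0)
    return z

rng = np.random.default_rng(2)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 6)) / np.sqrt(8)
b = np.zeros(6)
print(partial_relu_layer(x, W, b).shape)   # (4, 6)
```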
arXiv Detail & Related papers (2022-03-22T13:09:17Z)
- The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a $\textit{spectral bias}$ towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the $\Pi$-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of the higher frequencies.
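As a rough sketch of the product structure behind $\Pi$-Nets (heavily simplified from the published architecture): each step takes a Hadamard product between the running representation and a fresh linear transform of the input, raising the polynomial degree of the output by one.

```python
import numpy as np

def pi_net_block(x, A_list):
    """x: (d,) input; A_list: one (d, d) matrix per polynomial degree."""
    z = A_list[0] @ x                 # degree-1 term
    for A in A_list[1:]:
        z = (A @ x) * z + z           # Hadamard product raises the degree by 1
    return z

rng = np.random.default_rng(3)
d = 5
A_list = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
print(pi_net_block(rng.normal(size=d), A_list))   # a degree-3 polynomial in x
```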
arXiv Detail & Related papers (2022-02-27T23:12:43Z)
- Data-Driven Learning of Feedforward Neural Networks with Different Activation Functions [0.0]
This work contributes to the development of a new data-driven method (D-DM) for learning feedforward neural networks (FNNs).
arXiv Detail & Related papers (2021-07-04T18:20:27Z)
- Activation function design for deep networks: linearity and effective initialisation [10.108857371774977]
We study how to avoid two problems, identified in prior works, that arise at initialisation.
We prove that both these problems can be avoided by choosing an activation function possessing a sufficiently large linear region around the origin.
arXiv Detail & Related papers (2021-05-17T11:30:46Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Neural Networks Fail to Learn Periodic Functions and How to Fix It [6.230751621285322]
We prove, and demonstrate experimentally, that standard activation functions, such as ReLU, tanh, and sigmoid, fail to learn to extrapolate simple periodic functions.
We propose a new activation, $x + \sin^2(x)$, which achieves the periodic inductive bias needed to learn periodic functions.
Experimentally, we apply the proposed method to temperature and financial data prediction.
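A minimal implementation of the proposed activation, written with an optional frequency parameter a following the paper's "Snake" family (a = 1 recovers the form quoted above); its derivative 1 + sin(2ax) is non-negative and periodic, which supplies the periodic extrapolation bias.

```python
import numpy as np

def snake(x, a=1.0):
    """x + sin^2(a*x)/a; a = 1 gives the x + sin^2(x) form from the abstract."""
    return x + np.sin(a * x) ** 2 / a

x = np.linspace(-np.pi, np.pi, 5)
print(snake(x))   # non-decreasing, with periodic derivative 1 + sin(2x)
```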
arXiv Detail & Related papers (2020-06-15T07:49:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.