Periodic Extrapolative Generalisation in Neural Networks
- URL: http://arxiv.org/abs/2209.10280v1
- Date: Wed, 21 Sep 2022 11:47:30 GMT
- Title: Periodic Extrapolative Generalisation in Neural Networks
- Authors: Peter Belcák, Roger Wattenhofer
- Abstract summary: We formalise the problem of extrapolative generalisation for periodic signals.
We investigate the generalisation abilities of classical, population-based, and recently proposed periodic architectures.
We find that periodic and "snake" activation functions consistently fail at periodic extrapolation.
- Score: 10.482805367361818
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The learning of the simplest possible computational pattern -- periodicity --
is an open problem in the research of strong generalisation in neural networks.
We formalise the problem of extrapolative generalisation for periodic signals
and systematically investigate the generalisation abilities of classical,
population-based, and recently proposed periodic architectures on a set of
benchmarking tasks. We find that periodic and "snake" activation functions
consistently fail at periodic extrapolation, regardless of the trainability of
their periodicity parameters. Further, our results show that traditional
sequential models still outperform the novel architectures designed
specifically for extrapolation, and that these are in turn trumped by
population-based training. We make our benchmarking and evaluation toolkit,
PerKit, available and easily accessible to facilitate future work in the area.
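To make the task concrete, here is a minimal sketch of a periodic extrapolation experiment in the spirit of the benchmark described above: a small feed-forward network is fitted to a sine wave on a bounded interval and then evaluated on a disjoint interval far outside the training range. The target signal, interval choices, architecture, and metric are illustrative assumptions, not the paper's PerKit tasks.

```python
# Minimal illustration of periodic extrapolative generalisation (assumptions,
# not the PerKit benchmark): fit a sine on one interval, evaluate far outside it.
import torch
import torch.nn as nn

torch.manual_seed(0)

def target(x):
    return torch.sin(x)  # simplest periodic signal

# Train on [-2*pi, 2*pi]; extrapolation is judged on a disjoint, shifted interval.
x_train = torch.linspace(-2 * torch.pi, 2 * torch.pi, 512).unsqueeze(1)
x_extra = torch.linspace(4 * torch.pi, 8 * torch.pi, 512).unsqueeze(1)

model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x_train), target(x_train))
    loss.backward()
    opt.step()

with torch.no_grad():
    in_mse = nn.functional.mse_loss(model(x_train), target(x_train)).item()
    out_mse = nn.functional.mse_loss(model(x_extra), target(x_extra)).item()

# A large gap between the two errors is the signature of failed periodic extrapolation.
print(f"interpolation MSE: {in_mse:.4f}  extrapolation MSE: {out_mse:.4f}")
```

A low interpolation error paired with a high extrapolation error on the shifted interval is exactly the failure mode that the benchmarking tasks are designed to expose.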
Related papers
- Exact, Tractable Gauss-Newton Optimization in Deep Reversible Architectures Reveal Poor Generalization [52.16435732772263]
Second-order optimization has been shown to accelerate the training of deep neural networks in many applications.
However, generalization properties of second-order methods are still being debated.
We show for the first time that exact Gauss-Newton (GN) updates take on a tractable form in a class of deep architectures.
arXiv Detail & Related papers (2024-11-12T17:58:40Z) - Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms [15.473123662393169]
Deep neural networks (DNNs) show remarkable generalization properties.
The source of these capabilities remains elusive, defying the established statistical learning theory.
Recent studies have revealed that properties of training trajectories can be indicative of generalization.
arXiv Detail & Related papers (2024-07-11T17:56:03Z) - A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection [98.41798478488101]
Time series analytics is crucial to unlocking the wealth of information implicit in available data.
Recent advancements in graph neural networks (GNNs) have led to a surge in GNN-based approaches for time series analysis.
This survey brings together a vast array of knowledge on GNN-based time series research, highlighting foundations, practical applications, and opportunities of graph neural networks for time series analysis.
arXiv Detail & Related papers (2023-07-07T08:05:03Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
arXiv Detail & Related papers (2023-03-09T18:57:13Z) - Taming Local Effects in Graph-based Spatiotemporal Forecasting [28.30604130617646]
Spatiotemporal graph neural networks have been shown to be effective in time series forecasting applications.
This paper aims to understand the interplay between globality and locality in graph-based spatiotemporal forecasting.
We propose a methodological framework to rationalize the practice of including trainable node embeddings in such architectures.
arXiv Detail & Related papers (2023-02-08T14:18:56Z) - The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a spectral bias towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the $\Pi$-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of the higher frequencies.
arXiv Detail & Related papers (2022-02-27T23:12:43Z) - Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z) - Neural Networks Fail to Learn Periodic Functions and How to Fix It [6.230751621285322]
We prove and demonstrate experimentally that standard activation functions, such as ReLU, tanh, and sigmoid, fail to learn to extrapolate simple periodic functions.
We propose a new activation, $x + \sin^2(x)$, which achieves the desired periodic inductive bias for learning periodic functions (see the sketch at the end of this list).
Experimentally, we apply the proposed method to temperature and financial data prediction.
arXiv Detail & Related papers (2020-06-15T07:49:33Z) - AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks [12.14537824884951]
We propose a novel regularization method that progressively penalizes the magnitude of activations during training.
Our method's effect on generalization is analyzed with label randomization tests and cumulative ablations.
arXiv Detail & Related papers (2020-03-07T18:38:46Z)
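For reference, the "snake" activation mentioned in the main abstract and in the entry on "Neural Networks Fail to Learn Periodic Functions and How to Fix It" is commonly written as x + (1/a)·sin²(ax), where a is a periodicity parameter that may be fixed or trained; the form $x + \sin^2(x)$ is the a = 1 case. Below is a minimal sketch assuming PyTorch; the module name and the trainable-frequency flag are illustrative and not taken from PerKit or the cited papers.

```python
# Sketch of a "snake" activation, x + (1/a) * sin(a*x)^2 (a = 1 recovers x + sin^2(x)).
# The module name and trainable-frequency flag are illustrative assumptions.
import torch
import torch.nn as nn

class Snake(nn.Module):
    def __init__(self, a: float = 1.0, trainable: bool = False):
        super().__init__()
        a = torch.tensor(float(a))
        # The periodicity parameter can be kept fixed or learned alongside the weights.
        self.a = nn.Parameter(a) if trainable else a

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.sin(self.a * x) ** 2 / self.a

# Usage: drop it in as a point-wise nonlinearity.
net = nn.Sequential(nn.Linear(1, 64), Snake(trainable=True), nn.Linear(64, 1))
print(net(torch.randn(4, 1)).shape)  # torch.Size([4, 1])
```

The main abstract reports that such periodic activations fail at periodic extrapolation regardless of whether the periodicity parameter a is trainable, so this sketch is offered only to make the construction concrete.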