Tuning the activation function to optimize the forecast horizon of a
reservoir computer
- URL: http://arxiv.org/abs/2312.13151v1
- Date: Wed, 20 Dec 2023 16:16:01 GMT
- Title: Tuning the activation function to optimize the forecast horizon of a
reservoir computer
- Authors: Lauren A. Hurley, Juan G. Restrepo, Sean E. Shaheen
- Abstract summary: We study the effect of the node activation function on the ability of reservoir computers to learn and predict chaotic time series.
We find that the Forecast Horizon (FH), the time during which the reservoir's predictions remain accurate, can vary by an order of magnitude across a set of 16 activation functions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reservoir computing is a machine learning framework where the readouts from a
nonlinear system (the reservoir) are trained so that the output from the
reservoir, when forced with an input signal, reproduces a desired output
signal. A common implementation of reservoir computers is to use a recurrent
neural network as the reservoir. The design of this network can have
significant effects on the performance of the reservoir computer. In this paper
we study the effect of the node activation function on the ability of reservoir
computers to learn and predict chaotic time series. We find that the Forecast
Horizon (FH), the time during which the reservoir's predictions remain
accurate, can vary by an order of magnitude across a set of 16 activation
functions used in machine learning. By using different functions from this set,
and by modifying their parameters, we explore whether the entropy of node
activation levels or the curvature of the activation functions determine the
predictive ability of the reservoirs. We find that the FH is low when the
activation function is used in a region where it has low curvature, and that
curvature and FH are positively correlated. For the activation functions
studied, we find that the largest FH generally occurs at intermediate levels of
the entropy of node activation levels. Our results show that the performance of
reservoir computers is very sensitive to the activation function shape.
Therefore, modifying this shape in hyperparameter optimization algorithms can
lead to improvements in reservoir computer performance.
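To make the setup above concrete, here is a minimal sketch (Python/NumPy) of this kind of experiment: an echo state network driven by a chaotic signal, a ridge-regression readout, closed-loop forecasting, a simple forecast-horizon measurement, and a histogram estimate of the entropy of node activation levels. The Lorenz system, reservoir size, spectral radius, error threshold, and the particular bounded activation functions listed are illustrative assumptions, not the configuration used in the paper.

```python
# Sketch of the experiment described above (assumed parameters throughout):
# drive a random recurrent reservoir with a chaotic signal, train a linear
# readout by ridge regression, then run the reservoir in closed loop and
# measure how long its forecast stays close to the true trajectory.
import numpy as np

rng = np.random.default_rng(0)

def lorenz_x(n, dt=0.02):
    """x-component of the Lorenz system, integrated with RK4 (illustrative input)."""
    def f(v):
        x, y, z = v
        return np.array([10.0 * (y - x), x * (28.0 - z) - y, x * y - 8.0 / 3.0 * z])
    s, out = np.array([1.0, 1.0, 1.0]), np.empty(n)
    for i in range(n):
        k1 = f(s); k2 = f(s + dt / 2 * k1); k3 = f(s + dt / 2 * k2); k4 = f(s + dt * k3)
        s = s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        out[i] = s[0]
    return out / out.std()

class ESN:
    """Echo state network with a swappable node activation function."""
    def __init__(self, n=300, rho=0.9, in_scale=0.5, act=np.tanh):
        A = rng.normal(size=(n, n)) / np.sqrt(n)
        self.A = A * (rho / np.abs(np.linalg.eigvals(A)).max())  # set spectral radius
        self.w_in = rng.uniform(-in_scale, in_scale, size=n)
        self.act = act

    def _step(self, r, u):
        return self.act(self.A @ r + self.w_in * u)   # node activation applied here

    def fit(self, u, washout=200, ridge=1e-6):
        r, R = np.zeros(len(self.w_in)), []
        for t in range(len(u) - 1):                    # teacher forcing with the true signal
            r = self._step(r, u[t])
            R.append(r.copy())
        R = np.array(R[washout:])
        y = u[washout + 1:]
        # ridge-regression readout: w_out = (R^T R + lambda I)^{-1} R^T y
        self.w_out = np.linalg.solve(R.T @ R + ridge * np.eye(R.shape[1]), R.T @ y)
        self.r, self.R_train = r, R                    # final state and stored states
        return self

    def forecast(self, last_input, n_steps):
        """Closed loop: feed the reservoir's own output back as its next input."""
        r, u, preds = self.r.copy(), last_input, np.empty(n_steps)
        for k in range(n_steps):
            r = self._step(r, u)
            u = r @ self.w_out
            preds[k] = u
        return preds

def activation_entropy(states, bins=50):
    """Shannon entropy of a histogram of node activation levels (a rough proxy)."""
    p, _ = np.histogram(states.ravel(), bins=bins)
    p = p / p.sum()
    return float(-np.sum(p[p > 0] * np.log(p[p > 0])))

if __name__ == "__main__":
    dt = 0.02
    data = lorenz_x(6000, dt)
    u_train, u_test = data[:4000], data[4000:]
    threshold = 0.3                                    # error (in std units) ending the FH
    activations = {                                    # a few bounded examples to compare
        "tanh": np.tanh,
        "softsign": lambda x: x / (1.0 + np.abs(x)),
        "arctan": np.arctan,
    }
    for name, act in activations.items():
        esn = ESN(act=act).fit(u_train)
        preds = esn.forecast(u_train[-1], len(u_test))
        bad = np.nonzero(np.abs(preds - u_test) > threshold)[0]
        fh = (bad[0] if bad.size else len(u_test)) * dt
        print(f"{name:9s} forecast horizon ~ {fh:6.2f}  "
              f"activation entropy ~ {activation_entropy(esn.R_train):4.2f}")
```

Swapping or re-parameterizing the entries of `activations` is the kind of activation-function tuning the abstract refers to; in this sketch the forecast horizon is simply the first time the absolute prediction error exceeds the chosen threshold.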
Related papers
- Deterministic Reservoir Computing for Chaotic Time Series Prediction [5.261277318790788]
We propose deterministic alternatives, TCRC-LM and TCRC-CM, to the randomly generated high-dimensional mapping of standard reservoir computing.
To further enhance predictive capability in time series forecasting, we propose the novel use of the Lobachevsky function as the non-linear activation function.
arXiv Detail & Related papers (2025-01-26T17:46:40Z)
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present the Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of more than 40% in some scenarios.
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- Optimization of a Hydrodynamic Computational Reservoir through Evolution [58.720142291102135]
We interface with a model of a hydrodynamic system, under development by a startup, as a computational reservoir.
We optimized the readout times and how inputs are mapped to the wave amplitude or frequency using an evolutionary search algorithm.
Applying evolutionary methods to this reservoir system substantially improved separability on an XNOR task, in comparison to implementations with hand-selected parameters.
arXiv Detail & Related papers (2023-04-20T19:15:02Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Time-shift selection for reservoir computing using a rank-revealing QR algorithm [0.0]
We present a technique to choose the time-shifts by maximizing the rank of the reservoir matrix using a rank-revealing QR algorithm.
We find that our technique provides improved accuracy over random time-shift selection in essentially all cases (a minimal sketch of the pivoted-QR selection step appears after this list).
arXiv Detail & Related papers (2022-11-29T14:55:51Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use a Rational Activation Function (RAF) to learn optimal activation functions during training, based on the input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Otimizacao de pesos e funcoes de ativacao de redes neurais aplicadas na previsao de series temporais (Optimization of weights and activation functions of neural networks applied to time series forecasting) [0.0]
We propose the use of a family of asymmetric activation functions with a free parameter for neural networks.
We show that this family of activation functions satisfies the requirements of the universal approximation theorem.
A methodology is used for the global optimization of the activation functions' free parameter together with the weights of the connections between the processing units of the neural network.
arXiv Detail & Related papers (2021-07-29T23:32:15Z)
- Compressing Deep ODE-Nets using Basis Function Expansions [105.05435207079759]
We consider formulations of the weights as continuous-depth functions using linear combinations of basis functions.
This perspective allows us to compress the weights through a change of basis, without retraining, while maintaining near state-of-the-art performance.
In turn, both inference time and the memory footprint are reduced, enabling quick and rigorous adaptation between computational environments.
arXiv Detail & Related papers (2021-06-21T03:04:51Z)
- Estimating Multiplicative Relations in Neural Networks [0.0]
We use properties of logarithmic functions to propose a pair of activation functions that can translate products into linear expressions and be learned using backpropagation.
We generalize this approach to some complex arithmetic functions and test the accuracy on a distribution disjoint from the training set.
arXiv Detail & Related papers (2020-10-28T14:28:24Z)
- Space of Functions Computed by Deep-Layered Machines [74.13735716675987]
We study the space of functions computed by random-layered machines, including deep neural networks and Boolean circuits.
Investigating the distribution of Boolean functions computed by the recurrent and layer-dependent architectures, we find that it is the same in both models.
arXiv Detail & Related papers (2020-04-19T18:31:03Z)
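As flagged in the time-shift selection entry above, here is a minimal sketch of the rank-revealing (pivoted) QR selection step that entry describes. The toy reservoir signals, the sizes, and the variable names are assumptions made for illustration; only the idea of keeping the candidate time shifts whose columns the pivoted QR ranks first reflects the cited technique.

```python
# Sketch: choose time shifts by maximizing the rank of the reservoir matrix with a
# rank-revealing (column-pivoted) QR. The signals below are a toy stand-in; in the
# cited work the columns come from an actual driven reservoir.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
T, n_nodes, max_shift, k = 500, 10, 20, 8          # illustrative sizes

# Toy "reservoir node" signals: correlated time series (random walks).
base = np.cumsum(rng.normal(size=(T + max_shift, n_nodes)), axis=0)

# Candidate matrix with one column per (node, time-shift) pair.
columns, labels = [], []
for shift in range(max_shift):
    for node in range(n_nodes):
        columns.append(base[shift:shift + T, node])
        labels.append((node, shift))
X = np.column_stack(columns)

# Column-pivoted QR: the first k pivots index the columns that best preserve the
# rank of the candidate matrix, i.e. the most informative (node, shift) pairs.
_, _, piv = qr(X, mode="economic", pivoting=True)
selected = [labels[j] for j in piv[:k]]
print("selected (node, time-shift) pairs:", selected)
```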