Separation capacity of linear reservoirs with random connectivity matrix
- URL: http://arxiv.org/abs/2404.17429v3
- Date: Fri, 21 Mar 2025 03:21:08 GMT
- Title: Separation capacity of linear reservoirs with random connectivity matrix
- Authors: Youness Boutaib
- Abstract summary: We quantify the capacity of random linear reservoirs to map different input time series to separable reservoir states. In the i.i.d. case, we establish that optimal separation with large reservoirs is consistently achieved when the entries of the reservoir matrix are scaled with the exact factor $1/\sqrt{N}$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A natural hypothesis for the success of reservoir computing in generic tasks is the ability of the untrained reservoir to map different input time series to separable reservoir states - a property we term separation capacity. We provide a rigorous mathematical framework to quantify this capacity for random linear reservoirs, showing that it is fully characterised by the spectral properties of the generalised matrix of moments of the random reservoir connectivity matrix. Our analysis focuses on reservoirs with Gaussian connectivity matrices, both symmetric and i.i.d., although the techniques extend naturally to broader classes of random matrices. In the symmetric case, the generalised matrix of moments is a Hankel matrix. Using classical estimates from random matrix theory, we establish that separation capacity deteriorates over time and that, for short inputs, optimal separation in large reservoirs is achieved when the matrix entries are scaled with a factor $\rho_T/\sqrt{N}$, where $N$ is the reservoir dimension and $\rho_T$ depends on the maximum input length. In the i.i.d. case, we establish that optimal separation with large reservoirs is consistently achieved when the entries of the reservoir matrix are scaled with the exact factor $1/\sqrt{N}$, which aligns with common implementations of reservoir computing. We further give upper bounds on the quality of separation as a function of the length of the time series. We complement this analysis with an investigation of the likelihood of this separation and its consistency under different architectural choices.
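To make the setup concrete, here is a small numpy sketch (assuming a simplified linear-reservoir form $x_t = W x_{t-1} + u_t b$, not the paper's exact construction) that draws an i.i.d. Gaussian connectivity matrix with the $1/\sqrt{N}$ scaling and measures how far apart two input time series land in state space:

```python
import numpy as np

def final_state(u, W, b):
    """Drive the linear reservoir x_t = W x_{t-1} + u_t * b from x_0 = 0."""
    x = np.zeros(W.shape[0])
    for u_t in u:
        x = W @ x + u_t * b
    return x

rng = np.random.default_rng(0)
N, T = 500, 10                                       # reservoir dimension, input length
W = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))   # i.i.d. entries at the 1/sqrt(N) scale
b = rng.normal(size=N)                               # input-to-reservoir vector

u1, u2 = rng.normal(size=T), rng.normal(size=T)      # two distinct input time series
x1, x2 = final_state(u1, W, b), final_state(u2, W, b)
print("distance between reservoir states:", round(np.linalg.norm(x1 - x2), 3))
```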
Related papers
- Determinant Estimation under Memory Constraints and Neural Scaling Laws [48.68885778257016]
We derive a novel hierarchical algorithm for large-scale log-determinant calculation in memory-constrained settings.
We show that the ratio of pseudo-determinants satisfies a power-law relationship, allowing us to derive corresponding scaling laws.
This enables accurate estimation of NTK log-determinants from a tiny fraction of the full dataset.
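A toy numpy sketch in the spirit of this idea (not the authors' hierarchical algorithm; the kernel, subset sizes, and jitter term are illustrative assumptions): compute log-determinants of nested kernel submatrices and check how their increments scale with the subset size.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1024, 5))

def rbf_kernel(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

sizes = [64, 128, 256, 512, 1024]
logdets = []
for n in sizes:
    K = rbf_kernel(X[:n], X[:n]) + 1e-6 * np.eye(n)  # jitter for numerical stability
    sign, ld = np.linalg.slogdet(K)
    logdets.append(ld)

# Ratios of determinants in log space; an approximate power law would make
# these increments roughly linear in log(size).
diffs = np.diff(logdets)
slope = np.polyfit(np.log(sizes[1:]), diffs, 1)[0]
print("log-det increments:", np.round(diffs, 2), "fitted slope:", round(slope, 2))
```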
arXiv Detail & Related papers (2025-03-06T13:32:13Z) - Sparse PCA with Oracle Property [115.72363972222622]
We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
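For orientation, a minimal sketch of the estimation target, using truncated power iteration as a stand-in (the paper's estimators are based on semidefinite relaxation, which this does not implement):

```python
import numpy as np

def sparse_pc(S, k, iters=200, seed=2):
    """Leading k-sparse eigenvector of S via truncated power iteration
    (a stand-in illustration, not the paper's semidefinite estimator)."""
    v = np.random.default_rng(seed).normal(size=S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = S @ v
        small = np.argsort(np.abs(v))[:-k]   # zero out all but the k largest entries
        v[small] = 0.0
        v /= np.linalg.norm(v)
    return v

d, k = 50, 5
u = np.zeros(d); u[:k] = 1 / np.sqrt(k)          # planted 5-sparse direction
rng = np.random.default_rng(3)
L = np.linalg.cholesky(np.eye(d) + 5 * np.outer(u, u))
X = rng.normal(size=(1000, d)) @ L.T             # samples with spiked covariance
v = sparse_pc(X.T @ X / 1000, k)
print("overlap with planted direction:", round(abs(v @ u), 3))
```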
arXiv Detail & Related papers (2023-12-28T02:52:54Z) - Large-scale gradient-based training of Mixtures of Factor Analyzers [67.21722742907981]
This article contributes both a theoretical analysis as well as a new method for efficient high-dimensional training by gradient descent.
We prove that MFA training and inference/sampling can be performed based on precision matrices, which does not require matrix inversions after training is completed.
Besides the theoretical analysis, we apply MFA to typical image datasets such as SVHN and MNIST, and demonstrate the ability to perform sample generation and outlier detection.
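A hedged single-component sketch of the precision-matrix idea (dimensions and parameters are made up): with covariance $WW^T + \Psi$, the Woodbury identity yields the precision from a small latent-dimensional solve, avoiding any full inversion.

```python
import numpy as np

rng = np.random.default_rng(4)
d, l = 100, 8                        # observed dim, latent dim
W = rng.normal(size=(d, l)) * 0.3    # factor loadings
psi = 0.5 + rng.random(d)            # diagonal noise variances

# Sampling: x = W z + eps, so Cov(x) = W W^T + diag(psi).
z = rng.normal(size=l)
x = W @ z + rng.normal(size=d) * np.sqrt(psi)

# Precision by Woodbury: (W W^T + Psi)^-1 = Psi^-1 - Psi^-1 W M^-1 W^T Psi^-1
# with M = I + W^T Psi^-1 W (only an l x l solve, never a d x d inversion).
Pinv = np.diag(1.0 / psi)
M = np.eye(l) + W.T @ Pinv @ W
precision = Pinv - Pinv @ W @ np.linalg.solve(M, W.T @ Pinv)

cov = W @ W.T + np.diag(psi)
print("max |precision @ cov - I|:", np.abs(precision @ cov - np.eye(d)).max())
```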
arXiv Detail & Related papers (2023-08-26T06:12:33Z) - Singular value decomposition based matrix surgery [0.0]
We develop a procedure to reduce and control the condition number of random matrices.
We investigate the effect on the persistent homology (PH) of point clouds of well- and ill-conditioned matrices.
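One simple way to realise such surgery (an assumed variant, not necessarily the paper's exact procedure) is to clip the small singular values so the condition number cannot exceed a target:

```python
import numpy as np

def cap_condition_number(A, kappa_max):
    """Clip small singular values so that cond(A') <= kappa_max."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_clipped = np.maximum(s, s[0] / kappa_max)   # s[0] is the largest singular value
    return U @ np.diag(s_clipped) @ Vt

A = np.random.default_rng(5).normal(size=(200, 200))
B = cap_condition_number(A, kappa_max=50.0)
print("cond before:", round(np.linalg.cond(A)), "after:", round(np.linalg.cond(B)))
```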
arXiv Detail & Related papers (2023-02-22T15:30:08Z) - Concentration of Random Feature Matrices in High-Dimensions [7.1171757928258135]
The spectra of random feature matrices provide information on the conditioning of the linear system used in random feature regression problems.
We consider two settings for the two input variables, either both are random variables or one is a random variable and the other is well-separated.
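A quick empirical look at this object (the setup is assumed for illustration): form a random cosine-feature matrix and inspect its extreme singular values, which govern the conditioning of the regression system.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, m = 400, 10, 800               # samples, input dim, number of features
X = rng.normal(size=(n, d))          # random input data
Omega = rng.normal(size=(d, m))      # random feature weights

Phi = np.cos(X @ Omega)              # random (cosine) feature matrix
s = np.linalg.svd(Phi, compute_uv=False)
print("largest/smallest singular value:", round(s[0], 2), round(s[-1], 4),
      "condition number:", round(s[0] / s[-1], 1))
```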
arXiv Detail & Related papers (2022-04-14T13:01:27Z) - Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be simultaneously applied to a dataset $X$ in total time.
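For intuition, a naive leverage-score sampler (computed via a full SVD, so deliberately not input-sparsity time) that approximates a Gram matrix from sampled, rescaled rows:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(2000, 20))                 # tall matrix; Gram target A^T A

U, _, _ = np.linalg.svd(A, full_matrices=False)
lev = (U ** 2).sum(axis=1)                      # row leverage scores, sum to the rank
p = lev / lev.sum()

s = 300                                         # number of sampled rows
idx = rng.choice(len(A), size=s, p=p)
S = A[idx] / np.sqrt(s * p[idx, None])          # rescaled sampled rows

err = np.linalg.norm(S.T @ S - A.T @ A) / np.linalg.norm(A.T @ A)
print("relative Gram approximation error:", round(err, 3))
```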
arXiv Detail & Related papers (2022-02-09T15:26:03Z) - On Linear Separability under Linear Compression with Applications to Hard Support Vector Machine [0.0]
We show that linear separability is maintained as long as the distortion of the inner products is smaller than the squared margin of the original data-generating distribution.
As applications, we derive bounds on the (i) compression length of random sub-Gaussian matrices; and (ii) generalization error for compressive learning with hard-SVM.
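A small numerical check of the phenomenon (margins, dimensions, and the Gaussian compression are illustrative assumptions): compress well-separated labelled data and verify that a positive margin survives along the compressed separating direction.

```python
import numpy as np

rng = np.random.default_rng(8)
d, m, n = 500, 100, 200                          # ambient dim, compressed dim, points
w = rng.normal(size=d); w /= np.linalg.norm(w)   # true separating direction

y = rng.choice([-1.0, 1.0], size=n)              # labels
X = y[:, None] * w + 0.2 * rng.normal(size=(n, d)) / np.sqrt(d)  # margin ~ 1 data

G = rng.normal(size=(d, m)) / np.sqrt(m)         # Gaussian compression matrix
margin = (y * ((X @ G) @ (G.T @ w))).min()       # margin after compressing X and w
print("worst compressed margin:", round(margin, 3), "(positive => still separable)")
```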
arXiv Detail & Related papers (2022-02-02T16:23:01Z) - Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
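As background, the classical Nyström step already runs mechanically on an indefinite matrix when the core block is pseudo-inverted; the sketch below illustrates that reconstruction (the shifted RBF similarity and landmark count are assumptions, and it is not sublinear since it builds the full matrix).

```python
import numpy as np

rng = np.random.default_rng(9)
n, s = 600, 150
X = rng.normal(size=(n, 6))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
S_full = np.exp(-0.5 * D2) - 0.4                 # shifted RBF: an indefinite similarity

idx = rng.choice(n, size=s, replace=False)       # landmark indices
C = S_full[:, idx]                               # n x s sampled columns
W = S_full[np.ix_(idx, idx)]                     # s x s core block

S_hat = C @ np.linalg.pinv(W) @ C.T              # Nystrom-style reconstruction
err = np.linalg.norm(S_hat - S_full) / np.linalg.norm(S_full)
print("relative approximation error:", round(err, 3))
```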
arXiv Detail & Related papers (2021-12-17T17:04:34Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
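A minimal unadjusted Langevin sampler of the kind such analyses cover (the target, step size, and burn-in are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(10)
d, eta, steps = 20, 0.05, 5000
Sigma_inv = np.diag(np.linspace(0.5, 2.0, d))     # target: N(0, Sigma)

x = np.zeros(d)
samples = []
for _ in range(steps):
    grad = Sigma_inv @ x                          # gradient of the negative log-density
    x = x - eta * grad + np.sqrt(2 * eta) * rng.normal(size=d)
    samples.append(x.copy())

emp_var = np.var(np.array(samples)[1000:], axis=0)  # discard burn-in
print("target vs empirical variance (first 3 coords):",
      np.round(1 / np.diag(Sigma_inv)[:3], 2), np.round(emp_var[:3], 2))
```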
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Near-Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time [46.31710224483631]
We show how to bypass the main open question of Nelson and Nguyen (FOCS, 2013) regarding the logarithmic factors in the sketching dimension for existing constant factor approximation oblivious subspace embeddings.
A key technique we use is an explicit mapping of Indyk based on uncertainty principles and extractors.
For the fundamental problems of rank computation and finding a linearly independent subset of columns, our algorithms improve Cheung, Kwok, and Lau (JACM, 2013) and are optimal to within a constant factor and a $\log\log(n)$-factor, respectively.
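For context, a plain CountSketch oblivious subspace embedding (not the paper's construction) shows the object being optimized: the sketching dimension needed to preserve the singular values of a tall matrix.

```python
import numpy as np

rng = np.random.default_rng(11)
n, d, m = 5000, 10, 400                     # rows, subspace dim, sketch size
A = rng.normal(size=(n, d))

# CountSketch: each row of A is hashed to one row of SA with a random sign.
h = rng.integers(0, m, size=n)
sgn = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((m, d))
np.add.at(SA, h, sgn[:, None] * A)

# Rough check of distortion over the column space: singular values of SA vs. A.
sA = np.linalg.svd(A, compute_uv=False)
sSA = np.linalg.svd(SA, compute_uv=False)
print("singular value ratios:", np.round(sSA / sA, 2))
```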
arXiv Detail & Related papers (2021-07-16T19:34:10Z) - Correlations and commuting transfer matrices in integrable unitary circuits [0.0]
We consider a unitary circuit where the underlying gates are chosen to be R-matrices satisfying the Yang-Baxter equation, so that correlation functions can be expressed through a transfer matrix formalism.
In all cases, the Bethe equations reduce to those of the integrable spin-1 chain with SU(2) symmetry, significantly reducing the total number of eigenstates required in the calculation of correlation functions.
arXiv Detail & Related papers (2021-06-01T17:10:28Z) - Sequential Estimation of Convex Divergences using Reverse Submartingales and Exchangeable Filtrations [31.088836418378534]
We present a unified technique for sequential estimation of convex divergences between distributions.
The technical underpinnings of our approach lie in the observation that empirical convex divergences are (partially ordered) reverse submartingales.
These techniques appear to be powerful additions to the existing literature on both confidence sequences and convex divergences.
arXiv Detail & Related papers (2021-03-16T18:22:14Z) - Reservoir Computers Modal Decomposition and Optimization [0.0]
We obtain a decomposition of the reservoir dynamics into modes, which can be computed independently of one another.
We then take a parametric approach in which the eigenvalues are parameters that can be appropriately designed and optimized.
We show that manipulations of the individual modes, either in terms of the eigenvalues or the time shifts, can lead to dramatic reductions in the training error.
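A hedged sketch of the modal picture (assuming a simple linear update $x_t = W x_{t-1} + u_t b$ and a diagonalisable $W$): in the eigenbasis of the connectivity matrix, each mode evolves as an independent scalar recursion.

```python
import numpy as np

rng = np.random.default_rng(12)
N, T = 100, 50
W = rng.normal(0, 1 / np.sqrt(N), size=(N, N))
b = rng.normal(size=N)
u = rng.normal(size=T)

# Direct simulation: x_t = W x_{t-1} + u_t * b.
x = np.zeros(N)
for u_t in u:
    x = W @ x + u_t * b

# Modal version: in the eigenbasis of W, each coordinate evolves independently
# as the scalar recursion y_t = lambda_i y_{t-1} + u_t (V^-1 b)_i.
lam, V = np.linalg.eig(W)
c = np.linalg.solve(V, b.astype(complex))
y = np.zeros(N, dtype=complex)
for u_t in u:
    y = lam * y + u_t * c
x_modal = (V @ y).real

print("max deviation between simulations:", np.abs(x - x_modal).max())
```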
arXiv Detail & Related papers (2021-01-13T23:30:21Z) - Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k\times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2}\log^2\frac{1}{\epsilon})$ samples, a number linear in the high dimension $k$ and nearly linear in the typically low rank $r$.
arXiv Detail & Related papers (2020-09-30T19:10:32Z) - Fused-Lasso Regularized Cholesky Factors of Large Nonstationary Covariance Matrices of Longitudinal Data [0.0]
The smoothness of the subdiagonals of the Cholesky factor of large covariance matrices is closely related to the degree of nonstationarity of autoregressive models for time series and longitudinal data.
We propose an algorithm for sparse estimation of the Cholesky factor which decouples row-wise.
arXiv Detail & Related papers (2020-07-22T02:38:16Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum-of-Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z) - Relative Error Bound Analysis for Nuclear Norm Regularized Matrix Completion [101.83262280224729]
We develop a relative error bound for nuclear norm regularized matrix completion.
We derive a relative upper bound for recovering the best low-rank approximation of the unknown matrix.
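As a concrete instance of such an estimator, a minimal soft-impute/singular-value-thresholding loop (the threshold, sampling rate, and iteration count are illustrative; the paper's contribution is the error bound, not this solver):

```python
import numpy as np

rng = np.random.default_rng(13)
n, r = 60, 3
M = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))   # ground-truth low-rank matrix
mask = rng.random((n, n)) < 0.5                         # observed entries

X, tau = np.zeros((n, n)), 5.0
for _ in range(300):
    X[mask] = M[mask]                                   # enforce observed entries
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = U @ np.diag(np.maximum(s - tau, 0)) @ Vt        # soft-threshold singular values

err = np.linalg.norm(X - M) / np.linalg.norm(M)
print("relative recovery error:", round(err, 3))
```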
arXiv Detail & Related papers (2015-04-26T13:12:16Z)