SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data
- URL: http://arxiv.org/abs/2401.00122v1
- Date: Sat, 30 Dec 2023 02:36:53 GMT
- Title: SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data
- Authors: Ali Eshragh, Luke Yerbury, Asef Nazari, Fred Roosta, and Michael W. Mahoney
- Abstract summary: We develop a new efficient sequential approximate leverage score algorithm, SALSA, using methods from randomized numerical linear algebra.
We show that the theoretical computational complexity and numerical accuracy of SALSA surpass those of existing approximation algorithms.
Our proposed algorithm is, with high probability, guaranteed to find the maximum likelihood estimates of the parameters of the true underlying ARMA model.
- Score: 46.42365692992566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a new efficient sequential approximate leverage score algorithm,
SALSA, using methods from randomized numerical linear algebra (RandNLA) for
large matrices. We demonstrate that, with high probability, the accuracy of
SALSA's approximations is within a $(1 + O(\varepsilon))$ factor of the true
leverage scores. In addition, we show that the theoretical computational
complexity and numerical accuracy of SALSA surpass those of existing
approximation algorithms. These theoretical
results are subsequently utilized to develop an efficient algorithm, named
LSARMA, for fitting an appropriate ARMA model to large-scale time series data.
Our proposed algorithm is, with high probability, guaranteed to find the
maximum likelihood estimates of the parameters of the true underlying ARMA
model. Furthermore, its worst-case running time significantly improves on
those of state-of-the-art alternatives in big-data regimes.
Empirical results on large-scale data strongly support these theoretical
results and underscore the efficacy of our new approach.
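As context for the RandNLA machinery behind SALSA, the following is a minimal sketch of the standard two-step randomized leverage-score approximation: apply a Gaussian row sketch, QR-factor the sketched matrix, and take squared row norms of A R^{-1}. This illustrates the generic technique only, not the sequential algorithm proposed in the paper, and the sketch size `4 * d` is an illustrative choice.

```python
import numpy as np

def exact_leverage_scores(A):
    """Exact leverage scores: squared row norms of an orthonormal basis
    Q of the column space of A (n x d, n >> d, full column rank)."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

def approx_leverage_scores(A, sketch_rows=None, seed=None):
    """Approximate leverage scores via a Gaussian row sketch:
    QR-factor SA, then estimate scores as squared row norms of
    A @ R^{-1}.  Generic RandNLA recipe, not the SALSA algorithm."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_rows or 4 * d                      # illustrative sketch size
    S = rng.standard_normal((s, n)) / np.sqrt(s)  # Gaussian sketching matrix
    _, R = np.linalg.qr(S @ A, mode="reduced")
    AR_inv = np.linalg.solve(R.T, A.T).T          # A @ R^{-1}, no explicit inverse
    return np.sum(AR_inv**2, axis=1)

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 20))
exact = exact_leverage_scores(A)
approx = approx_leverage_scores(A, seed=1)
print("max relative error:", np.max(np.abs(approx / exact - 1.0)))
```

The triangular solve in place of an explicit inverse is the standard numerically stable choice; SALSA's contribution is to make this style of approximation sequential and cheap for the structured matrices arising in time-series fitting, whereas the sketch above recomputes everything from scratch.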
Related papers
- Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data [9.913418444556486]
We show how iterative methods can be used to reduce the computational costs of calculating likelihoods, gradients, and predictive distributions with full-scale approximations (FSAs).
We also present a novel, accurate, and fast way to calculate predictive variances that relies on estimation and iterative methods.
All methods are implemented in a free C++ software library with high-level Python and R packages.
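To see why iterative methods pay off here: the expensive primitive in GP likelihood evaluation and prediction is the linear solve (K + sigma^2 I)^{-1} y, which conjugate gradients performs using only matrix-vector products. Below is a minimal sketch with a dense RBF kernel; the paper's FSA structure, preconditioners, and variance estimators are not reproduced, and the kernel and noise settings are assumptions.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def rbf_kernel(X, lengthscale=0.5):
    """Dense squared-exponential kernel matrix."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, size=(n, 2))
y = np.sin(4.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

K = rbf_kernel(X)
noise_var = 0.1**2
# CG only touches (K + sigma^2 I) through matvecs, so K could just as
# well be an implicit full-scale approximation instead of a dense array.
A = LinearOperator((n, n), matvec=lambda v: K @ v + noise_var * v)

alpha, info = cg(A, y)    # iterative solve of (K + sigma^2 I) alpha = y
print("CG converged" if info == 0 else f"CG info = {info}")
print("log-likelihood quadratic term:", -0.5 * y @ alpha)
```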
arXiv Detail & Related papers (2024-05-23T12:25:22Z)
- Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data [0.0]
This research develops an approximate Gibbs sampler (AGS) to efficiently learn hierarchical Bayesian Poisson regression models (HBPRMs) while maintaining inference accuracy.
Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS.
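The flavor of the Gibbs updates that an approximate sampler accelerates can be seen in a much simpler, fully conjugate Poisson-Gamma hierarchy for grouped counts. This is an illustrative toy, not the paper's AGS or its HBPRM, and all hyperparameter values are assumptions.

```python
import numpy as np

def gibbs_grouped_counts(y_groups, alpha=1.0, a=1.0, b=1.0,
                         n_iter=2000, seed=None):
    """Gibbs sampler for  y_ij ~ Poisson(lam_j),  lam_j ~ Gamma(alpha, beta),
    beta ~ Gamma(a, b)  (shape/rate parameterization).  Every conditional
    is conjugate, so each sweep is two exact Gamma draws; numpy's gamma
    takes a scale argument, hence the 1/rate conversions."""
    rng = np.random.default_rng(seed)
    J = len(y_groups)
    sums = np.array([np.sum(g) for g in y_groups], dtype=float)
    ns = np.array([len(g) for g in y_groups], dtype=float)
    beta = 1.0
    draws = np.empty((n_iter, J))
    for t in range(n_iter):
        # lam_j | rest ~ Gamma(alpha + sum_i y_ij, beta + n_j)
        lam = rng.gamma(alpha + sums, 1.0 / (beta + ns))
        # beta | rest ~ Gamma(a + J * alpha, b + sum_j lam_j)
        beta = rng.gamma(a + J * alpha, 1.0 / (b + lam.sum()))
        draws[t] = lam
    return draws

rng = np.random.default_rng(1)
groups = [rng.poisson(lam, size=50) for lam in (2.0, 5.0, 20.0)]
draws = gibbs_grouped_counts(groups, seed=1)
print(draws[500:].mean(axis=0))   # posterior means after burn-in
```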
arXiv Detail & Related papers (2022-11-28T21:00:55Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on a recent practical envelope-smoothing technique.
Our proposed algorithm can also be used to minimize the sum of ranked-range loss, which likewise lacks efficient solvers.
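For reference, the quantity being optimized is already computable, though not efficiently optimizable at scale, off the shelf: scikit-learn's roc_auc_score accepts a max_fpr argument and returns the standardized partial AUC over the FPR range [0, max_fpr]. The data below are synthetic, and the paper's two-sided FPR range and smoothed gradient method go beyond this one-sided case.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
# Scores that are informative about the label, plus noise.
y_score = y_true + 0.8 * rng.standard_normal(1000)

full_auc = roc_auc_score(y_true, y_score)
partial = roc_auc_score(y_true, y_score, max_fpr=0.1)  # FPR in [0, 0.1]
print(f"AUC = {full_auc:.3f}, standardized pAUC(FPR <= 0.1) = {partial:.3f}")
```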
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
- Toeplitz Least Squares Problems, Fast Algorithms and Big Data [1.3535770763481905]
Two recent algorithms have applied randomized numerical linear algebra techniques to fitting an autoregressive model to big time-series data.
We investigate and compare the quality of these two approximation algorithms on large-scale synthetic and real-world data.
While both algorithms display comparable results for synthetic datasets, the LSAR algorithm appears to be more robust when applied to real-world time series data.
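The shared core of those two algorithms is a Toeplitz-structured least-squares problem: conditional maximum likelihood for an AR(p) model is an ordinary least-squares fit whose design matrix stacks lagged copies of the series. A direct, non-randomized baseline sketch follows; the randomized leverage-score machinery of LSAR is deliberately omitted.

```python
import numpy as np

def fit_ar_least_squares(x, p):
    """Fit an AR(p) model by conditional least squares.
    Row t of the design matrix is (x[t-1], ..., x[t-p]), so the matrix
    is Toeplitz; randomized algorithms exploit that structure instead
    of solving the dense problem directly as done here."""
    n = len(x)
    X = np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
true = np.array([0.6, -0.3])          # stationary AR(2) coefficients
x = np.zeros(10000)
for t in range(2, len(x)):
    x[t] = true @ x[t - 2 : t][::-1] + rng.standard_normal()
print(fit_ar_least_squares(x, p=2))   # should be close to [0.6, -0.3]
```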
arXiv Detail & Related papers (2021-12-24T08:32:09Z)
- An Improved Frequent Directions Algorithm for Low-Rank Approximation via Block Krylov Iteration [11.62834880315581]
This paper presents a fast and accurate Frequent Directions algorithm named r-BKIFD.
The proposed r-BKIFD has an error bound comparable to that of the original Frequent Directions, and the approximation error can be made arbitrarily small when the number of iterations is chosen appropriately.
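For orientation, here is a minimal implementation of the original Frequent Directions streaming sketch that r-BKIFD builds on; the Block Krylov refinement itself is not shown, and the sketch size is an arbitrary example value.

```python
import numpy as np

def frequent_directions(A, ell):
    """Original Frequent Directions: stream rows of A (n x d) into an
    ell x d sketch B.  When no buffer row is free, an SVD shrinks every
    direction by the smallest squared singular value, zeroing at least
    one row.  Guarantee: ||A.T @ A - B.T @ B||_2 <= 2 ||A||_F^2 / ell."""
    _, d = A.shape
    B = np.zeros((ell, d))
    for row in A:
        free = np.where(~B.any(axis=1))[0]
        if free.size == 0:
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s**2 - s[-1] ** 2, 0.0))
            B = shrunk[:, None] * Vt      # last row is now exactly zero
            free = np.where(~B.any(axis=1))[0]
        B[free[0]] = row
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 50)) @ rng.standard_normal((50, 50))
B = frequent_directions(A, ell=20)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = 2 * np.linalg.norm(A, "fro") ** 2 / 20
print(f"spectral error = {err:.1f}, FD bound = {bound:.1f}")
```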
arXiv Detail & Related papers (2021-09-24T01:36:42Z)
- Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient Descent [8.714458129632158]
The Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm based on regularized dual optimization combined with an enhanced gradient descent (GD) method.
It is shown that the accuracy of logical-relation mining for interpretability using the proposed KM learning algorithm exceeds $80\%$.
arXiv Detail & Related papers (2021-07-11T10:33:02Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
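A rough sketch of the rank-revealing half of that combination: a column-pivoted QR estimates the numerical rank r, after which the leverage scores are the squared row norms of the first r columns of Q. The paper's randomized dimensionality-reduction stage is omitted here, and the tolerance is an illustrative choice.

```python
import numpy as np
from scipy.linalg import qr

def leverage_scores_rank_revealing(A, tol=1e-10):
    """Leverage scores of a possibly rank-deficient matrix via
    column-pivoted QR: estimate the numerical rank r from the decay of
    |diag(R)|, then take squared row norms of the first r columns of Q."""
    Q, R, _ = qr(A, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    r = int(np.sum(diag > tol * diag[0]))   # numerical rank estimate
    return np.sum(Q[:, :r] ** 2, axis=1), r

rng = np.random.default_rng(0)
# 1000 x 30 matrix of exact rank 10
A = rng.standard_normal((1000, 10)) @ rng.standard_normal((10, 30))
scores, rank = leverage_scores_rank_revealing(A)
print(rank, scores.sum())   # leverage scores sum to the rank
```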
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing [49.73889315176884]
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
We build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning.
arXiv Detail & Related papers (2021-05-18T05:23:53Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)