SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data
- URL: http://arxiv.org/abs/2401.00122v1
- Date: Sat, 30 Dec 2023 02:36:53 GMT
- Title: SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data
- Authors: Ali Eshragh, Luke Yerbury, Asef Nazari, Fred Roosta, and Michael W. Mahoney
- Abstract summary: We develop a new efficient sequential approximate leverage score algorithm, SALSA, using methods from randomized numerical linear algebra.
We show that the theoretical computational complexity and numerical accuracy of SALSA surpass those of existing approximation algorithms.
Our proposed algorithm is, with high probability, guaranteed to find the maximum likelihood estimates of the parameters of the true underlying ARMA model.
- Score: 46.42365692992566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a new efficient sequential approximate leverage score algorithm,
SALSA, using methods from randomized numerical linear algebra (RandNLA) for
large matrices. We demonstrate that, with high probability, the accuracy of
SALSA's approximations is within a $(1 + O(\varepsilon))$ factor of the true
leverage scores. In addition, we show that the theoretical computational
complexity and numerical accuracy of SALSA surpass those of existing
approximation algorithms. These theoretical
results are subsequently utilized to develop an efficient algorithm, named
LSARMA, for fitting an appropriate ARMA model to large-scale time series data.
Our proposed algorithm is, with high probability, guaranteed to find the
maximum likelihood estimates of the parameters of the true underlying ARMA
model. Furthermore, its worst-case running time significantly improves on
those of state-of-the-art alternatives in big-data regimes.
Empirical results on large-scale data strongly support these theoretical
results and underscore the efficacy of our new approach.
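As context for the RandNLA machinery behind SALSA, the following is a minimal sketch of the standard two-step randomized leverage-score approximation: apply a Gaussian row sketch, QR-factor the sketched matrix, and take squared row norms of A R^{-1}. This illustrates the generic technique only, not the sequential algorithm proposed in the paper, and the sketch size `4 * d` is an illustrative choice.

```python
import numpy as np

def exact_leverage_scores(A):
    """Exact leverage scores: squared row norms of an orthonormal basis
    Q of the column space of A (n x d, n >> d, full column rank)."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

def approx_leverage_scores(A, sketch_rows=None, seed=None):
    """Approximate leverage scores via a Gaussian row sketch:
    QR-factor SA, then estimate scores as squared row norms of
    A @ R^{-1}.  Generic RandNLA recipe, not the SALSA algorithm."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_rows or 4 * d                      # illustrative sketch size
    S = rng.standard_normal((s, n)) / np.sqrt(s)  # Gaussian sketching matrix
    _, R = np.linalg.qr(S @ A, mode="reduced")
    AR_inv = np.linalg.solve(R.T, A.T).T          # A @ R^{-1}, no explicit inverse
    return np.sum(AR_inv**2, axis=1)

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 20))
exact = exact_leverage_scores(A)
approx = approx_leverage_scores(A, seed=1)
print("max relative error:", np.max(np.abs(approx / exact - 1.0)))
```

The triangular solve in place of an explicit inverse is the standard numerically stable choice; SALSA's contribution is to make this style of approximation sequential and cheap for the structured matrices arising in time-series fitting, whereas the sketch above recomputes everything from scratch.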
Related papers
- Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data [9.913418444556486]
We show how iterative methods can be used to reduce the computational costs of calculating likelihoods, gradients, and predictive distributions with full-scale approximations (FSAs).
We also present a novel, accurate, and fast way to calculate predictive variances that relies on estimation and iterative methods.
All methods are implemented in a free C++ software library with high-level Python and R packages.
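To see why iterative methods pay off here: the expensive primitive in GP likelihood evaluation and prediction is the linear solve (K + sigma^2 I)^{-1} y, which conjugate gradients performs using only matrix-vector products. Below is a minimal sketch with a dense RBF kernel; the paper's FSA structure, preconditioners, and variance estimators are not reproduced, and the kernel and noise settings are assumptions.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def rbf_kernel(X, lengthscale=0.5):
    """Dense squared-exponential kernel matrix."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, size=(n, 2))
y = np.sin(4.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

K = rbf_kernel(X)
noise_var = 0.1**2
# CG only touches (K + sigma^2 I) through matvecs, so K could just as
# well be an implicit full-scale approximation instead of a dense array.
A = LinearOperator((n, n), matvec=lambda v: K @ v + noise_var * v)

alpha, info = cg(A, y)    # iterative solve of (K + sigma^2 I) alpha = y
print("CG converged" if info == 0 else f"CG info = {info}")
print("log-likelihood quadratic term:", -0.5 * y @ alpha)
```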
arXiv Detail & Related papers (2024-05-23T12:25:22Z)
- Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data [0.0]
This research develops an approximate Gibbs sampler (AGS) to efficiently learn hierarchical Bayesian Poisson regression models (HBPRMs) while maintaining inference accuracy.
Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS.
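The flavor of the Gibbs updates that an approximate sampler accelerates can be seen in a much simpler, fully conjugate Poisson-Gamma hierarchy for grouped counts. This is an illustrative toy, not the paper's AGS or its HBPRM, and all hyperparameter values are assumptions.

```python
import numpy as np

def gibbs_grouped_counts(y_groups, alpha=1.0, a=1.0, b=1.0,
                         n_iter=2000, seed=None):
    """Gibbs sampler for  y_ij ~ Poisson(lam_j),  lam_j ~ Gamma(alpha, beta),
    beta ~ Gamma(a, b)  (shape/rate parameterization).  Every conditional
    is conjugate, so each sweep is two exact Gamma draws; numpy's gamma
    takes a scale argument, hence the 1/rate conversions."""
    rng = np.random.default_rng(seed)
    J = len(y_groups)
    sums = np.array([np.sum(g) for g in y_groups], dtype=float)
    ns = np.array([len(g) for g in y_groups], dtype=float)
    beta = 1.0
    draws = np.empty((n_iter, J))
    for t in range(n_iter):
        # lam_j | rest ~ Gamma(alpha + sum_i y_ij, beta + n_j)
        lam = rng.gamma(alpha + sums, 1.0 / (beta + ns))
        # beta | rest ~ Gamma(a + J * alpha, b + sum_j lam_j)
        beta = rng.gamma(a + J * alpha, 1.0 / (b + lam.sum()))
        draws[t] = lam
    return draws

rng = np.random.default_rng(1)
groups = [rng.poisson(lam, size=50) for lam in (2.0, 5.0, 20.0)]
draws = gibbs_grouped_counts(groups, seed=1)
print(draws[500:].mean(axis=0))   # posterior means after burn-in
```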
arXiv Detail & Related papers (2022-11-28T21:00:55Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on a recent practical envelope-smoothing technique.
Our proposed algorithm can also be used to minimize the sum of ranked-range loss, which likewise lacks efficient solvers.
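For reference, the quantity being optimized is already computable, though not efficiently optimizable at scale, off the shelf: scikit-learn's roc_auc_score accepts a max_fpr argument and returns the standardized partial AUC over the FPR range [0, max_fpr]. The data below are synthetic, and the paper's two-sided FPR range and smoothed gradient method go beyond this one-sided case.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
# Scores that are informative about the label, plus noise.
y_score = y_true + 0.8 * rng.standard_normal(1000)

full_auc = roc_auc_score(y_true, y_score)
partial = roc_auc_score(y_true, y_score, max_fpr=0.1)  # FPR in [0, 0.1]
print(f"AUC = {full_auc:.3f}, standardized pAUC(FPR <= 0.1) = {partial:.3f}")
```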
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
- Toeplitz Least Squares Problems, Fast Algorithms and Big Data [1.3535770763481905]
Two recent algorithms have applied randomized numerical linear algebra techniques to fitting an autoregressive model to big time-series data.
We investigate and compare the quality of these two approximation algorithms on large-scale synthetic and real-world data.
While both algorithms display comparable results for synthetic datasets, the LSAR algorithm appears to be more robust when applied to real-world time series data.
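The shared core of those two algorithms is a Toeplitz-structured least-squares problem: conditional maximum likelihood for an AR(p) model is an ordinary least-squares fit whose design matrix stacks lagged copies of the series. A direct, non-randomized baseline sketch follows; the randomized leverage-score machinery of LSAR is deliberately omitted.

```python
import numpy as np

def fit_ar_least_squares(x, p):
    """Fit an AR(p) model by conditional least squares.
    Row t of the design matrix is (x[t-1], ..., x[t-p]), so the matrix
    is Toeplitz; randomized algorithms exploit that structure instead
    of solving the dense problem directly as done here."""
    n = len(x)
    X = np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
true = np.array([0.6, -0.3])          # stationary AR(2) coefficients
x = np.zeros(10000)
for t in range(2, len(x)):
    x[t] = true @ x[t - 2 : t][::-1] + rng.standard_normal()
print(fit_ar_least_squares(x, p=2))   # should be close to [0.6, -0.3]
```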
arXiv Detail & Related papers (2021-12-24T08:32:09Z)
- An Improved Frequent Directions Algorithm for Low-Rank Approximation via Block Krylov Iteration [11.62834880315581]
This paper presents a fast and accurate Frequent Directions algorithm named r-BKIFD.
The proposed r-BKIFD has an error bound comparable to that of the original Frequent Directions, and the approximation error can be made arbitrarily small when the number of iterations is chosen appropriately.
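For orientation, here is a minimal implementation of the original Frequent Directions streaming sketch that r-BKIFD builds on; the Block Krylov refinement itself is not shown, and the sketch size is an arbitrary example value.

```python
import numpy as np

def frequent_directions(A, ell):
    """Original Frequent Directions: stream rows of A (n x d) into an
    ell x d sketch B.  When no buffer row is free, an SVD shrinks every
    direction by the smallest squared singular value, zeroing at least
    one row.  Guarantee: ||A.T @ A - B.T @ B||_2 <= 2 ||A||_F^2 / ell."""
    _, d = A.shape
    B = np.zeros((ell, d))
    for row in A:
        free = np.where(~B.any(axis=1))[0]
        if free.size == 0:
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s**2 - s[-1] ** 2, 0.0))
            B = shrunk[:, None] * Vt      # last row is now exactly zero
            free = np.where(~B.any(axis=1))[0]
        B[free[0]] = row
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 50)) @ rng.standard_normal((50, 50))
B = frequent_directions(A, ell=20)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = 2 * np.linalg.norm(A, "fro") ** 2 / 20
print(f"spectral error = {err:.1f}, FD bound = {bound:.1f}")
```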
arXiv Detail & Related papers (2021-09-24T01:36:42Z)
- Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient Descent [8.714458129632158]
The Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm based on regularized dual optimization combined with an enhanced gradient descent (GD) method.
It is shown that the accuracy of logical-relation mining for interpretability using the proposed KM learning algorithm exceeds $80\%$.
arXiv Detail & Related papers (2021-07-11T10:33:02Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
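A rough sketch of the rank-revealing half of that combination: a column-pivoted QR estimates the numerical rank r, after which the leverage scores are the squared row norms of the first r columns of Q. The paper's randomized dimensionality-reduction stage is omitted here, and the tolerance is an illustrative choice.

```python
import numpy as np
from scipy.linalg import qr

def leverage_scores_rank_revealing(A, tol=1e-10):
    """Leverage scores of a possibly rank-deficient matrix via
    column-pivoted QR: estimate the numerical rank r from the decay of
    |diag(R)|, then take squared row norms of the first r columns of Q."""
    Q, R, _ = qr(A, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    r = int(np.sum(diag > tol * diag[0]))   # numerical rank estimate
    return np.sum(Q[:, :r] ** 2, axis=1), r

rng = np.random.default_rng(0)
# 1000 x 30 matrix of exact rank 10
A = rng.standard_normal((1000, 10)) @ rng.standard_normal((10, 30))
scores, rank = leverage_scores_rank_revealing(A)
print(rank, scores.sum())   # leverage scores sum to the rank
```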
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing [49.73889315176884]
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
We build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning.
arXiv Detail & Related papers (2021-05-18T05:23:53Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)