Related papers: Streaming Sparse Linear Regression

Streaming Sparse Linear Regression

URL: http://arxiv.org/abs/2211.06039v1
Date: Fri, 11 Nov 2022 07:31:55 GMT
Title: Streaming Sparse Linear Regression
Authors: Shuoguang Yang, Yuhao Yan, Xiuneng Zhu, Qiang Sun
Abstract summary: We propose a novel online sparse linear regression framework for analyzing streaming data when data points arrive sequentially. Our proposed method is memory efficient and requires less stringent restricted strong convexity assumptions.
Score: 1.8707139489039097
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sparse regression has been a popular approach to perform variable selection and enhance the prediction accuracy and interpretability of the resulting statistical model. Existing approaches focus on offline regularized regression, while the online scenario has rarely been studied. In this paper, we propose a novel online sparse linear regression framework for analyzing streaming data when data points arrive sequentially. Our proposed method is memory efficient and requires less stringent restricted strong convexity assumptions. Theoretically, we show that with a properly chosen regularization parameter, the $\ell_2$-norm statistical error of our estimator diminishes to zero in the optimal order of $\tilde{O}({\sqrt{s/t}})$, where $s$ is the sparsity level, $t$ is the streaming sample size, and $\tilde{O}(\cdot)$ hides logarithmic terms. Numerical experiments demonstrate the practical efficiency of our algorithm.
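To make the streaming setting concrete, here is a minimal sketch of a generic online sparse regression loop. It is not the authors' algorithm: it runs a few proximal-gradient (ISTA) steps per arriving sample on running sufficient statistics, so memory stays at $O(d^2)$ regardless of the stream length $t$. The class name `StreamingLasso`, the constant `c`, the number of `inner_steps`, and the schedule $\lambda_t = c\sqrt{\log d / t}$ are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


class StreamingLasso:
    """Hypothetical online L1-regularized least squares (illustration only)."""

    def __init__(self, dim, c=1.0, inner_steps=5):
        self.dim = dim
        self.c = c                        # assumed scaling of the penalty
        self.inner_steps = inner_steps    # ISTA steps per new sample
        self.t = 0
        self.gram = np.zeros((dim, dim))  # running sum of x x^T
        self.xy = np.zeros(dim)           # running sum of x * y
        self.beta = np.zeros(dim)         # current estimate (warm start)

    def update(self, x, y):
        # Fold the new sample into the sufficient statistics, then discard it.
        self.t += 1
        self.gram += np.outer(x, x)
        self.xy += x * y
        S = self.gram / self.t
        b = self.xy / self.t
        # Assumed penalty schedule, shrinking as more data arrives.
        lam = self.c * np.sqrt(np.log(max(self.dim, 2)) / self.t)
        # Step size 1/L, where L is the largest eigenvalue of S.
        step = 1.0 / max(np.linalg.norm(S, 2), 1e-12)
        for _ in range(self.inner_steps):
            grad = S @ self.beta - b
            self.beta = soft_threshold(self.beta - step * grad, step * lam)
        return self.beta
```

A typical use is to call `update(x, y)` once per arriving pair and read `beta` at any time. Computing the spectral norm at every step is only reasonable for small `dim`; a cheaper step-size rule would be needed in high dimensions.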

Related papers

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [60.414548453838506]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function. GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\mathcal{O}(1)$ time and space complexities per round.
arXiv Detail & Related papers (2025-07-16T02:24:21Z)
Model-free Online Learning for the Kalman Filter: Forgetting Factor and Logarithmic Regret [2.313314525234138]
We consider the problem of online prediction for an unknown, non-explosive linear system. With a known system model, the optimal predictor is the celebrated Kalman filter. We tackle this problem by injecting an inductive bias into the regression model via exponential forgetting.
arXiv Detail & Related papers (2025-05-13T21:49:56Z)
LFFR: Logistic Function For (single-output) Regression [0.0]
We implement privacy-preserving regression training using data encrypted under a fully homomorphic encryption scheme. We develop a novel and efficient algorithm called LFFR for homomorphic regression using the logistic function.
arXiv Detail & Related papers (2024-07-13T17:33:49Z)
Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data [17.657917523817243]
We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional optimization problem. In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches. We derive rates of convergence in expectation that are of order $\mathcal{O}(\log T/T)$ and $\mathcal{O}(1/T^{1-\iota})$ for any $\iota>0$.
arXiv Detail & Related papers (2024-05-29T19:21:55Z)
Optimal Bias-Correction and Valid Inference in High-Dimensional Ridge Regression: A Closed-Form Solution [0.0]
We introduce an iterative strategy to correct bias effectively when the dimension $p$ is less than the sample size $n$. For $p>n$, our method optimally mitigates the bias such that any remaining bias in the proposed de-biased estimator is unattainable. Our method offers a transformative solution to the bias challenge in ridge regression inferences across various disciplines.
arXiv Detail & Related papers (2024-05-01T10:05:19Z)
Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of iid observations $(x_t \sim p, x'_t \sim q)$ are observed over time. We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
Retire: Robust Expectile Regression in High Dimensions [3.9391041278203978]
Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data. We propose and study (penalized) robust expectile regression (retire). We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
arXiv Detail & Related papers (2022-12-11T18:03:12Z)
Efficient Truncated Linear Regression with Unknown Noise Variance [26.870279729431328]
We provide the first computationally and statistically efficient estimators for truncated linear regression when the noise variance is unknown. Our estimator is based on an efficient implementation of Projected Gradient Descent on the negative-likelihood of the truncated sample.
arXiv Detail & Related papers (2022-08-25T12:17:37Z)
Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets. Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^{\beta}(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$. The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p=\infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z)
Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation. By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3+mk^2+mkd)$ to $O(\frac{1}{6}k^3+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z)
Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain. We establish sharp information theoretic minimax lower bounds for this problem in terms of $\tau_{\mathsf{mix}}$. We propose an algorithm based on experience replay, a popular reinforcement learning technique, that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.