The noise level in linear regression with dependent data
- URL: http://arxiv.org/abs/2305.11165v2
- Date: Fri, 27 Oct 2023 16:34:44 GMT
- Title: The noise level in linear regression with dependent data
- Authors: Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni
- Abstract summary: We derive upper bounds for random design linear regression with dependent ($\beta$-mixing) data.
Up to constant factors, our analysis correctly recovers the variance term predicted by the Central Limit Theorem.
- Score: 36.252641692809924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We derive upper bounds for random design linear regression with dependent
($\beta$-mixing) data absent any realizability assumptions. In contrast to the
strictly realizable martingale noise regime, no sharp instance-optimal
non-asymptotics are available in the literature. Up to constant factors, our
analysis correctly recovers the variance term predicted by the Central Limit
Theorem -- the noise level of the problem -- and thus exhibits graceful
degradation as we introduce misspecification. Past a burn-in, our result is
sharp in the moderate deviations regime, and in particular does not inflate the
leading order term by mixing time factors.
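As a rough illustration of the "noise level" benchmark, the following toy sketch (my own construction, not the paper's method) fits ordinary least squares on $\beta$-mixing covariates generated by an AR(1) process and compares the excess risk to the CLT-predicted level $\sigma^2 d / n$. All parameter choices (d, n, rho, sigma) are illustrative assumptions, and the noise here is well-specified rather than misspecified as in the paper.

```python
# Toy sketch, not the paper's method: OLS on beta-mixing (AR(1)) covariates.
import numpy as np

rng = np.random.default_rng(0)
d, n, rho, sigma = 5, 2000, 0.9, 0.5        # rho controls the mixing speed

w_star = rng.normal(size=d)

# The AR(1) chain x_t = rho x_{t-1} + sqrt(1 - rho^2) eps_t is geometrically
# beta-mixing with stationary distribution N(0, I).
X = np.zeros((n, d))
x = rng.normal(size=d)
for t in range(n):
    x = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=d)
    X[t] = x
y = X @ w_star + sigma * rng.normal(size=n)

w_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# With stationary covariance I, the excess risk is ||w_hat - w_star||^2; it
# should be of the same order as the CLT noise level sigma^2 d / n, not
# inflated by mixing-time factors.
print("excess risk    :", np.sum((w_hat - w_star) ** 2))
print("CLT noise level:", sigma**2 * d / n)
```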
Related papers
- Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results [60.92029979853314]
This paper investigates normalized SGD with clipping (NSGDC) and its variance-reduced variant (NSGDC-VR).
We present significant improvements in the theoretical results for both algorithms.
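Below is a minimal sketch, under my own assumptions, of the two gradient transformations the title refers to, normalization and clipping; it is illustrative only and not the paper's exact NSGDC update.

```python
# Minimal sketch (my assumptions, not the paper's exact algorithm).
import numpy as np

def normalized_step(w, grad, lr):
    """Normalized SGD: always move a distance lr along the gradient direction."""
    return w - lr * grad / (np.linalg.norm(grad) + 1e-12)

def clipped_step(w, grad, lr, c=1.0):
    """Clipped SGD: use the raw gradient unless its norm exceeds c."""
    return w - lr * grad * min(1.0, c / (np.linalg.norm(grad) + 1e-12))

# Toy quadratic with heavy-tailed (Cauchy) gradient noise: a single noise
# sample can be huge, but neither update ever takes an unbounded step.
rng = np.random.default_rng(0)
w = np.full(10, 5.0)
for _ in range(2000):
    grad = 2.0 * w + rng.standard_cauchy(10)
    w = normalized_step(w, grad, lr=0.05)
print(np.linalg.norm(w))   # settles near the optimum at 0
```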
arXiv Detail & Related papers (2024-10-21T22:40:42Z) - High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise [59.25598762373543]
We establish high-probability convergence guarantees for learning on streaming data in the presence of heavy-tailed noise.
We also demonstrate, analytically and empirically, how to determine the preferred choice of setting for a given problem.
arXiv Detail & Related papers (2023-10-28T18:53:41Z) - Asymptotic consistency of the WSINDy algorithm in the limit of continuum data [0.0]
We study the consistency of the weak-form sparse identification of nonlinear dynamics algorithm (WSINDy)
We provide a mathematically rigorous explanation for the observed robustness to noise of weak-form equation learning.
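The following toy sketch (my own simplified construction, not the authors' implementation) shows the weak-form idea: for an ODE $u' = f(u)$, integrating against a compactly supported test function $\varphi$ and integrating by parts gives $-\int \varphi' u\,dt = \int \varphi f(u)\,dt$, so the noisy signal is never differentiated.

```python
# Simplified weak-form regression sketch: recover u' = -2u from noisy samples
# without differentiating the data.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 600)
dt = t[1] - t[0]
u = np.exp(-2.0 * t) + 0.01 * rng.normal(size=t.size)  # noisy solution of u' = -2u

def bump(a, b):
    """Smooth test function supported on (a, b), plus its derivative."""
    s = np.clip((t - a) / (b - a), 1e-9, 1 - 1e-9)
    phi = np.where((t > a) & (t < b), np.exp(-1.0 / (s * (1.0 - s))), 0.0)
    return phi, np.gradient(phi, t)       # differentiate the known phi, not u

library = np.stack([u, u**2, u**3], axis=1)   # candidate right-hand-side terms
rows, targets = [], []
for a in np.arange(0.0, 2.5, 0.25):           # a family of sliding test functions
    phi, dphi = bump(a, a + 0.5)
    rows.append(dt * (phi[:, None] * library).sum(axis=0))   # int phi f_k(u) dt
    targets.append(-dt * np.dot(dphi, u))                    # -int phi' u dt
coef = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)[0]
print(coef)   # approximately [-2, 0, 0] despite the measurement noise
```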
arXiv Detail & Related papers (2022-11-29T07:49:34Z) - Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise [11.768495184175052]
We introduce a general framework for nonlinear gradient descent scenarios when gradient noise exhibits heavy tails.
We show that, for a nonlinearity with bounded outputs and gradient noise that may not have finite moments of order greater than one, the nonlinear SGD converges to zero at rate $O(1/t^\zeta)$, $\zeta \in (0,1)$.
Experiments show that, while our framework is more general than existing studies of SGD under heavy-tailed noise, several easy-to-implement nonlinearities from our framework are competitive with state-of-the-art alternatives on real data sets.
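A minimal sketch of two bounded-output nonlinearities in this spirit (my own toy instantiation; the step sizes and the Cauchy noise model are illustrative choices):

```python
# Toy instantiation of bounded-output nonlinear gradient mappings.
import numpy as np

def sign_nl(g):
    return np.sign(g)          # outputs in {-1, 0, +1}

def clip_nl(g, c=1.0):
    return np.clip(g, -c, c)   # component-wise clipping, outputs in [-c, c]

rng = np.random.default_rng(1)
w = np.full(20, 3.0)
for t in range(1, 5001):
    grad = 2.0 * w + rng.standard_cauchy(20)   # Cauchy: no finite moments >= 1
    w = w - (0.5 / t**0.75) * clip_nl(grad)    # polynomially decaying steps
print(np.linalg.norm(w) ** 2)   # the squared error has shrunk toward zero
```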
arXiv Detail & Related papers (2022-04-06T06:05:52Z) - Improving Generalization via Uncertainty Driven Perturbations [107.45752065285821]
We consider uncertainty-driven perturbations of the training data points.
Unlike loss-driven perturbations, uncertainty-guided perturbations do not cross the decision boundary.
We show that UDP is guaranteed to achieve the maximum-margin decision boundary on linear models.
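A hedged toy sketch of the contrast (my own construction, not the authors' code): for a linear classifier, a loss-driven step pushes a point across the boundary, while an uncertainty-driven step moves it toward the boundary, where uncertainty peaks, and stops there.

```python
# Toy comparison for a linear classifier f(x) = w.x + b.
import numpy as np

w, b = np.array([1.0, -2.0]), 0.5
x, y = np.array([2.0, 1.0]), +1              # correctly classified point
margin = y * (w @ x + b)                     # signed score; boundary at 0
unit = w / np.linalg.norm(w)

eps = 2.0                                    # perturbation budget (assumed)
x_loss = x - eps * y * unit                  # loss-driven: may cross the boundary
step = min(eps, margin / np.linalg.norm(w))  # uncertainty-driven: stop at boundary
x_unc = x - step * y * unit

print(np.sign(w @ x_loss + b))   # -1.0: the label flipped
print(np.sign(w @ x_unc + b))    #  0.0: exactly on the boundary, not across
```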
arXiv Detail & Related papers (2022-02-11T16:22:08Z) - Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation [50.85788484752612]
Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models.
It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance.
In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used.
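For orientation, here is a minimal sketch of standard NCE itself (illustrative parameters, not the paper's analysis): an unnormalized one-dimensional Gaussian is fit by logistic discrimination of data against samples from a known noise distribution.

```python
# Minimal standard-NCE sketch with illustrative parameters.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, 2000)     # true distribution N(2, 1)
noise = rng.normal(0.0, 3.0, 2000)    # noise distribution N(0, 9), density known

def log_noise(x):
    return -0.5 * (x / 3.0) ** 2 - np.log(3.0) - 0.5 * np.log(2.0 * np.pi)

def nce_loss(mu, log_c):
    # log_c is a *learned* log-normalizer: the model stays unnormalized.
    log_model = lambda x: -0.5 * (x - mu) ** 2 + log_c
    s_data = log_model(data) - log_noise(data)
    s_noise = log_model(noise) - log_noise(noise)
    # Logistic loss pushes sigma(s) toward 1 on data and toward 0 on noise.
    return np.mean(np.logaddexp(0.0, -s_data)) + np.mean(np.logaddexp(0.0, s_noise))

grid = [(m, c) for m in np.linspace(0, 4, 41) for c in np.linspace(-3, 0, 31)]
mu_hat, c_hat = min(grid, key=lambda p: nce_loss(*p))
print(mu_hat, c_hat)   # near mu = 2 and log_c = -0.5 log(2 pi) ~ -0.92
```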
arXiv Detail & Related papers (2021-10-21T16:57:45Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
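A toy sketch of the phenomenon (my own setup and parameters): test error of minimum-norm random-features regression as the feature count p crosses the interpolation threshold p = n.

```python
# Toy double-descent sketch: min-norm random-features least squares.
import numpy as np

rng = np.random.default_rng(0)
d, n, n_test = 10, 100, 2000
w = rng.normal(size=d) / np.sqrt(d)
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ w + 0.1 * rng.normal(size=n)      # noisy training labels
yt = Xt @ w                               # clean test targets

for p in [20, 50, 90, 100, 110, 200, 800]:
    W = rng.normal(size=(d, p)) / np.sqrt(d)                  # frozen random layer
    F, Ft = np.maximum(X @ W, 0.0), np.maximum(Xt @ W, 0.0)   # ReLU features
    a = np.linalg.lstsq(F, y, rcond=None)[0]                  # min-norm solution
    print(f"p={p:4d}  test MSE={np.mean((Ft @ a - yt) ** 2):.4f}")
# The test error typically spikes near p = n and falls again for large p.
```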
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - On the Role of Entropy-based Loss for Learning Causal Structures with Continuous Optimization [27.613220411996025]
A method with a non-combinatorial directed acyclic constraint, called NOTEARS, formulates the causal structure learning problem as a continuous optimization problem using a least-squares loss.
We show that the violation of the Gaussian noise assumption will hinder the causal direction identification.
We propose a more general entropy-based loss that is theoretically consistent with the likelihood score under any noise distribution.
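For reference, a sketch of the NOTEARS-style continuous objective (standard formulation, not the authors' code; the paper's entropy-based loss would replace the least-squares term below):

```python
# NOTEARS-style objective: data fit + sparsity + smooth acyclicity penalty.
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    """h(W) = tr(exp(W * W)) - d, which equals 0 iff W encodes a DAG."""
    return np.trace(expm(W * W)) - W.shape[0]

def notears_objective(W, X, lam=0.1, rho=10.0):
    n = X.shape[0]
    least_squares = 0.5 / n * np.sum((X - X @ W) ** 2)   # the Gaussian-noise loss
    return least_squares + lam * np.abs(W).sum() + rho * acyclicity(W) ** 2

# A single edge is acyclic; a 2-cycle is penalized.
W_dag = np.array([[0.0, 0.8], [0.0, 0.0]])
W_cyc = np.array([[0.0, 0.8], [0.8, 0.0]])
print(acyclicity(W_dag), acyclicity(W_cyc))   # 0.0 vs. > 0
```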
arXiv Detail & Related papers (2021-06-05T08:29:51Z) - Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime [29.731516232010343]
We consider Kernel Ridge Regression (KRR) under the Gaussian design.
We show the existence of a transition in the noisy setting from the noiseless exponents to their noisy values as the sample complexity is increased.
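A toy KRR sketch of the crossover (my own illustrative setup): with noiseless labels the test error keeps decaying quickly in n; with label noise the decay flattens once the noise term dominates.

```python
# Toy crossover sketch: KRR test error with and without label noise.
import numpy as np

rng = np.random.default_rng(0)

def krr_test_mse(n, noise_std, n_test=500, reg=1e-3):
    X, Xt = rng.normal(size=(n, 5)), rng.normal(size=(n_test, 5))
    f = lambda Z: np.sin(Z).sum(axis=1)                      # smooth target
    y = f(X) + noise_std * rng.normal(size=n)
    sq = lambda A, B: ((A[:, None] - B[None]) ** 2).sum(-1)  # pairwise distances
    alpha = np.linalg.solve(np.exp(-0.5 * sq(X, X)) + reg * np.eye(n), y)
    return np.mean((np.exp(-0.5 * sq(Xt, X)) @ alpha - f(Xt)) ** 2)

for n in [50, 100, 200, 400, 800]:
    print(n, f"noiseless: {krr_test_mse(n, 0.0):.4f}",
          f"noisy: {krr_test_mse(n, 0.5):.4f}")
# The noiseless error keeps falling fast; the noisy error decays more slowly.
```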
arXiv Detail & Related papers (2021-05-31T14:39:08Z) - Distribution-Free Robust Linear Regression [5.532477732693]
We study random design linear regression with no assumptions on the distribution of the covariates.
We construct a non-linear estimator achieving excess risk of order $d/n$ with the optimal sub-exponential tail.
We prove an optimal version of the classical bound for the truncated least squares estimator due to Györfi, Kohler, Krzyżak, and Walk.
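A hedged sketch of the classical truncation device referred to above (my own toy form; the truncation level B and its slow growth with n are assumed choices):

```python
# Toy truncated least squares: fit OLS, then clamp predictions at +/- B.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 5
B = 2.0 * np.log(n)                            # truncation level, illustrative
X = rng.normal(size=(n, d))
w = rng.normal(size=d)
y = X @ w + rng.standard_t(df=2.1, size=n)     # heavy-tailed responses

w_hat = np.linalg.lstsq(X, y, rcond=None)[0]

def predict_truncated(X_new):
    # Truncation rarely binds on typical points but caps extreme predictions.
    return np.clip(X_new @ w_hat, -B, B)

print(predict_truncated(X[:3]))
```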
arXiv Detail & Related papers (2021-02-25T15:10:41Z)