WeSpeR: Population spectrum retrieval and spectral density estimation of weighted sample covariance
- URL: http://arxiv.org/abs/2410.14413v1
- Date: Fri, 18 Oct 2024 12:26:51 GMT
- Title: WeSpeR: Population spectrum retrieval and spectral density estimation of weighted sample covariance
- Authors: Benoit Oriol,
- Abstract summary: We prove that the spectral distribution $F$ of the weighted sample covariance has a continuous density on $mathbbR*$.
We propose a procedure to compute it, to determine the support of $F$ and define an efficient grid on it.
We use this procedure to design the $textitWeSpeR$ algorithm, which estimates the spectral density and retrieves the true spectral covariance spectrum.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The spectrum of the weighted sample covariance shows a asymptotic non random behavior when the dimension grows with the number of samples. In this setting, we prove that the asymptotic spectral distribution $F$ of the weighted sample covariance has a continuous density on $\mathbb{R}^*$. We address then the practical problem of numerically finding this density. We propose a procedure to compute it, to determine the support of $F$ and define an efficient grid on it. We use this procedure to design the $\textit{WeSpeR}$ algorithm, which estimates the spectral density and retrieves the true spectral covariance spectrum. Empirical tests confirm the good properties of the $\textit{WeSpeR}$ algorithm.
Related papers
- Fast and Robust: Computationally Efficient Covariance Estimation for Sub-Weibull Vectors [0.0]
In this work, we target the specific regime of textbfSub-Weibull (characterized by stretched exponential tails $exp(-t)$)<n>Unlike element-wise truncation, our approach preserves the spectral geometry while requiring $O(Nd2)$ operations.<n>We derive non-asymptotic error bounds showing that our estimator recovers the optimal sub-Gaussian rate $tildeO(sqrtr()/N)$ with high probability.
arXiv Detail & Related papers (2025-12-19T14:34:30Z) - Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions [3.7577421880330535]
We consider simulated tempering combined with an arbitrary local chain Monte Carlo sampler.<n>We present a new decomposition theorem that provides a lower bound on the restricted spectral gap of the algorithm for sampling from mixture distributions.
arXiv Detail & Related papers (2025-05-21T03:28:55Z) - The Performance Of The Unadjusted Langevin Algorithm Without Smoothness Assumptions [0.0]
We propose a Langevin-based algorithm that does not rely on popular but computationally challenging techniques.<n>We show that the performance of samplers like ULA does not necessarily degenerate arbitrarily with low regularity.
arXiv Detail & Related papers (2025-02-05T18:55:54Z) - Asymptotic non-linear shrinkage and eigenvector overlap for weighted sample covariance [0.0]
We compute non-linear shrinkage formulas in the spirit of Ledoit and P'ech'e.<n>We show the performance of the non-linear shrinkage estimators and propose an algorithm to numerically compute those formulas.
arXiv Detail & Related papers (2024-10-18T12:33:10Z) - A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$)
We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z) - Variance Reduction for the Independent Metropolis Sampler [11.074080383657453]
We prove that if $pi$ is close enough under KL divergence to another density $q$, an independent sampler that obtains samples from $pi$ achieves smaller variance than i.i.d. sampling from $pi$.
We propose an adaptive independent Metropolis algorithm that adapts the proposal density such that its KL divergence with the target is being reduced.
arXiv Detail & Related papers (2024-06-25T16:38:53Z) - In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies [7.70133333709347]
We present a new random walk for uniformly sampling high-dimensional convex bodies.
It achieves state-of-the-art runtime complexity with stronger guarantees on the output.
arXiv Detail & Related papers (2024-05-02T16:15:46Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - On the Convergence Rate of Gaussianization with Random Rotations [0.8081564951955755]
We show analytically that the number of required layers scales linearly with the dimension for Gaussian input.
We find the same linear increase in cost for arbitrary input $p(x)$, but observe favorable scaling for some distributions.
arXiv Detail & Related papers (2023-06-23T14:35:39Z) - Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z) - Mean-Square Analysis of Discretized It\^o Diffusions for Heavy-tailed
Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Ito diffusions associated with weighted Poincar'e inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $epsilon$ close to the target distribution in the Wasserstein-2 metric.
arXiv Detail & Related papers (2023-03-01T15:16:03Z) - Projected Gradient Descent Algorithms for Solving Nonlinear Inverse
Problems with Generative Priors [17.426500577203505]
We assume that the unknown $p$-dimensional signal lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs.
We propose a nonlinear least-squares estimator that is guaranteed to enjoy an optimal statistical rate.
arXiv Detail & Related papers (2022-09-21T04:05:12Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $delta$-FWER under standard assumptions for $delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z) - Stochastic Approximation for Online Tensorial Independent Component
Analysis [98.34292831923335]
Independent component analysis (ICA) has been a popular dimension reduction tool in statistical machine learning and signal processing.
In this paper, we present a by-product online tensorial algorithm that estimates for each independent component.
arXiv Detail & Related papers (2020-12-28T18:52:37Z) - Spectral density estimation with the Gaussian Integral Transform [91.3755431537592]
spectral density operator $hatrho(omega)=delta(omega-hatH)$ plays a central role in linear response theory.
We describe a near optimal quantum algorithm providing an approximation to the spectral density.
arXiv Detail & Related papers (2020-04-10T03:14:38Z) - On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and
Non-Asymptotic Concentration [115.1954841020189]
We study the inequality and non-asymptotic properties of approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z) - Oracle Lower Bounds for Stochastic Gradient Sampling Algorithms [39.746670539407084]
We consider the problem of sampling from a strongly log-concave density in $bbRd$.
We prove an information theoretic lower bound on the number of gradient queries of the log density needed.
arXiv Detail & Related papers (2020-02-01T23:46:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.