Asymptotic spectrum of weighted sample covariance: another proof of spectrum convergence
- URL: http://arxiv.org/abs/2410.14408v2
- Date: Thu, 13 Mar 2025 14:03:29 GMT
- Title: Asymptotic spectrum of weighted sample covariance: another proof of spectrum convergence
- Authors: Benoit Oriol
- Abstract summary: We show how the spectrum behaves in finite samples with heavy tails. The general purpose is to provide a detailed introduction to the high-dimensional spectrum of the weighted sample covariance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose another proof of the high-dimensional spectrum convergence of the weighted sample covariance, one that is more concise and self-contained but relies on stronger, yet reasonable, assumptions. We explain and illustrate this theorem for different weight distributions and show how the spectrum behaves in finite samples with heavy tails. The general purpose is to provide a detailed introduction to the high-dimensional spectrum of the weighted sample covariance.
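To make the central object concrete, here is a minimal, illustrative Python sketch (not code from the paper): it forms the weighted sample covariance $S = \frac{1}{n}\sum_i w_i x_i x_i^T$ and inspects its empirical spectrum in the proportional regime $p/n \to c$. The Gaussian data, the exponential weight distribution, and the aspect ratio are all assumptions chosen only for illustration.

```python
import numpy as np

# Minimal illustrative sketch (not from the paper): empirical spectrum of a
# weighted sample covariance S = (1/n) * sum_i w_i * x_i x_i^T in the
# high-dimensional regime p/n -> c, with assumed exponential weights.
rng = np.random.default_rng(0)
n, p = 2000, 500                         # aspect ratio c = p/n = 0.25
X = rng.standard_normal((n, p))          # rows x_i ~ N(0, I_p)
w = rng.exponential(scale=1.0, size=n)   # illustrative weight distribution

S = (X.T * w) @ X / n                    # p x p weighted sample covariance
eigs = np.linalg.eigvalsh(S)             # empirical spectrum of S

# With w_i = 1 this spectrum follows the Marchenko-Pastur law; heavier-tailed
# weights spread the support, which is the regime the paper studies.
print(f"spectral support approx [{eigs.min():.3f}, {eigs.max():.3f}]")
```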
Related papers
- Asymptotic non-linear shrinkage formulas for weighted sample covariance [0.0]
We compute non-linear shrinkage formulas in the spirit of Ledoit and Péché.
We show experimentally the performance of the non-linear shrinkage formulas.
We also test the robustness of the theory to a heavy-tailed distribution.
arXiv Detail & Related papers (2024-10-18T12:33:10Z)
- WeSpeR: Population spectrum retrieval and spectral density estimation of weighted sample covariance [0.0]
We prove that the spectral distribution $F$ of the weighted sample covariance has a continuous density on $\mathbb{R}^*$.
We propose a procedure to compute it, to determine the support of $F$ and define an efficient grid on it.
We use this procedure to design the $\textit{WeSpeR}$ algorithm, which estimates the spectral density and retrieves the true covariance spectrum.
arXiv Detail & Related papers (2024-10-18T12:26:51Z)
- Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers [49.1574468325115]
We introduce a unified convergence analysis framework for deterministic samplers.
Our framework achieves iteration complexity of $\tilde{O}(d^2/\epsilon)$.
We also provide a detailed analysis of Denoising Diffusion Implicit Models (DDIM)-type samplers.
arXiv Detail & Related papers (2024-10-18T07:37:36Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization [83.06112052443233]
This paper studies kernel ridge regression in high dimensions under covariate shifts.
By a bias-variance decomposition, we theoretically demonstrate that the re-weighting strategy allows for decreasing the variance.
For bias, we analyze regularization at an arbitrary or well-chosen scale, showing that the bias can behave very differently under different regularization scales.
arXiv Detail & Related papers (2024-06-05T12:03:27Z) - Flow matching achieves almost minimax optimal convergence [50.38891696297888]
Flow matching (FM) has gained significant attention as a simulation-free generative model.
This paper discusses the convergence properties of FM for large sample size under the $p$-Wasserstein distance.
We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models.
arXiv Detail & Related papers (2024-05-31T14:54:51Z) - Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z)
- Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
arXiv Detail & Related papers (2023-03-01T15:16:03Z)
- Spectral Feature Augmentation for Graph Contrastive Learning and Beyond [64.78221638149276]
We present a novel spectral feature augmentation for contrastive learning on graphs (and images).
For each data view, we estimate a low-rank approximation per feature map and subtract that approximation from the map to obtain its complement.
This is achieved by the incomplete power iteration proposed herein, a non-standard power-iteration regime that enjoys two valuable byproducts (with merely one or two iterations).
Experiments on graph/image datasets show that our spectral feature augmentation outperforms baselines.
arXiv Detail & Related papers (2022-12-02T08:48:11Z)
- On the Semi-supervised Expectation Maximization [5.481082183778667]
We focus on a semi-supervised case to learn the model from labeled and unlabeled samples.
The analysis clearly demonstrates how the labeled samples improve the convergence rate for the exponential family mixture model.
arXiv Detail & Related papers (2022-11-01T15:42:57Z)
- Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics [18.93569692490218]
The tuning of stochastic gradient algorithms is often based on trial-and-error rather than generalizable theory.
We show that averaging with a large fixed step size is robust to the choice of tuning parameters.
We lay the foundation for a systematic analysis of other gradient Monte Carlo algorithms.
arXiv Detail & Related papers (2022-07-25T17:58:09Z)
- Recover the spectrum of covariance matrix: a non-asymptotic iterative method [0.0]
It is well known that the sample covariance has a consistent bias in its spectrum; for example, the spectrum of a Wishart matrix follows the Marchenko-Pastur law.
In this work, we introduce an iterative algorithm, 'Concent', that actively eliminates this bias and recovers the true spectrum for small and moderate dimensions.
arXiv Detail & Related papers (2022-01-01T18:44:31Z)
- Spectral learning of multivariate extremes [0.0]
We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes.
Our work studies the theoretical performance of spectral clustering based on a random $k$-nearest neighbor graph constructed from an extremal sample.
We propose a simple consistent estimation strategy for learning the angular measure.
arXiv Detail & Related papers (2021-11-15T14:33:06Z)
- Deterministic Gibbs Sampling via Ordinary Differential Equations [77.42706423573573]
This paper presents a general construction of deterministic measure-preserving dynamics using autonomous ODEs and tools from differential geometry.
We show how Hybrid Monte Carlo and other deterministic samplers follow as special cases of our theory.
arXiv Detail & Related papers (2021-06-18T15:36:09Z)
- Beyond Random Matrix Theory for Deep Networks [0.7614628596146599]
We investigate whether Wigner semi-circle and Marchenko-Pastur distributions, often used in theoretical analyses of deep neural networks, match empirically observed spectral densities.
We find that even allowing for outliers, the observed spectral shapes strongly deviate from such theoretical predictions.
We consider two new classes of matrix ensembles; random Wigner/Wishart ensemble products and percolated Wigner/Wishart ensembles, both of which better match observed spectra.
arXiv Detail & Related papers (2020-06-13T21:00:30Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions [63.60499266361255]
We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression.
Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetric properties compared with the best estimator over any label-invariant distribution collection; c) serves as the limit of profile compression.
arXiv Detail & Related papers (2020-02-26T17:49:04Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on k-nearest-neighbor distances between these samples; a sketch of this estimator follows below.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
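To make the last entry concrete, here is a hedged sketch of the classical k-nearest-neighbor KL divergence estimator (in the spirit of Wang, Kulkarni, and Verdú); the function name and defaults are illustrative assumptions, not the cited paper's exact implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(x, y, k=1):
    """Sketch of the classical k-NN estimate of D(P||Q) from x ~ P, y ~ Q.
    Illustrative only; not the cited paper's exact estimator."""
    n, d = x.shape
    m = y.shape[0]
    # rho_i: distance from x_i to its k-th neighbor within x (self excluded)
    rho = cKDTree(x).query(x, k=k + 1)[0][:, -1]
    # nu_i: distance from x_i to its k-th neighbor in y
    nu = cKDTree(y).query(x, k=k)[0]
    if k > 1:
        nu = nu[:, -1]
    # d * mean log(nu/rho) + log(m/(n-1)) is the standard bias-corrected form
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

# Sanity check on two Gaussian samples with shifted means:
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))
y = rng.standard_normal((5000, 2)) + 0.5
print(knn_kl_divergence(x, y))  # close to the true KL, here 0.25
```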