Spectral learning of multivariate extremes
- URL: http://arxiv.org/abs/2111.07799v3
- Date: Tue, 1 Aug 2023 16:05:00 GMT
- Title: Spectral learning of multivariate extremes
- Authors: Marco Avella Medina, Richard A. Davis and Gennady Samorodnitsky
- Abstract summary: We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes.
Our work studies the theoretical performance of spectral clustering based on a random $k-nearest neighbor graph constructed from an extremal sample.
We propose a simple consistent estimation strategy for learning the angular measure.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a spectral clustering algorithm for analyzing the dependence
structure of multivariate extremes. More specifically, we focus on the
asymptotic dependence of multivariate extremes characterized by the angular or
spectral measure in extreme value theory. Our work studies the theoretical
performance of spectral clustering based on a random $k$-nearest neighbor graph
constructed from an extremal sample, i.e., the angular part of random vectors
for which the radius exceeds a large threshold. In particular, we derive the
asymptotic distribution of extremes arising from a linear factor model and
prove that, under certain conditions, spectral clustering can consistently
identify the clusters of extremes arising in this model. Leveraging this result
we propose a simple consistent estimation strategy for learning the angular
measure. Our theoretical findings are complemented with numerical experiments
illustrating the finite sample performance of our methods.
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Spectral Phase Transition and Optimal PCA in Block-Structured Spiked
models [20.742571160909456]
We discuss the inhomogeneous spiked Wigner model, a theoretical framework recently introduced to study structured noise in various learning scenarios.
Our primary objective is to find an optimal spectral method and to extend the celebrated citeBBP (BBP) phase transition criterion to our inhomogeneous, block-structured, Wigner model.
arXiv Detail & Related papers (2024-03-06T13:23:55Z) - Kernel PCA for multivariate extremes [0.0]
We show that kernel PCA can be a powerful tool for clustering and dimension reduction.
We give theoretical guarantees on the performance of kernel PCA based on an extremal sample.
Our findings are complemented by numerical experiments illustrating the finite performance of our methods.
arXiv Detail & Related papers (2022-11-23T17:50:06Z) - Precise Asymptotics for Spectral Methods in Mixed Generalized Linear Models [31.58736590532443]
We consider the problem of estimating two statistically independent signals in a mixed generalized linear model.
Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms.
arXiv Detail & Related papers (2022-11-21T11:35:25Z) - Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Revisiting the Sample Complexity of Sparse Spectrum Approximation of
Gaussian Processes [60.479499225746295]
We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space.
Our approximation is obtained from an improved sample complexity analysis for sparse spectrum Gaussian processes (SSGPs)
arXiv Detail & Related papers (2020-11-17T05:41:50Z) - Beyond Random Matrix Theory for Deep Networks [0.7614628596146599]
We investigate whether Wigner semi-circle and Marcenko-Pastur distributions, often used for deep neural network theoretical analysis, match empirically observed spectral densities.
We find that even allowing for outliers, the observed spectral shapes strongly deviate from such theoretical predictions.
We consider two new classes of matrix ensembles; random Wigner/Wishart ensemble products and percolated Wigner/Wishart ensembles, both of which better match observed spectra.
arXiv Detail & Related papers (2020-06-13T21:00:30Z) - Profile Entropy: A Fundamental Measure for the Learnability and
Compressibility of Discrete Distributions [63.60499266361255]
We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression.
Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetric properties compared with the best estimator over any label-invariant distribution collection; c) serves as the limit of profile compression.
arXiv Detail & Related papers (2020-02-26T17:49:04Z) - Randomized Spectral Clustering in Large-Scale Stochastic Block Models [13.366036495177923]
We study the spectral clustering using randomized sketching algorithms from a statistical perspective.
Under mild conditions, the randomized spectral clustering algorithms lead to the same theoretical bounds as those of the original spectral clustering algorithm.
A new R package called Rclust is developed and made available to the public.
arXiv Detail & Related papers (2020-01-20T04:15:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.