Affine-Invariant Integrated Rank-Weighted Depth: Definition, Properties
and Finite Sample Analysis
- URL: http://arxiv.org/abs/2106.11068v1
- Date: Mon, 21 Jun 2021 12:53:37 GMT
- Title: Affine-Invariant Integrated Rank-Weighted Depth: Definition, Properties
and Finite Sample Analysis
- Authors: Guillaume Staerman, Pavlo Mozharovskyi, Stéphan Clémençon
- Abstract summary: We propose an extension of the \textit{integrated rank-weighted} statistical depth (IRW depth in abbreviated form) originally introduced in \cite{IRW}.
The variant we propose, referred to as the Affine-Invariant IRW depth (AI-IRW in short), involves the covariance/precision matrices of the (assumed square-integrable) $d$-dimensional random vector $X$ under study.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Because it determines a center-outward ordering of observations in
$\mathbb{R}^d$ with $d\geq 2$, the concept of statistical depth makes it
possible to define quantiles and ranks for multivariate data and to use them
for various statistical tasks (\textit{e.g.}, inference, hypothesis testing). Whereas many
depth functions have been proposed \textit{ad-hoc} in the literature since the
seminal contribution of \cite{Tukey75}, not all of them possess the properties
desirable to emulate the notion of quantile function for univariate probability
distributions. In this paper, we propose an extension of the \textit{integrated
rank-weighted} statistical depth (IRW depth in abbreviated form) originally
introduced in \cite{IRW}, modified so as to satisfy the property of
\textit{affine invariance}, thus fulfilling all four key axioms listed in
the nomenclature elaborated by \cite{ZuoS00a}. The variant we propose, referred
to as the Affine-Invariant IRW depth (AI-IRW in short), involves the
covariance/precision matrices of the (assumed square-integrable)
$d$-dimensional random vector $X$ under study, in order to take into account
the directions along which $X$ is most variable to assign a depth value to any
point $x\in \mathbb{R}^d$. The accuracy of the sampling version of the AI-IRW
depth is investigated from a nonasymptotic perspective. Namely, a concentration
result for the statistical counterpart of the AI-IRW depth is proved. Beyond
the theoretical analysis carried out, applications to anomaly detection are
considered and numerical results are displayed, providing strong empirical
evidence of the relevance of the depth function we propose here.
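Below is a minimal NumPy sketch of a Monte Carlo approximation to a sample-based AI-IRW depth. It assumes the construction in which directions drawn uniformly on the unit sphere are mapped through an estimate of $\Sigma^{-1/2}$ and renormalized before projecting; the paper's exact estimator may differ in its normalization and cdf handling, and all names below are illustrative.

```python
import numpy as np

def ai_irw_depth(X, points, n_dirs=1000, seed=0):
    """Monte Carlo sketch of a sample AI-IRW depth (illustrative,
    not the paper's exact estimator)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Sigma^{-1/2} via the eigendecomposition of the sample covariance.
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    sqrt_prec = V @ np.diag(1.0 / np.sqrt(np.clip(w, 1e-12, None))) @ V.T

    # Directions uniform on the unit sphere, whitened then renormalized:
    # this is the affine-invariance ingredient of AI-IRW.
    U = rng.standard_normal((n_dirs, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    U = U @ sqrt_prec
    U /= np.linalg.norm(U, axis=1, keepdims=True)

    # Empirical cdf of each projection <u, X>, evaluated at <u, x>.
    proj_X = X @ U.T                     # (n, n_dirs)
    proj_pts = points @ U.T              # (m, n_dirs)
    F = (proj_X[None, :, :] <= proj_pts[:, None, :]).mean(axis=1)

    # IRW integrand min(F, 1 - F), averaged over sampled directions.
    return np.minimum(F, 1.0 - F).mean(axis=1)
```

The whitening of the directions is what upgrades the orthogonal invariance of the plain IRW depth to full affine invariance, up to Monte Carlo and covariance-estimation error.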
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space in which to model functions learned by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, whose intrinsic dimension can be characterized by a new complexity notion -- the effective Minkowski dimension.
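For context, a standard formulation of the (upper) Minkowski, i.e. box-counting, dimension that this notion builds on is given below, where $N(\mathcal{S}, \varepsilon)$ is the minimal number of $\varepsilon$-balls needed to cover $\mathcal{S}$; the "effective" variant, as one reading of the abstract, requires the covering-number bound only at the range of scales relevant to the sample size rather than in the limit.

\[
  \dim_M(\mathcal{S}) \;=\; \limsup_{\varepsilon \to 0}
  \frac{\log N(\mathcal{S}, \varepsilon)}{\log(1/\varepsilon)},
  \qquad
  N(\mathcal{S}, \varepsilon) \;=\; \min\Big\{ k \,:\, \mathcal{S} \subseteq \bigcup_{i=1}^{k} B(x_i, \varepsilon) \Big\}.
\]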
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- Statistical Depth Functions for Ranking Distributions: Definitions, Statistical Learning and Applications [3.7564482287844205]
The concept of median/consensus has been widely investigated in order to provide a statistical summary of ranking data.
It is the purpose of this paper to define analogs of quantiles, ranks and statistical procedures based on such quantities.
arXiv Detail & Related papers (2022-01-20T10:30:56Z)
- Eikonal depth: an optimal control approach to statistical depths [0.7614628596146599]
We propose a new type of globally defined statistical depth, based upon control theory and eikonal equations.
This depth is easy to interpret and compute, expressively captures multi-modal behavior, and extends naturally to data that is non-Euclidean.
arXiv Detail & Related papers (2022-01-14T01:57:48Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to *under-cover*: its empirical coverage falls below the desired coverage level in reality.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
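A quick way to see the phenomenon empirically is to fit two conditional-quantile models and measure held-out coverage. The sketch below uses scikit-learn's quantile loss on synthetic heteroscedastic data; it is a toy illustration, not the setting analyzed in the paper, and all data-generating choices are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical heteroscedastic toy data: noise scale grows with |x|.
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = np.sin(X[:, 0]) + (0.5 + 0.5 * np.abs(X[:, 0])) * rng.standard_normal(2000)
X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], y[:1500], y[1500:]

# Fit the 5% and 95% conditional quantiles (a nominal 90% interval).
lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X_tr, y_tr)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X_tr, y_tr)

# Held-out coverage; values below 0.90 illustrate under-coverage.
covered = (lo.predict(X_te) <= y_te) & (y_te <= hi.predict(X_te))
print(f"nominal 90% interval, empirical coverage: {covered.mean():.3f}")
```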
arXiv Detail & Related papers (2021-06-10T06:11:55Z)
- Depth-based pseudo-metrics between probability distributions [1.1470070927586016]
We propose two new pseudo-metrics between continuous probability measures based on data depth and its associated central regions.
In contrast to the Wasserstein distance, the proposed pseudo-metrics do not suffer from the curse of dimensionality.
The regions-based pseudo-metric appears to be robust w.r.t. both outliers and heavy tails.
arXiv Detail & Related papers (2021-03-23T17:33:18Z)
- Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry [9.695960412426672]
We analytically characterize the Hessian at various families of spurious minima.
In particular, we prove that for $d \ge k$ standard Gaussian inputs: (a) of the $dk$ eigenvalues of the Hessian, $dk - O(d)$ concentrate near zero, and (b) $\Omega(d)$ of the eigenvalues grow linearly with $k$.
arXiv Detail & Related papers (2020-08-04T20:08:35Z)
- A Concentration of Measure and Random Matrix Approach to Large Dimensional Robust Statistics [45.24358490877106]
This article studies the *robust* covariance matrix estimation of a data collection $X = (x_1, \ldots, x_n)$ with $x_i = \sqrt{\tau_i}\, z_i + m$.
We exploit this semi-metric along with concentration of measure arguments to prove the existence and uniqueness of the robust estimator as well as evaluate its limiting spectral distribution.
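As a concrete baseline, the fixed-point iteration below implements Tyler's M-estimator of scatter, a classical robust covariance estimator in the same family as the estimators studied in such models; it is a sketch that assumes the location $m$ is known and already subtracted, and it is not the paper's exact estimator.

```python
import numpy as np

def tyler_scatter(X, n_iter=100, tol=1e-8):
    """Tyler's M-estimator of scatter via fixed-point iteration.
    Assumes centered data (the location m subtracted beforehand)."""
    n, d = X.shape
    C = np.eye(d)
    for _ in range(n_iter):
        inv_C = np.linalg.inv(C)
        # Per-sample Mahalanobis-type weights x_i^T C^{-1} x_i.
        q = np.einsum("ij,jk,ik->i", X, inv_C, X)
        # Reweighted second-moment matrix: (d/n) * sum_i x_i x_i^T / q_i.
        C_new = (d / n) * (X.T * (1.0 / q)) @ X
        C_new /= np.trace(C_new) / d     # fix the scale (trace = d)
        if np.linalg.norm(C_new - C, "fro") < tol:
            return C_new
        C = C_new
    return C
```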
arXiv Detail & Related papers (2020-06-17T09:02:26Z)
- A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent [85.77233010209368]
This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension $N$ of the feature space are all large and comparable.
This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$.
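As background, the sketch below builds the standard Rahimi-Recht random Fourier feature map for the Gaussian kernel and fits ridge regression on it, i.e., the model whose large-$n, p, N$ behavior the paper characterizes; the sizes, bandwidth, and regularization values are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (placeholders); in the paper n, p, N grow proportionally.
n, p, N, sigma, lam = 400, 20, 256, 2.0, 1e-2
X = rng.standard_normal((n, p))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# Rahimi-Recht features for k(x, y) = exp(-||x - y||^2 / (2 sigma^2)):
# z(x) = sqrt(2/N) cos(W^T x + b), W ~ N(0, sigma^{-2} I), b ~ U[0, 2pi].
W = rng.standard_normal((p, N)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, size=N)
Phi = np.sqrt(2.0 / N) * np.cos(X @ W + b)      # (n, N) feature matrix

# Ridge regression on the random features.
beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y)
print(f"training MSE: {np.mean((Phi @ beta - y) ** 2):.4f}")
```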
arXiv Detail & Related papers (2020-06-09T02:05:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.