Consistent Estimation of a Class of Distances Between Covariance Matrices
- URL: http://arxiv.org/abs/2409.11761v1
- Date: Wed, 18 Sep 2024 07:36:25 GMT
- Title: Consistent Estimation of a Class of Distances Between Covariance Matrices
- Authors: Roberto Pereira, Xavier Mestre, David Gregoratti
- Abstract summary: We are interested in the family of distances that can be expressed as sums of traces of functions that are separately applied to each covariance matrix.
A statistical analysis of the behavior of this class of distance estimators has also been conducted.
We present a central limit theorem that establishes the Gaussianity of these estimators and provides closed form expressions for the corresponding means and variances.
- Score: 7.291687946822539
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This work considers the problem of estimating the distance between two covariance matrices directly from the data. Particularly, we are interested in the family of distances that can be expressed as sums of traces of functions that are separately applied to each covariance matrix. This family of distances is particularly useful as it takes into consideration the fact that covariance matrices lie in the Riemannian manifold of positive definite matrices, thereby including a variety of commonly used metrics, such as the Euclidean distance, Jeffreys' divergence, and the log-Euclidean distance. Moreover, a statistical analysis of the asymptotic behavior of this class of distance estimators has also been conducted. Specifically, we present a central limit theorem that establishes the asymptotic Gaussianity of these estimators and provides closed form expressions for the corresponding means and variances. Empirical evaluations demonstrate the superiority of our proposed consistent estimator over conventional plug-in estimators in multivariate analytical contexts. Additionally, the central limit theorem derived in this study provides a robust statistical framework to assess the accuracy of these estimators.
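As a rough illustration of the distances named in the abstract, the following is a minimal NumPy sketch of the conventional plug-in approach, in which each distance is evaluated directly on sample covariance matrices. It is not the consistent estimator proposed in the paper. The function names are hypothetical, the data are assumed to be zero-mean, and the Jeffreys' divergence is assumed to use the normalization KL(p||q) + KL(q||p) for zero-mean Gaussians.

```python
import numpy as np


def sample_covariance(X):
    """Sample covariance of zero-mean data X of shape (M, N): M variables, N samples."""
    return X @ X.conj().T / X.shape[1]


def spd_function(R, f):
    """Apply a scalar function f to a positive definite matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(R)
    return (V * f(w)) @ V.conj().T


def euclidean_sq(R1, R2):
    """Squared Euclidean (Frobenius) distance: tr[(R1 - R2)^2]."""
    D = R1 - R2
    return np.trace(D @ D).real


def log_euclidean_sq(R1, R2):
    """Squared log-Euclidean distance: tr[(log R1 - log R2)^2]."""
    D = spd_function(R1, np.log) - spd_function(R2, np.log)
    return np.trace(D @ D).real


def jeffreys(R1, R2):
    """Jeffreys' divergence (assumed normalization): 0.5 * tr[R1^-1 R2 + R2^-1 R1] - M."""
    M = R1.shape[0]
    t = np.trace(np.linalg.solve(R1, R2)) + np.trace(np.linalg.solve(R2, R1))
    return 0.5 * t.real - M


# Plug-in estimates from synthetic data (illustrative sizes, chosen arbitrarily).
rng = np.random.default_rng(0)
M, N = 4, 2000
X1 = rng.standard_normal((M, N))                 # true covariance: identity
X2 = np.sqrt(2.0) * rng.standard_normal((M, N))  # true covariance: 2 * identity
R1_hat, R2_hat = sample_covariance(X1), sample_covariance(X2)
plug_in = {
    "euclidean_sq": euclidean_sq(R1_hat, R2_hat),
    "log_euclidean_sq": log_euclidean_sq(R1_hat, R2_hat),
    "jeffreys": jeffreys(R1_hat, R2_hat),
}
```

Each quantity expands into traces of products of functions of the two matrices, which matches the family described in the abstract. The plug-in values in `plug_in` depend on how the dimension M compares with the number of samples N, which is the setting the paper's consistent estimator addresses.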
Related papers
- Statistical Framework for Clustering MU-MIMO Wireless via Second Order Statistics [8.195126516665914]
We consider an estimator of the Log-Euclidean distance between multiple sample covariance matrices (SCMs) consistent when the number of samples and the observation size grow unbounded at the same rate.
We develop a statistical framework that allows accurate predictions of the clustering algorithm's performance under realistic conditions.
arXiv Detail & Related papers (2024-08-08T14:23:06Z)
- A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set [20.166217494056916]
We propose a principled approach to construct covariance estimators without imposing restrictive assumptions.
We show that our robust estimators are efficiently computable and consistent.
Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.
arXiv Detail & Related papers (2024-05-30T15:01:18Z)
- A Uniform Concentration Inequality for Kernel-Based Two-Sample Statistics [4.757470449749877]
We show that these metrics can be unified under a general framework of kernel-based two-sample statistics.
This paper establishes a novel uniform concentration inequality for the aforementioned kernel-based statistics.
As illustrative applications, we demonstrate how these bounds facilitate the derivation of error bounds for procedures such as distance covariance-based dimension reduction.
arXiv Detail & Related papers (2024-05-22T22:41:56Z)
- Intrinsic Bayesian Cramér-Rao Bound with an Application to Covariance Matrix Estimation [49.67011673289242]
This paper presents a new performance bound for estimation problems where the parameter to estimate lies in a smooth manifold.
It induces a geometry for the parameter manifold, as well as an intrinsic notion of the estimation error measure.
arXiv Detail & Related papers (2023-11-08T15:17:13Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for data-efficient representation learning.
We establish relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor distances between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- Finite sample properties of parametric MMD estimation: robustness to misspecification and dependence [7.011897575776511]
We show that the estimator is robust to both dependence and to the presence of outliers in the dataset.
We provide a theoretical study of the gradient descent algorithm used to compute the estimator.
arXiv Detail & Related papers (2019-12-12T02:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.