High-Dimensional Independence Testing via Maximum and Average Distance Correlations
- URL: http://arxiv.org/abs/2001.01095v2
- Date: Mon, 5 Feb 2024 20:24:50 GMT
- Title: High-Dimensional Independence Testing via Maximum and Average Distance Correlations
- Authors: Cencheng Shen, Yuexiao Dong
- Abstract summary: We characterize consistency properties in high-dimensional settings with respect to the number of marginally dependent dimensions.
We assess the advantages of each test statistic, examine their respective null distributions, and present a fast chi-square-based testing procedure.
- Score: 5.756296617325109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces and investigates the utilization of maximum and average
distance correlations for multivariate independence testing. We characterize
their consistency properties in high-dimensional settings with respect to the
number of marginally dependent dimensions, assess the advantages of each test
statistic, examine their respective null distributions, and present a fast
chi-square-based testing procedure. The resulting tests are non-parametric and
applicable to both Euclidean distance and the Gaussian kernel as the underlying
metric. To better understand the practical use cases of the proposed tests, we
evaluate the empirical performance of the maximum distance correlation, average
distance correlation, and the original distance correlation across various
multivariate dependence scenarios, as well as conduct a real data experiment to
test the presence of various cancer types and peptide levels in human plasma.
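
To make the statistics named in the abstract concrete, here is a minimal sketch rather than the authors' implementation. It assumes the maximum and average are taken over the marginal dimensions of X, each paired with the full Y, which the abstract does not spell out. The function names (`u_center`, `bias_corrected_dcor`, `marginal_dcors`, `chi2_pvalue`) are hypothetical. The per-dimension p-value uses the chi-square approximation for a single bias-corrected distance correlation with Euclidean distance; the null distributions the paper derives for the maximum and average statistics are not reproduced here.

```python
import math
import numpy as np


def pairwise_dist(z):
    """Euclidean distance matrix for samples stored in the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def u_center(d):
    """U-center a distance matrix (the bias-correcting centering)."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)
    col = d.sum(axis=0, keepdims=True) / (n - 2)
    total = d.sum() / ((n - 1) * (n - 2))
    c = d - row - col + total
    np.fill_diagonal(c, 0.0)
    return c


def bias_corrected_dcor(x, y):
    """Bias-corrected distance correlation; requires at least 4 samples."""
    a = u_center(pairwise_dist(x))
    b = u_center(pairwise_dist(y))
    n = a.shape[0]
    scale = n * (n - 3)
    dcov_xy = (a * b).sum() / scale
    dvar_x = (a * a).sum() / scale
    dvar_y = (b * b).sum() / scale
    denom = math.sqrt(dvar_x * dvar_y)
    return dcov_xy / denom if denom > 0 else 0.0


def marginal_dcors(x, y):
    """Distance correlation between each column of x and the full y."""
    x = np.asarray(x, dtype=float)
    return np.array([bias_corrected_dcor(x[:, s], y) for s in range(x.shape[1])])


def chi2_pvalue(stat, n):
    """Approximate p-value for ONE marginal statistic: n * stat + 1 is
    compared against a chi-square distribution with 1 degree of freedom."""
    t = n * stat + 1.0
    # Survival function of chi-square(1): P(Z^2 > t) = erfc(sqrt(t / 2)).
    return 1.0 if t <= 0 else math.erfc(math.sqrt(t / 2.0))


# Simulated example: only the first of 20 dimensions is related to y.
rng = np.random.default_rng(0)
n, p = 100, 20
x = rng.standard_normal((n, p))
y = x[:, 0] ** 2 + 0.1 * rng.standard_normal(n)

stats = marginal_dcors(x, y)
max_dcor = stats.max()    # driven by the strongest single dimension
avg_dcor = stats.mean()   # diluted when few dimensions are dependent
print(max_dcor, avg_dcor, chi2_pvalue(stats[0], n))
```

In this simulated setting the maximum is driven by the single dependent dimension, while the average is pulled toward zero by the 19 independent ones, which illustrates why the number of marginally dependent dimensions matters when choosing between the two statistics.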
Related papers
- Consistent Estimation of a Class of Distances Between Covariance Matrices [7.291687946822539]
We are interested in the family of distances that can be expressed as sums of traces of functions that are separately applied to each covariance matrix.
A statistical analysis of the behavior of this class of distance estimators has also been conducted.
We present a central limit theorem that establishes the Gaussianity of these estimators and provides closed form expressions for the corresponding means and variances.
arXiv Detail & Related papers (2024-09-18T07:36:25Z)
- An Upper Confidence Bound Approach to Estimating the Maximum Mean [0.0]
We study estimation of the maximum mean using an upper confidence bound (UCB) approach.
We establish statistical guarantees, including strong consistency, mean squared errors, and central limit theorems (CLTs) for both estimators.
arXiv Detail & Related papers (2024-08-08T02:53:09Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Communication-Efficient Distributed Estimation and Inference for Cox's Model [4.731404257629232]
We develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model.
To construct confidence intervals for linear combinations of high-dimensional hazard regression coefficients, we introduce a novel debiased method.
We provide valid and powerful distributed hypothesis tests for any coordinate element based on a decorrelated score test.
arXiv Detail & Related papers (2023-02-23T15:50:17Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
- Cross-validation Confidence Intervals for Test Error [83.67415139421448]
This work develops central limit theorems for cross-validation and consistent estimators of its variance under weak stability conditions on the learning algorithm.
Results are the first of their kind for the popular choice of leave-one-out cross-validation.
arXiv Detail & Related papers (2020-07-24T17:40:06Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor distances between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- The Chi-Square Test of Distance Correlation [7.748852202364896]
The chi-square test is non-parametric, extremely fast, and applicable to bias-corrected distance correlation using any strong negative type metric or characteristic kernel.
We show that the underlying chi-square distribution well approximates and dominates the limiting null distribution in the upper tail, and prove that the chi-square test can be valid and consistent for testing independence.
arXiv Detail & Related papers (2019-12-27T15:16:40Z)
- Independence Testing for Temporal Data [14.25244839642841]
A fundamental question is whether two time-series are related or not.
Existing approaches often have limitations, such as relying on parametric assumptions.
This paper introduces the temporal dependence statistic with block permutation to test independence between temporal data.
arXiv Detail & Related papers (2019-08-18T17:19:16Z)