Detection of Correlated Random Vectors
- URL: http://arxiv.org/abs/2401.13429v3
- Date: Thu, 25 Jul 2024 10:15:51 GMT
- Title: Detection of Correlated Random Vectors
- Authors: Dor Elimelech, Wasim Huleihel,
- Abstract summary: Two random vectors $mathsfXinmathbbRn$ and $mathsfYinmathbbRn$ are correlated or not.
We analyze the thresholds at which optimal testing is information-theoretically impossible and possible.
- Score: 7.320365821066746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the problem of deciding whether two standard normal random vectors $\mathsf{X}\in\mathbb{R}^{n}$ and $\mathsf{Y}\in\mathbb{R}^{n}$ are correlated or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these vectors are statistically independent, while under the alternative, $\mathsf{X}$ and a randomly and uniformly permuted version of $\mathsf{Y}$, are correlated with correlation $\rho$. We analyze the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of $n$ and $\rho$. To derive our information-theoretic lower bounds, we develop a novel technique for evaluating the second moment of the likelihood ratio using an orthogonal polynomials expansion, which among other things, reveals a surprising connection to integer partition functions. We also study a multi-dimensional generalization of the above setting, where rather than two vectors we observe two databases/matrices, and furthermore allow for partial correlations between these two.
Related papers
- A computational transition for detecting correlated stochastic block models by low-degree polynomials [13.396246336911842]
Detection of correlation in a pair of random graphs is a fundamental statistical and computational problem that has been extensively studied in recent years.
We consider a pair of correlated block models $mathcalS(n,tfraclambdan;k,epsilon;s)$ that are subsampled from a common parent block model $mathcal S(n,tfraclambdan;k,epsilon;s)$
We focus on tests based on emphlow-degrees of the entries of the adjacency
arXiv Detail & Related papers (2024-09-02T06:14:05Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n asymp T asymp C(q) cdot d
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - On Ranking-based Tests of Independence [0.0]
We develop a novel nonparametric framework to test the independence of two random variables $mathbfX$ and $mathbfY$.
We consider a wide class of rank statistics encompassing many ways of deviating from the diagonal in the ROC space to build tests of independence.
arXiv Detail & Related papers (2024-03-12T10:00:00Z) - Testing Dependency of Unlabeled Databases [5.384630221560811]
Two random databases $mathsfXinmathcalXntimes d$ and $mathsfYinmathcalYntimes d$ are statistically dependent or not.
We characterize the thresholds at which optimal testing is information-theoretically impossible and possible.
arXiv Detail & Related papers (2023-11-10T05:17:03Z) - Statistical Learning under Heterogeneous Distribution Shift [71.8393170225794]
Ground-truth predictor is additive $mathbbE[mathbfz mid mathbfx,mathbfy] = f_star(mathbfx) +g_star(mathbfy)$.
arXiv Detail & Related papers (2023-02-27T16:34:21Z) - Universality of empirical risk minimization [12.764655736673749]
Consider supervised learning from i.i.d. samples where $boldsymbol x_i inmathbbRp$ are feature vectors and $y in mathbbR$ are labels.
We study empirical risk universality over a class of functions that are parameterized by $mathsfk.
arXiv Detail & Related papers (2022-02-17T18:53:45Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$.
Here, we show that $|mathbb E[R(z)] - tilde R(z)|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Feature Cross Search via Submodular Optimization [58.15569071608769]
We study feature cross search as a fundamental primitive in feature engineering.
We show that there exists a simple greedy $(1-1/e)$-approximation algorithm for this problem.
arXiv Detail & Related papers (2021-07-05T16:58:31Z) - Optimal Spectral Recovery of a Planted Vector in a Subspace [80.02218763267992]
We study efficient estimation and detection of a planted vector $v$ whose $ell_4$ norm differs from that of a Gaussian vector with the same $ell$ norm.
We show that in the regime $n rho gg sqrtN$, any spectral method from a large class (and more generally, any low-degree of the input) fails to detect the planted vector.
arXiv Detail & Related papers (2021-05-31T16:10:49Z) - Universal Regular Conditional Distributions via Probability
Measure-Valued Deep Neural Models [3.8073142980733]
We find that any model built using the proposed framework is dense in the space $C(mathcalX,mathcalP_1(mathcalY))$.
The proposed models are also shown to be capable of generically expressing the aleatoric uncertainty present in most randomized machine learning models.
arXiv Detail & Related papers (2021-05-17T11:34:09Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $epsilon$-corrupted $k$-mixture within information theoretically near-optimal error proofs of $tildeO(epsilon)$.
Our main technical contribution is a new robust identifiability proof clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.