Hypothesis Testing for Equality of Latent Positions in Random Graphs
- URL: http://arxiv.org/abs/2105.10838v1
- Date: Sun, 23 May 2021 01:27:23 GMT
- Title: Hypothesis Testing for Equality of Latent Positions in Random Graphs
- Authors: Xinjie Du, Minh Tang
- Abstract summary: We consider the hypothesis testing problem that two vertices $i$ and $j$ have the same latent positions, possibly up to scaling.
We propose several test statistics based on the empirical Mahalanobis distances between the $i$th and $j$th rows of either the adjacency or the normalized Laplacian spectral embedding of the graph.
Using these test statistics, we address the model selection problem of choosing between the standard block model and its degree-corrected variant.
- Score: 0.2741266294612775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the hypothesis testing problem that two vertices $i$ and $j$ of a
generalized random dot product graph have the same latent positions, possibly
up to scaling. Special cases of this hypothesis test include testing whether
two vertices in a stochastic block model or degree-corrected stochastic block
model graph have the same block membership vectors. We propose several test
statistics based on the empirical Mahalanobis distances between the $i$th and
$j$th rows of either the adjacency or the normalized Laplacian spectral
embedding of the graph. We show that, under mild conditions, these test
statistics have limiting chi-square distributions under both the null and local
alternative hypotheses, and we derive explicit expressions for the
non-centrality parameters under the local alternative. Using these limit
results, we address the model selection problem of choosing between the
standard stochastic block model and its degree-corrected variant. The
effectiveness of our proposed tests is illustrated via both simulation studies
and real data applications.
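The Mahalanobis-type statistic described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it covers only the positive semidefinite (random dot product graph) case with the adjacency spectral embedding, and the plug-in covariance is an assumption borrowed from the standard central limit theorem for that embedding, so the paper's exact estimator may differ. The function names and the simulated two-block model (block probabilities, graph size, seed) are hypothetical choices made for the example.

```python
import numpy as np
from scipy.stats import chi2


def adjacency_spectral_embedding(A, d):
    """Return U |S|^{1/2} for the d eigenvalues of A largest in magnitude."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def plug_in_covariance(X, i):
    """Assumed plug-in estimate of the limiting covariance of row i.

    Uses Delta^{-1} E[p(1 - p) X X^T] Delta^{-1} with Delta = E[X X^T],
    where p is the estimated edge probability between vertex i and
    every other vertex.
    """
    n = X.shape[0]
    delta_inv = np.linalg.inv(X.T @ X / n)
    p = np.clip(X @ X[i], 0.0, 1.0)
    middle = (X * (p * (1.0 - p))[:, None]).T @ X / n
    return delta_inv @ middle @ delta_inv


def latent_position_test(A, i, j, d):
    """Return (statistic, p-value) for H0: vertices i and j share a latent position."""
    X = adjacency_spectral_embedding(A, d)
    n = X.shape[0]
    diff = X[i] - X[j]
    cov = plug_in_covariance(X, i) + plug_in_covariance(X, j)
    stat = n * diff @ np.linalg.solve(cov, diff)
    # Under H0 the statistic is compared with a chi-square on d degrees of freedom.
    return stat, chi2.sf(stat, df=d)


# Simulate a two-block stochastic block model (illustrative parameters).
rng = np.random.default_rng(0)
n, d = 200, 2
labels = np.repeat([0, 1], n // 2)
B = np.array([[0.5, 0.2], [0.2, 0.5]])
P = B[labels][:, labels]
upper = np.triu(rng.random((n, n)) < P, k=1)
A = (upper + upper.T).astype(float)

print("same block:     ", latent_position_test(A, 0, 1, d))
print("different block:", latent_position_test(A, 0, n - 1, d))
```

The statistic is invariant to the orthogonal rotation left undetermined by the eigendecomposition, which is why the two rows can be compared directly without aligning the embedding to the true latent positions.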
Related papers
- Testing Dependency of Weighted Random Graphs [4.0554893636822]
We study the task of detecting the edge dependency between two random graphs.
For general edge-weight distributions, we establish thresholds at which optimal testing becomes information-theoretically possible or impossible.
arXiv Detail & Related papers (2024-09-23T10:07:41Z)
- Doubly Robust Conditional Independence Testing with Generative Neural Networks [8.323172773256449]
This article addresses the problem of testing the conditional independence of two generic random vectors $X$ and $Y$ given a third random vector $Z$.
We propose a new non-parametric testing procedure that avoids explicitly estimating any conditional distributions.
arXiv Detail & Related papers (2024-07-25T01:28:59Z)
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z)
- Statistical Properties of the Entropy from Ordinal Patterns [55.551675080361335]
Knowing the joint distribution of the pair Entropy-Statistical Complexity for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon's Entropy for any model under which the true normalized Entropy is neither zero nor one.
We present a bilateral test that verifies if there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon's Entropy.
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Least Squares Estimation Using Sketched Data with Heteroskedastic Errors [0.0]
We show that estimates using data sketched by random projections will behave as if the errors were homoskedastic.
Inference, including first-stage F tests for instrument relevance, can be simpler than the full sample case if the sketching scheme is appropriately chosen.
arXiv Detail & Related papers (2020-07-15T15:58:27Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- A Causal Direction Test for Heterogeneous Populations [10.653162005300608]
Most causal models assume a single homogeneous population, an assumption that may fail to hold in many applications.
We show that when the homogeneity assumption is violated, causal models developed under that assumption can fail to identify the correct causal direction.
We propose an adjustment to a commonly used causal direction test statistic by using a $k$-means type clustering algorithm.
arXiv Detail & Related papers (2020-06-08T18:59:14Z)
- Selective Inference for Latent Block Models [50.83356836818667]
This study provides a selective inference method for latent block models.
We construct a statistical test on a set of row and column cluster memberships of a latent block model.
The proposed exact and approximated tests work effectively, compared to the naive test that did not take the selective bias into account.
arXiv Detail & Related papers (2020-05-27T10:44:19Z)
- Testing Goodness of Fit of Conditional Density Models with Kernels [16.003516725803774]
We propose two nonparametric statistical tests of goodness of fit for conditional distributions.
We show that our tests are consistent against any fixed alternative conditional model.
We demonstrate the interpretability of our test on a task of modeling the distribution of New York City's taxi drop-off location.
arXiv Detail & Related papers (2020-02-24T14:04:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.