A note on estimating the dimension from a random geometric graph
- URL: http://arxiv.org/abs/2311.13059v1
- Date: Tue, 21 Nov 2023 23:46:44 GMT
- Title: A note on estimating the dimension from a random geometric graph
- Authors: Caelan Atamanchuk and Luc Devroye and Gabor Lugosi
- Abstract summary: We study the problem of estimating the dimension $d$ of the underlying space when we have access to the adjacency matrix of the graph.
We also show that, without any condition on the density, a consistent estimator of $d$ exists when $n r_nd to infty$ and $r_n = o(1)$.
- Score: 2.3020018305241337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Let $G_n$ be a random geometric graph with vertex set $[n]$ based on $n$
i.i.d.\ random vectors $X_1,\ldots,X_n$ drawn from an unknown density $f$ on
$\R^d$. An edge $(i,j)$ is present when $\|X_i -X_j\| \le r_n$, for a given
threshold $r_n$ possibly depending upon $n$, where $\| \cdot \|$ denotes
Euclidean distance. We study the problem of estimating the dimension $d$ of the
underlying space when we have access to the adjacency matrix of the graph but
do not know $r_n$ or the vectors $X_i$. The main result of the paper is that
there exists an estimator of $d$ that converges to $d$ in probability as $n \to
\infty$ for all densities with $\int f^5 < \infty$ whenever $n^{3/2} r_n^d \to
\infty$ and $r_n = o(1)$. The conditions allow very sparse graphs since when
$n^{3/2} r_n^d \to 0$, the graph contains isolated edges only, with high
probability. We also show that, without any condition on the density, a
consistent estimator of $d$ exists when $n r_n^d \to \infty$ and $r_n = o(1)$.
Related papers
- Improved convergence rate of kNN graph Laplacians [11.93971616098517]
General class of $k$NN graph where the graph affinity is $W_ij = epsilon-d/2 .
We prove the point-wise convergence of the $k$NN graph Laplacian to the limiting manifold operator.
arXiv Detail & Related papers (2024-10-30T17:01:00Z) - Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $Omega(k2/pn1-2/p)$ lower bound for the sparse recovery problem, which is tight up to a $mathrmpoly(log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - A spectral least-squares-type method for heavy-tailed corrupted
regression with unknown covariance \& heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression assuming to have a corrupted $n$-sized label-feature sample of at most $epsilon n$ arbitrary outliers.
We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge on $(Sigma,Xi) nor the operator norm of $Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z) - Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $tildeO(k/sqrtepsilon)$ matrix-vector products.
arXiv Detail & Related papers (2022-02-10T16:10:41Z) - Exact Matching of Random Graphs with Constant Correlation [2.578242050187029]
This paper deals with the problem of graph matching or network alignment for ErdHos--R'enyi graphs.
It can be viewed as a noisy average-case version of the graph isomorphism problem.
arXiv Detail & Related papers (2021-10-11T05:07:50Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$.
Here, we show that $|mathbb E[R(z)] - tilde R(z)|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - The planted matching problem: Sharp threshold and infinite-order phase
transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M*$ hidden in a randomly weighted $ntimes n$ bipartite graph.
We show that if $sqrtd B(mathcalP,mathcalQ) ge 1+epsilon$ for an arbitrarily small constant $epsilon>0$, the reconstruction error for any estimator is shown to be bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(epsilon,delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator $(epsilon,delta)$-unbiased for $(Atop A)-1$ with a sketch of size $m=O(d+sqrt d/
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Optimal Coreset for Gaussian Kernel Density Estimation [0.8376091455761259]
Given a point set $Psubset mathbbRd$, the kernel density estimate of $P$ is defined as [ overlinemathcalG_P(x) = frac1left|Pright|sum_pin Pe-leftlVert x-p rightrVert2 ] for any $xinmathbbRd$.
We study how to construct a small subset $Q$ of $P
arXiv Detail & Related papers (2020-07-15T22:58:50Z) - Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss [76.02734481158458]
It is known that in the worst case, to obtain a good rank-$k$ approximation to a matrix, one needs an arbitrarily large $nOmega(1)$ number of columns.
We show that under certain minimal and realistic distributional settings, it is possible to obtain a $(k/epsilon)$-approximation with a nearly linear running time and poly$(k/epsilon)+O(klog n)$ columns.
This is the first algorithm of any kind for achieving a $(k/epsilon)$-approximation for entrywise
arXiv Detail & Related papers (2020-04-16T22:57:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.