Learning 1-Dimensional Submanifolds for Subsequent Inference on Random
Dot Product Graphs
- URL: http://arxiv.org/abs/2004.07348v6
- Date: Fri, 24 Dec 2021 19:09:24 GMT
- Title: Learning 1-Dimensional Submanifolds for Subsequent Inference on Random
Dot Product Graphs
- Authors: Michael W. Trosset, Mingyue Gao, Minh Tang, Carey E. Priebe
- Abstract summary: Manifold learning can be used to learn the unknown submanifold well enough to realize a benefit from restricted inference.
We propose test statistics that deploy the Isomap procedure for manifold learning.
We also apply our methods to an inference problem that arises in studying the connectome of the Drosophila larval mushroom body.
- Score: 7.066109885246681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A random dot product graph (RDPG) is a generative model for networks in which
vertices correspond to positions in a latent Euclidean space and edge
probabilities are determined by the dot products of the latent positions. We
consider RDPGs for which the latent positions are randomly sampled from an
unknown $1$-dimensional submanifold of the latent space. In principle,
restricted inference, i.e., procedures that exploit the structure of the
submanifold, should be more effective than unrestricted inference; however, it
is not clear how to conduct restricted inference when the submanifold is
unknown. We submit that techniques for manifold learning can be used to learn
the unknown submanifold well enough to realize benefit from restricted
inference. To illustrate, we test $1$- and $2$-sample hypotheses about the
Fréchet means of small communities of vertices, using the complete set of
vertices to infer latent structure. We propose test statistics that deploy the
Isomap procedure for manifold learning, using shortest path distances on
neighborhood graphs constructed from estimated latent positions to estimate arc
lengths on the unknown $1$-dimensional submanifold. Unlike conventional
applications of Isomap, the estimated latent positions do not lie on the
submanifold of interest. We extend existing convergence results for Isomap to
this setting and use them to demonstrate that, as the number of auxiliary
vertices increases, the power of our test converges to the power of the
corresponding test when the submanifold is known. Finally, we apply our methods
to an inference problem that arises in studying the connectome of the
Drosophila larval mushroom body. The univariate learnt manifold test rejects
($p<0.05$), while the multivariate ambient space test does not ($p\gg0.05$),
illustrating the value of identifying and exploiting low-dimensional structure
for subsequent inference.
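The pipeline the abstract describes (sample an RDPG whose latent positions lie on a 1-dimensional submanifold, estimate positions by adjacency spectral embedding, then estimate arc lengths via shortest paths on a neighborhood graph, as in Isomap) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the quarter-circle curve, the sample size `n`, the neighborhood size `k`, and the random seed are arbitrary choices for the sketch.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

rng = np.random.default_rng(0)

# Latent positions sampled from a 1-D submanifold of R^2 (an arc of the
# unit circle, so all pairwise dot products lie in (0, 1]).
n = 300
t = rng.uniform(0.1, 0.9, size=n)             # curve parameter ~ arc length
X = np.column_stack([np.cos(t), np.sin(t)])

# RDPG: edge {i, j} appears independently with probability <x_i, x_j>.
P = X @ X.T
np.fill_diagonal(P, 0)
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric, hollow adjacency

# Adjacency spectral embedding: top-2 eigenpairs of A, scaled.
vals, vecs = np.linalg.eigh(A)
idx = np.argsort(vals)[::-1][:2]
Xhat = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

# Isomap-style arc-length estimates: k-NN graph on the estimated
# positions, then graph shortest-path distances.
k = 10
D = np.linalg.norm(Xhat[:, None, :] - Xhat[None, :, :], axis=-1)
W = np.full_like(D, np.inf)                   # inf = no edge (dense input)
nn = np.argsort(D, axis=1)[:, 1:k + 1]        # skip self at column 0
for i in range(n):
    W[i, nn[i]] = D[i, nn[i]]
W = np.minimum(W, W.T)                        # symmetrize the k-NN graph
G = shortest_path(W, method="D")              # estimated arc-length matrix
```

Note that `Xhat` recovers the latent positions only up to an orthogonal transformation, but the shortest-path distances used here are invariant to that ambiguity, which is what makes arc lengths on the submanifold estimable at all.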
Related papers
- Don't Eliminate Cut: Exponential Separations in LLM-Based Theorem Proving [8.948475969696075]
We develop a theoretical analysis of LLM-guided formal theorem proving in interactive proof assistants (e.g., Lean). To capture modern representation learning, we treat the state and action spaces as general compact metric spaces and assume Lipschitz policies. Our main separation result shows that when cut elimination expands a DAG of depth $D$ into a cut-free tree of exponential size while the cut-aware hierarchical process has size $O(D)$, a flat learner provably requires exponentially more data than a cut-aware hierarchical learner.
arXiv Detail & Related papers (2026-02-11T04:24:09Z) - The informativeness of the gradient revisited [4.178980693837599]
We give a general bound on the variance in terms of a parameter related to the pairwise independence of the target function class. In addition to the theoretical analysis, we present experiments to better understand the nature of recent deep-learning-based attacks on Learning with Errors.
arXiv Detail & Related papers (2025-05-28T09:23:37Z) - Recovering Manifold Structure Using Ollivier-Ricci Curvature [1.9458156037869137]
We introduce ORC-ManL, a new algorithm to prune spurious edges from nearest neighbor graphs using a criterion based on Ollivier-Ricci curvature and estimated metric distortion.
Our motivation comes from manifold learning: we show that when the data generating the nearest-neighbor graph consists of noisy samples from a low-dimensional manifold, edges that shortcut through the ambient space have more negative Ollivier-Ricci curvature than edges that lie along the data manifold.
arXiv Detail & Related papers (2024-10-02T01:00:30Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $\mu_\phi$ based on the discretized Markov process are considered.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
arXiv Detail & Related papers (2023-12-22T18:01:11Z) - Sampling in Constrained Domains with Orthogonal-Space Variational
Gradient Descent [13.724361914659438]
We propose a new variational framework with a designed orthogonal-space gradient flow (O-Gradient) for sampling on a manifold.
We prove that O-Gradient converges to the target constrained distribution at rate $\widetilde{O}(1/\text{number of iterations})$ under mild conditions.
arXiv Detail & Related papers (2022-10-12T17:51:13Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Koopman Methods for Estimation of Animal Motions over Unknown, Regularly
Embedded Submanifolds [0.9558392439655015]
This paper introduces a method to estimate forward kinematics from the unknown configuration submanifold $Q$ to an $n$-dimensional Euclidean space $Y := \mathbb{R}^n$ of observations.
We show that the derived rates of convergence can be applied to estimates generated by the extended dynamic mode decomposition (EDMD) method.
arXiv Detail & Related papers (2022-03-10T21:20:19Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic
Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification, for sparse data, of the well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, based on the combination of the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z) - q-Paths: Generalizing the Geometric Annealing Path using Power Means [51.73925445218366]
We introduce $q$-paths, a family of paths which includes the geometric and arithmetic mixtures as special cases.
We show that small deviations away from the geometric path yield empirical gains for Bayesian inference.
arXiv Detail & Related papers (2021-07-01T21:09:06Z) - Lattice partition recovery with dyadic CART [79.96359947166592]
We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature.
In this paper we consider instead the problem of partition recovery, i.e., of estimating the partition of the lattice induced by the constancy regions of the unknown signal.
We prove that a DCART-based procedure consistently estimates the underlying partition at a rate of order $\sigma^2 k^*$.
arXiv Detail & Related papers (2021-05-27T23:41:01Z) - Manifold Fitting under Unbounded Noise [4.54773250519101]
We introduce a new manifold-fitting method, by which the output manifold is constructed by directly estimating the tangent spaces at the projected points on the underlying manifold.
Our new method provides theoretical convergence in high probability, in terms of the upper bound of the distance between the estimated and underlying manifold.
arXiv Detail & Related papers (2019-09-23T08:55:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.