Manifold learning with arbitrary norms
- URL: http://arxiv.org/abs/2012.14172v1
- Date: Mon, 28 Dec 2020 10:24:30 GMT
- Title: Manifold learning with arbitrary norms
- Authors: Joe Kileel, Amit Moscovich, Nathan Zelesko, Amit Singer
- Abstract summary: We show in a numerical simulation that manifold learning based on Earthmover's distances outperforms the standard Euclidean variant for learning molecular shape spaces.
- Score: 8.433233101044197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manifold learning methods play a prominent role in nonlinear dimensionality
reduction and other tasks involving high-dimensional data sets with low
intrinsic dimensionality. Many of these methods are graph-based: they associate
a vertex with each data point and a weighted edge between each pair of close
points. Existing theory shows, under certain conditions, that the Laplacian
matrix of the constructed graph converges to the Laplace-Beltrami operator of
the data manifold. However, this result assumes the Euclidean norm is used for
measuring distances. In this paper, we determine the limiting differential
operator for graph Laplacians constructed using $\textit{any}$ norm. The proof
involves a subtle interplay between the second fundamental form of the
underlying manifold and the convex geometry of the norm's unit ball. To
motivate the use of non-Euclidean norms, we show in a numerical simulation that
manifold learning based on Earthmover's distances outperforms the standard
Euclidean variant for learning molecular shape spaces, in terms of both sample
complexity and computational complexity.
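To make the graph construction concrete, here is a minimal sketch (an illustration, not the authors' implementation) of an unnormalized graph Laplacian whose Gaussian edge weights are computed under a chosen norm; the bandwidth eps, the kernel choice, and the toy data are assumptions for demonstration.

```python
# Minimal sketch (an assumption, not the authors' code): unnormalized graph
# Laplacian L = D - W with Gaussian weights computed under a chosen norm.
# "cityblock" gives the l1 norm; the bandwidth eps is illustrative.
import numpy as np
from scipy.spatial.distance import cdist

def graph_laplacian(X, eps, metric="euclidean"):
    """L = D - W, where w_ij = exp(-d(x_i, x_j)^2 / eps) and d is `metric`."""
    dists = cdist(X, X, metric=metric)   # pairwise distances in the chosen norm
    W = np.exp(-dists**2 / eps)          # Gaussian kernel weights
    np.fill_diagonal(W, 0.0)             # no self-loops
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # toy point cloud
L_euclidean = graph_laplacian(X, eps=0.5, metric="euclidean")
L_l1 = graph_laplacian(X, eps=0.5, metric="cityblock")
```

Swapping the metric changes the weight matrix and hence the graph Laplacian; the paper characterizes how this choice shows up in the limiting differential operator.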
Related papers
- Reconstructing the Geometry of Random Geometric Graphs [9.004991291124096]
Random geometric graphs are random graph models defined on metric spaces.
We show how to efficiently reconstruct the geometry of the underlying space from the sampled graph.
arXiv Detail & Related papers (2024-02-14T21:34:44Z)
- Improving embedding of graphs with missing data by soft manifolds [51.425411400683565]
The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure.
We introduce a new class of manifolds, named soft manifolds, that can address this mismatch.
Using soft manifolds for graph embedding, we can provide continuous spaces for data analysis tasks over complex datasets.
arXiv Detail & Related papers (2023-11-29T12:48:33Z)
- Tight and fast generalization error bound of graph embedding in metric space [54.279425319381374]
We show that graph embedding in a non-Euclidean metric space can outperform graph embedding in Euclidean space with far less training data than the existing bound suggests.
Our new upper bound is significantly tighter and faster than the existing one, which can be exponential in $R$ and $O(\frac{1}{S})$ at the fastest.
arXiv Detail & Related papers (2023-05-13T17:29:18Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometric method is a modification, for sparse data, of the well-known box-counting algorithm for Minkowski dimension calculation (a toy sketch of classical box counting follows this entry).
Experiments on real datasets show that the suggested approach, which combines the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
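As context for the entry above, here is a minimal sketch of classical box counting for the Minkowski (box-counting) dimension; the paper's sparse-data modification is not reproduced, and the scales and toy data are illustrative assumptions.

```python
# Classical box counting: count occupied grid boxes at several scales and
# fit the slope of log N(eps) versus log(1/eps). Not the paper's modified
# algorithm; scales and test data are illustrative.
import numpy as np

def box_counting_dimension(X, scales):
    """Estimate the box-counting dimension of a point cloud X (n x d)."""
    counts = []
    for eps in scales:
        boxes = np.floor(X / eps).astype(int)          # assign points to grid boxes
        counts.append(len({tuple(b) for b in boxes}))  # number of occupied boxes
    # The slope of log N(eps) vs. log(1/eps) approximates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)), np.log(counts), 1)
    return slope

# Sanity check: points on a circle in R^3 should give a dimension near 1.
t = np.linspace(0, 2 * np.pi, 5000, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
print(box_counting_dimension(circle, scales=[0.2, 0.1, 0.05, 0.025]))
```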
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general-purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a data matrix and define the context of data points (a sketch of such a neighborhood graph follows this entry).
Experiments on data classification and clustering over eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
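The neighborhood similarity graph mentioned above can be illustrated as follows; this is a generic k-nearest-neighbor construction, not the Vec2vec code, and k and the cosine weighting are assumptions.

```python
# Generic kNN similarity graph (an assumption, not the Vec2vec code):
# connect each point to its k nearest neighbors, weighted by cosine similarity.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighborhood_graph(X, k=10):
    """Edges i -> j for the k nearest neighbors of each row of X."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point is its own neighbor
    _, idx = nn.kneighbors(X)
    rows = np.repeat(np.arange(len(X)), k)
    cols = idx[:, 1:].ravel()                        # drop the self-neighbor
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = np.einsum("ij,ij->i", Xn[rows], Xn[cols]) # cosine similarity per edge
    return rows, cols, sims
```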
- Linear Classifiers in Mixed Constant Curvature Spaces [40.82908295137667]
We address the problem of linear classification in a product space form -- a mix of Euclidean, spherical, and hyperbolic spaces.
We prove that linear classifiers in $d$-dimensional constant curvature spaces can shatter exactly $d+1$ points.
We describe a novel perceptron classification algorithm and establish rigorous convergence results (a sketch of the flat Euclidean perceptron follows this entry).
arXiv Detail & Related papers (2021-02-19T23:29:03Z)
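For reference, the flat Euclidean special case of the perceptron reads as follows; the paper's spherical and hyperbolic variants are not reproduced here.

```python
# Classical Euclidean perceptron (the flat special case; not the paper's
# curved-space algorithm). Labels y are in {-1, +1}.
import numpy as np

def perceptron(X, y, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:  # misclassified (or on the boundary)
                w += yi * xi             # standard perceptron update
                errors += 1
        if errors == 0:                  # converged on separable data
            break
    return w
```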
- Survey: Geometric Foundations of Data Reduction [2.238700807267101]
The purpose of this survey is to briefly introduce nonlinear dimensionality reduction (NLDR) in data reduction.
In 2001, the concept of manifold learning first appeared as an NLDR method called Laplacian Eigenmaps.
We derive each spectral manifold learning method in matrix and operator representations, and we then discuss the convergence behavior of each method in a uniform geometric language (a toy Laplacian Eigenmaps sketch follows this entry).
arXiv Detail & Related papers (2020-08-16T07:59:22Z)
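Since the survey singles out Laplacian Eigenmaps, here is a toy spectral-embedding sketch in the same Gaussian-kernel setting as the first snippet; the bandwidth and kernel are illustrative assumptions, not the survey's derivation.

```python
# Toy Laplacian Eigenmaps: embed points using the lowest nontrivial
# eigenvectors of an unnormalized graph Laplacian. Illustrative only.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def laplacian_eigenmaps(X, eps, dim=2):
    dists = cdist(X, X)                # Euclidean pairwise distances
    W = np.exp(-dists**2 / eps)        # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W     # unnormalized Laplacian
    vals, vecs = eigh(L)               # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]          # skip the constant eigenvector

# Usage: unroll a noisy circle into two spectral coordinates.
t = np.linspace(0, 2 * np.pi, 300, endpoint=False)
X = np.stack([np.cos(t), np.sin(t)], axis=1)
X += 0.01 * np.random.default_rng(0).normal(size=X.shape)
Y = laplacian_eigenmaps(X, eps=0.1)
```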
- Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method based on a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
- Ultrahyperbolic Representation Learning [13.828165530602224]
In machine learning, data is usually represented in a (flat) Euclidean space, where distances between points are measured along straight lines.
We propose a representation living on a pseudo-Riemannian manifold of constant nonzero curvature.
We provide the necessary learning tools in this geometry and extend gradient-based optimization techniques (a sketch of the underlying indefinite inner product follows this entry).
arXiv Detail & Related papers (2020-07-01T03:49:24Z)
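The pseudo-Riemannian geometry above rests on an indefinite inner product; here is a minimal sketch of one with q negative and d - q positive directions. The signature convention is an assumption, and this is not the paper's implementation.

```python
# Indefinite (pseudo-Euclidean) inner product: q coordinates contribute
# negatively, the rest positively, so squared "lengths" can take any sign.
# Signature convention is an assumption; not the paper's code.
import numpy as np

def pseudo_inner(x, y, q):
    return -np.dot(x[:q], y[:q]) + np.dot(x[q:], y[q:])

x = np.array([1.0, 0.0, 2.0])
y = np.array([0.5, 1.0, 1.0])
print(pseudo_inner(x, y, q=1))  # can be negative, zero, or positive
```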
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves a linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.