Statistical embedding: Beyond principal components
- URL: http://arxiv.org/abs/2106.01858v1
- Date: Thu, 3 Jun 2021 14:01:21 GMT
- Title: Statistical embedding: Beyond principal components
- Authors: Dag Tj{\o}stheim and Martin Jullum and Anders L{\o}land
- Abstract summary: Three methods are presented: $t$-SNE, UMAP and LargeVis based on methods in parts one, two and three, respectively.
The methods are illustrated and compared on two simulated data sets; one consisting of a triple of noisy Ranunculoid curves, and one consisting of networks of increasing complexity and with two types of nodes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been an intense recent activity in embedding of very high
dimensional and nonlinear data structures, much of it in the data science and
machine learning literature. We survey this activity in four parts. In the
first part we cover nonlinear methods such as principal curves,
multidimensional scaling, local linear methods, ISOMAP, graph based methods and
kernel based methods. The second part is concerned with topological embedding
methods, in particular mapping topological properties into persistence
diagrams. Another type of data sets with a tremendous growth is very
high-dimensional network data. The task considered in part three is how to
embed such data in a vector space of moderate dimension to make the data
amenable to traditional techniques such as cluster and classification
techniques. The final part of the survey deals with embedding in
$\mathbb{R}^2$, which is visualization. Three methods are presented: $t$-SNE,
UMAP and LargeVis based on methods in parts one, two and three, respectively.
The methods are illustrated and compared on two simulated data sets; one
consisting of a triple of noisy Ranunculoid curves, and one consisting of
networks of increasing complexity and with two types of nodes.
Related papers
- Dissecting embedding method: learning higher-order structures from data [0.0]
Geometric deep learning methods for data learning often include set of assumptions on the geometry of the feature space.
These assumptions together with data being discrete and finite can cause some generalisations, which are likely to create wrong interpretations of the data and models outputs.
arXiv Detail & Related papers (2024-10-14T08:19:39Z) - A Closer Look at Deep Learning on Tabular Data [52.50778536274327]
Tabular data is prevalent across various domains in machine learning.
Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones.
arXiv Detail & Related papers (2024-07-01T04:24:07Z) - Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching [15.050801537501462]
We introduce a self-supervised multimodal learning strategy that combines mesh-based functional map regularisation with a contrastive loss that couples mesh and point cloud data.
Our shape matching approach allows to obtain intramodal correspondences for triangle meshes, complete point clouds, and partially observed point clouds.
We demonstrate that our method achieves state-of-the-art results on several challenging benchmark datasets.
arXiv Detail & Related papers (2023-03-20T09:47:02Z) - Geometry Contrastive Learning on Heterogeneous Graphs [50.58523799455101]
This paper proposes a novel self-supervised learning method, termed as Geometry Contrastive Learning (GCL)
GCL views a heterogeneous graph from Euclidean and hyperbolic perspective simultaneously, aiming to make a strong merger of the ability of modeling rich semantics and complex structures.
Extensive experiments on four benchmarks data sets show that the proposed approach outperforms the strong baselines.
arXiv Detail & Related papers (2022-06-25T03:54:53Z) - A Topological Approach for Semi-Supervised Learning [0.0]
We present new semi-supervised learning methods based on techniques from Topological Data Analysis (TDA)
In particular, we have created two semi-supervised learning methods following two different topological approaches.
The results show that the methods developed in this work outperform both the results obtained with models trained with only manually labelled data, and those obtained with classical semi-supervised learning methods.
arXiv Detail & Related papers (2022-05-19T15:23:39Z) - Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with
Coherent Embeddings [1.7188280334580195]
This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved.
The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces.
arXiv Detail & Related papers (2021-09-22T06:45:37Z) - Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifold, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence)
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z) - A Local Similarity-Preserving Framework for Nonlinear Dimensionality
Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points.
Experiments of data classification and clustering on eight real datasets show that Vec2vec is better than several classical dimensionality reduction methods in the statistical hypothesis test.
arXiv Detail & Related papers (2021-03-10T23:10:47Z) - TSGCNet: Discriminative Geometric Feature Learning with Two-Stream
GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z) - Primal-Dual Mesh Convolutional Neural Networks [62.165239866312334]
We propose a primal-dual framework drawn from the graph-neural-network literature to triangle meshes.
Our method takes features for both edges and faces of a 3D mesh as input and dynamically aggregates them.
We provide theoretical insights of our approach using tools from the mesh-simplification literature.
arXiv Detail & Related papers (2020-10-23T14:49:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.