Deep Recursive Embedding for High-Dimensional Data
- URL: http://arxiv.org/abs/2111.00622v1
- Date: Sun, 31 Oct 2021 23:22:33 GMT
- Title: Deep Recursive Embedding for High-Dimensional Data
- Authors: Zixia Zhou, Xinrui Zu, Yuanyuan Wang, Boudewijn P.F. Lelieveldt, Qian Tao
- Abstract summary: We propose to combine deep neural networks (DNN) with mathematics-guided embedding rules for high-dimensional data embedding.
We introduce a generic deep embedding network (DEN) framework, which is able to learn a parametric mapping from high-dimensional space to low-dimensional space.
- Score: 9.611123249318126
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Embedding high-dimensional data onto a low-dimensional manifold is of both
theoretical and practical value. In this paper, we propose to combine deep
neural networks (DNN) with mathematics-guided embedding rules for
high-dimensional data embedding. We introduce a generic deep embedding network
(DEN) framework, which is able to learn a parametric mapping from
high-dimensional space to low-dimensional space, guided by well-established
objectives such as Kullback-Leibler (KL) divergence minimization. We further
propose a recursive strategy, called deep recursive embedding (DRE), to make
use of the latent data representations for boosted embedding performance. We
exemplify the flexibility of DRE with different architectures and loss functions,
and benchmark our method against the two most popular embedding methods,
namely, t-distributed stochastic neighbor embedding (t-SNE) and uniform
manifold approximation and projection (UMAP). The proposed DRE method can map
out-of-sample data and scale to extremely large datasets. Experiments on a
range of public datasets demonstrated improved embedding performance in terms
of local and global structure preservation, compared with other
state-of-the-art embedding methods.
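The abstract's recipe, a neural network trained as a parametric embedding map under a KL-divergence objective, can be pictured concretely. The following is a minimal PyTorch sketch, assuming t-SNE-style affinities P are precomputed for each batch; the architecture, the Student-t output kernel, and the name DeepEmbeddingNetwork are illustrative choices made here, not the authors' implementation.
```python
# Minimal sketch of a KL-guided deep embedding network; illustrative, not the
# authors' code. Assumes t-SNE-style affinities P are precomputed per batch.
import torch
import torch.nn as nn

class DeepEmbeddingNetwork(nn.Module):
    """Parametric map from high-dimensional inputs to a 2-D embedding."""
    def __init__(self, in_dim: int, out_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, out_dim),
        )

    def forward(self, x):
        return self.net(x)

def kl_embedding_loss(P, Y, eps=1e-12):
    """KL(P || Q) with a Student-t kernel on the embedding, as in t-SNE."""
    d2 = torch.cdist(Y, Y).pow(2)                   # pairwise squared distances
    num = 1.0 / (1.0 + d2)                          # heavy-tailed similarities
    eye = torch.eye(len(Y), dtype=torch.bool, device=Y.device)
    num = num.masked_fill(eye, 0.0)                 # zero out self-similarity
    Q = num / num.sum()                             # normalized low-dim affinities
    return (P * torch.log((P + eps) / (Q + eps))).sum()

model = DeepEmbeddingNetwork(in_dim=784)            # e.g. flattened MNIST images
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(X, P):
    """One gradient step on a batch X with its affinity matrix P."""
    opt.zero_grad()
    loss = kl_embedding_loss(P, model(X))
    loss.backward()
    opt.step()
    return loss.item()
```
A recursive (DRE-style) variant would periodically recompute the affinities from the network's latent activations and continue training; that outer loop is omitted here.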
Related papers
- Hyperboloid GPLVM for Discovering Continuous Hierarchies via Nonparametric Estimation [41.13597666007784]
Dimensionality reduction (DR) offers a useful representation of complex high-dimensional data.
Recent DR methods focus on hyperbolic geometry to derive a faithful low-dimensional representation of hierarchical data.
This paper presents hGP-LVMs to embed high-dimensional hierarchical data with implicit continuity via nonparametric estimation.
arXiv Detail & Related papers (2024-10-22T05:07:30Z)
- Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets [11.105392318582677]
We propose a principled approach for aligning and jointly embedding a pair of datasets with theoretical guarantees.
Our approach leverages the leading singular vectors of the EOT plan matrix between two datasets to extract their shared underlying structure.
We show that in a high-dimensional regime, the EOT plan recovers the shared manifold structure by approximating a kernel function evaluated at the locations of the latent variables.
arXiv Detail & Related papers (2024-07-01T18:48:55Z)
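For intuition, the pipeline this summary describes (an entropic OT plan via Sinkhorn iterations, then its leading singular vectors as shared coordinates) can be sketched in a few lines of NumPy. The regularization value, iteration count, and the decision to skip the trivial leading singular pair are assumptions of this sketch, not details taken from the paper.
```python
# Sketch: entropic OT plan between two datasets, then leading singular
# vectors of the plan as a joint embedding. Illustrative, not the paper's code.
import numpy as np

def sinkhorn_plan(X, Y, reg=0.1, n_iter=500):
    """Entropic OT plan between uniform measures on the rows of X and Y."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
    K = np.exp(-C / reg)                                # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    v = np.ones(len(Y))
    for _ in range(n_iter):                             # Sinkhorn scaling loop
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]                  # the EOT plan

def eot_joint_embedding(X, Y, k=2, reg=0.1):
    """Joint k-D coordinates for both datasets from the plan's singular vectors."""
    P = sinkhorn_plan(X, Y, reg)
    U, s, Vt = np.linalg.svd(P)
    # Skip the trivial leading pair; the next k singular vectors are assumed
    # here to carry the shared structure.
    return U[:, 1:k + 1], Vt[1:k + 1].T
```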
- Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD).
This method systematically explores hierarchical layers within generative adversarial networks (GANs).
In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
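A rough sense of the idea: compute a Gromov-Wasserstein coupling between the geometry of n input points and a much smaller k-point configuration, then read cluster assignments off the coupling. The sketch below uses the POT library and fixes the reduced configuration at random, which is a simplification; the paper optimizes the reduced distribution jointly within a single problem.
```python
# Sketch of distributional reduction via a Gromov-Wasserstein coupling.
# Simplification: the reduced k-point geometry is fixed at random here,
# whereas the paper optimizes it jointly.
import numpy as np
import ot  # POT: Python Optimal Transport

def gw_reduce(X, k=3, seed=0):
    """Couple n input points with k 'reduced' points via Gromov-Wasserstein."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((k, 2))              # fixed reduced configuration
    C1 = ot.dist(X, X)                           # input-space geometry
    C2 = ot.dist(Z, Z)                           # reduced-space geometry
    p = np.full(len(X), 1.0 / len(X))            # uniform input weights
    q = np.full(k, 1.0 / k)                      # uniform reduced weights
    T = ot.gromov.gromov_wasserstein(C1, C2, p, q, 'square_loss')
    return T.argmax(axis=1), T                   # hard cluster labels, coupling
```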
- A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z)
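One plausible instantiation of a heat-kernel embedding is sketched below, assuming a Gaussian affinity graph, a Varadhan-style conversion of the heat kernel into distances, and classical MDS for the final coordinates; the paper's actual heat-geodesic construction differs in detail.
```python
# Sketch of a heat-kernel manifold embedding: heat kernel from a graph
# Laplacian, a Varadhan-style distance, then MDS. Illustrative only.
import numpy as np
from scipy.linalg import expm
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

def heat_geodesic_embedding(X, t=1.0, sigma=1.0, k=2):
    """Embed X by heat-kernel distances plus metric MDS."""
    W = np.exp(-squareform(pdist(X)) ** 2 / (2 * sigma ** 2))  # affinity graph
    L = np.diag(W.sum(axis=1)) - W                             # graph Laplacian
    H = expm(-t * L)                                           # heat kernel e^{-tL}
    # Varadhan: d(x, y)^2 ~ -4t log h_t(x, y) for small t.
    D2 = np.maximum(-4.0 * t * np.log(np.clip(H, 1e-12, None)), 0.0)
    D = np.sqrt(0.5 * (D2 + D2.T))                             # symmetrize
    return MDS(n_components=k, dissimilarity="precomputed").fit_transform(D)
```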
- Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings [1.7188280334580195]
This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved.
The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces.
arXiv Detail & Related papers (2021-09-22T06:45:37Z)
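The coherence idea (new items should land where the existing embedding already put similar points) can be approximated generically: place each new point at the similarity-weighted average of its nearest reference neighbors' coordinates. This heuristic is an assumption of the sketch, not Index $t$-SNE's exact update rule.
```python
# Sketch: reuse a reference embedding to place new items coherently.
# Generic out-of-sample heuristic, not Index t-SNE's actual algorithm.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def embed_new_items(X_ref, Y_ref, X_new, n_neighbors=10, sigma=1.0):
    """Place X_new in the embedding Y_ref learned for the reference set X_ref."""
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(X_ref)
    dist, idx = nn.kneighbors(X_new)                 # neighbors in input space
    w = np.exp(-dist ** 2 / (2 * sigma ** 2))        # Gaussian weights
    w /= w.sum(axis=1, keepdims=True)
    # Weighted average of the neighbors' existing embedding positions keeps
    # new points inside the clusters they belong to.
    return (w[:, :, None] * Y_ref[idx]).sum(axis=1)
```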
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z)
- Deep Recursive Embedding for High-Dimensional Data [10.499461691493526]
We propose to combine deep neural networks (DNN) with mathematically grounded embedding rules for high-dimensional data embedding.
Our experiments demonstrated the excellent performance of the proposed Deep Recursive Embedding (DRE) on high-dimensional data embedding.
arXiv Detail & Related papers (2021-04-12T03:04:38Z)
- Invertible Manifold Learning for Dimension Reduction [44.16432765844299]
Dimension reduction (DR) aims to learn low-dimensional representations of high-dimensional data with the preservation of essential information.
We propose a novel two-stage DR method, called invertible manifold learning (inv-ML) to bridge the gap between theoretical information-lossless and practical DR.
Experiments are conducted on seven datasets with a neural network implementation of inv-ML, called i-ML-Enc.
arXiv Detail & Related papers (2020-10-07T14:22:51Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched-prior-based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical-structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It overcomes the drawback of existing methods that seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step.
arXiv Detail & Related papers (2020-05-19T05:54:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.