Parametric UMAP embeddings for representation and semi-supervised
learning
- URL: http://arxiv.org/abs/2009.12981v4
- Date: Sun, 29 Aug 2021 16:17:30 GMT
- Title: Parametric UMAP embeddings for representation and semi-supervised
learning
- Authors: Tim Sainburg, Leland McInnes, Timothy Q Gentner
- Abstract summary: UMAP is a non-parametric graph-based dimensionality reduction algorithm to find low-dimensional embeddings of structured data.
We show that Parametric UMAP performs comparably to its non-parametric counterpart while conferring the benefit of a learned parametric mapping.
We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semi-supervised learning.
- Score: 0.03823356975862005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: UMAP is a non-parametric graph-based dimensionality reduction algorithm using
applied Riemannian geometry and algebraic topology to find low-dimensional
embeddings of structured data. The UMAP algorithm consists of two steps: (1)
Compute a graphical representation of a dataset (fuzzy simplicial complex), and
(2) Through stochastic gradient descent, optimize a low-dimensional embedding
of the graph. Here, we extend the second step of UMAP to a parametric
optimization over neural network weights, learning a parametric relationship
between data and embedding. We first demonstrate that Parametric UMAP performs
comparably to its non-parametric counterpart while conferring the benefit of a
learned parametric mapping (e.g. fast online embeddings for new data). We then
explore UMAP as a regularization, constraining the latent distribution of
autoencoders, parametrically varying global structure preservation, and
improving classifier accuracy for semi-supervised learning by capturing
structure in unlabeled data. Google Colab walkthrough:
https://colab.research.google.com/drive/1WkXVZ5pnMrm17m0YgmtoNjM_XHdnE5Vp?usp=sharing
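To make the learned parametric mapping concrete, below is a minimal sketch using the ParametricUMAP class shipped with the authors' umap-learn package (it uses a TensorFlow/Keras backend; the random arrays here are stand-ins for real data):

    import numpy as np
    from umap.parametric_umap import ParametricUMAP

    X_train = np.random.rand(1000, 64)           # stand-in for real training data
    embedder = ParametricUMAP(n_components=2)    # encoder defaults to a small MLP
    Z_train = embedder.fit_transform(X_train)    # optimizes network weights, not point positions

    # Because the mapping is a trained network, new data can be embedded
    # without re-running the optimization (the "fast online embeddings" above)
    Z_new = embedder.transform(np.random.rand(10, 64))

The autoencoder and reconstruction variants discussed in the abstract are exposed in recent umap-learn releases through constructor options (e.g. parametric_reconstruction and autoencoder_loss); check the current documentation, as these option names may change between versions.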
Related papers
- Hyperboloid GPLVM for Discovering Continuous Hierarchies via Nonparametric Estimation [41.13597666007784]
Dimensionality reduction (DR) offers a useful representation of complex high-dimensional data.
Recent DR methods focus on hyperbolic geometry to derive a faithful low-dimensional representation of hierarchical data.
This paper presents hGP-LVMs to embed high-dimensional hierarchical data with implicit continuity via nonparametric estimation.
arXiv Detail & Related papers (2024-10-22T05:07:30Z)
- Automatic Parameterization for Aerodynamic Shape Optimization via Deep Geometric Learning [60.69217130006758]
We propose two deep learning models that fully automate shape parameterization for aerodynamic shape optimization.
Both models parameterize shapes via deep geometric learning, embedding human prior knowledge into the learned geometric patterns.
We perform shape optimization experiments on 2D airfoils and discuss the applicable scenarios for the two models.
arXiv Detail & Related papers (2023-05-03T13:45:40Z)
- Efficient Parametric Approximations of Neural Network Function Space Distance [6.117371161379209]
It is often useful to compactly summarize important properties of model parameters and training data so that they can be used later without storing and/or iterating over the entire dataset.
We consider estimating the Function Space Distance (FSD) over a training set, i.e. the average discrepancy between the outputs of two neural networks.
We propose a Linearized Activation TRick (LAFTR) and derive an efficient approximation to FSD for ReLU neural networks.
arXiv Detail & Related papers (2023-02-07T15:09:23Z)
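For context, the FSD summarized above is cheap to define but expensive to maintain: computing it exactly requires a full pass over the training set, which is what the paper's approximation avoids. A minimal PyTorch sketch of the exact quantity, assuming an L2 output discrepancy (f1, f2, and loader are hypothetical stand-ins):

    import torch

    def function_space_distance(f1, f2, loader):
        # Exact FSD: average discrepancy between two networks' outputs over a
        # training set -- the quantity the paper approximates without storing data
        total, n = 0.0, 0
        with torch.no_grad():
            for x, _ in loader:
                total += (f1(x) - f2(x)).pow(2).sum(dim=-1).sqrt().sum().item()
                n += x.shape[0]
        return total / n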
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning that seeks to fit every new datapoint exactly while updating the parameters in the direction that least changes the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
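The IPCA building block mentioned above is available off the shelf; a minimal sketch of incrementally summarizing a stream in a low-rank subspace (random batches stand in for the stream; this shows the ingredient, not the paper's full algorithm):

    import numpy as np
    from sklearn.decomposition import IncrementalPCA

    ipca = IncrementalPCA(n_components=16)
    for _ in range(100):                      # stand-in for an incoming data stream
        batch = np.random.randn(64, 256)      # 64 new datapoints, 256 features
        ipca.partial_fit(batch)               # update the principal subspace in place
    summary = ipca.transform(np.random.randn(5, 256))  # project new points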
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on image classification across all three datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
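NPC-LV derives its distances from a deep latent variable model's compression lengths. As a classical stand-in for the same idea, here is a minimal sketch of non-parametric classification by compression using gzip and the normalized compression distance (an illustration of the underlying principle, not the paper's method):

    import gzip

    def ncd(a: bytes, b: bytes) -> float:
        # Normalized compression distance: a real compressor stands in for
        # the (uncomputable) Kolmogorov complexity
        ca, cb = len(gzip.compress(a)), len(gzip.compress(b))
        cab = len(gzip.compress(a + b))
        return (cab - min(ca, cb)) / max(ca, cb)

    def classify(x: bytes, labeled: list[tuple[bytes, int]]) -> int:
        # 1-NN under NCD: nothing is trained, so very few labels are needed
        return min(labeled, key=lambda pair: ncd(x, pair[0]))[1]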
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only a single step, without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
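The one-step gradient matching objective above is simple to state; a generic PyTorch sketch on an arbitrary differentiable model follows (a cosine gradient distance is assumed here, and the paper's graph-specific machinery is omitted):

    import torch
    import torch.nn.functional as F

    def one_step_gradient_match(model, real_x, real_y, syn_x, syn_y):
        # Gradient of the loss on real data, treated as a fixed target
        g_real = [g.detach() for g in torch.autograd.grad(
            F.cross_entropy(model(real_x), real_y), model.parameters())]
        # Gradient on the synthetic data, kept differentiable so the matching
        # loss can be backpropagated into the synthetic examples themselves
        g_syn = torch.autograd.grad(
            F.cross_entropy(model(syn_x), syn_y), model.parameters(),
            create_graph=True)
        # Match the two gradients at a single step -- no inner training loop
        return sum(1 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0)
                   for gr, gs in zip(g_real, g_syn))

Minimizing this loss with respect to syn_x (created with requires_grad=True) condenses the data without ever training the network weights.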
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
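Mapping Euclidean network outputs into hyperbolic space, as in the paper above, is typically done with the exponential map at the origin of the Poincaré ball; a self-contained sketch (the curvature parameter c is an assumption):

    import torch

    def expmap0(v: torch.Tensor, c: float = 1.0, eps: float = 1e-7) -> torch.Tensor:
        # Exponential map at the origin of the Poincare ball of curvature -c:
        # exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||)
        sqrt_c = c ** 0.5
        norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
        return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

    # e.g. project transformer output embeddings into the ball before
    # computing geodesic distances for a metric-learning loss
    z = expmap0(torch.randn(8, 128))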
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperforms pure model-based graph classifiers and achieves performance comparable to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
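Algorithm unfolding as described above is easiest to see on a standard iterative solver. A generic sketch unrolling ISTA for sparse coding, with per-layer learnable step sizes and thresholds (an illustration of the unfolding pattern, not the paper's GDPA-based network):

    import torch
    import torch.nn as nn

    class UnrolledISTA(nn.Module):
        # Each ISTA iteration for min_x 0.5*||Ax - y||^2 + lam*||x||_1
        # becomes a layer with its own learnable step size and threshold
        def __init__(self, A: torch.Tensor, n_layers: int = 10):
            super().__init__()
            self.A = A
            self.step = nn.Parameter(torch.full((n_layers,), 0.1))
            self.thr = nn.Parameter(torch.full((n_layers,), 0.01))

        def forward(self, y: torch.Tensor) -> torch.Tensor:
            x = torch.zeros(self.A.shape[1])
            for step, thr in zip(self.step, self.thr):
                z = x - step * self.A.T @ (self.A @ x - y)     # gradient step
                x = torch.sign(z) * torch.relu(z.abs() - thr)  # soft threshold
            return x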
- Graph Signal Restoration Using Nested Deep Algorithm Unrolling [85.53158261016331]
Graph signal processing is a ubiquitous task in many applications such as sensor, social, transportation, and brain networks, point cloud processing, and graph networks.
We propose two restoration methods based on deep algorithm unrolling of the alternating direction method of multipliers (ADMM).
All parameters in the proposed restoration methods are trainable in an end-to-end manner.
arXiv Detail & Related papers (2021-06-30T08:57:01Z)
- Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach [7.752212921476838]
We propose the systematic use of symmetric spaces in representation learning, a class encompassing many of the previously used embedding targets.
We develop a tool to analyze the embeddings and infer structural properties of the data sets.
Our approach outperforms competitive baselines for graph reconstruction tasks on various synthetic and real-world datasets.
arXiv Detail & Related papers (2021-06-09T09:33:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.