SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities
- URL: http://arxiv.org/abs/2305.13797v2
- Date: Mon, 30 Oct 2023 10:34:03 GMT
- Title: SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities
- Authors: Hugues Van Assel, Titouan Vayer, Rémi Flamary, Nicolas Courty
- Abstract summary: Entropic affinities (EAs) are used in the popular Dimensionality Reduction (DR) algorithm t-SNE.
EAs are inherently asymmetric and row-wise stochastic, but they are used in DR approaches only after undergoing symmetrization methods.
In this work, we uncover a novel characterization of EA as an optimal transport problem, allowing a natural symmetrization that can be computed efficiently.
- Score: 14.919246099820548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many approaches in machine learning rely on a weighted graph to encode the
similarities between samples in a dataset. Entropic affinities (EAs), which are
notably used in the popular Dimensionality Reduction (DR) algorithm t-SNE, are
particular instances of such graphs. To ensure robustness to heterogeneous
sampling densities, EAs assign a kernel bandwidth parameter to every sample in
such a way that the entropy of each row in the affinity matrix is kept constant
at a specific value, whose exponential is known as perplexity. EAs are
inherently asymmetric and row-wise stochastic, but they are used in DR
approaches after undergoing heuristic symmetrization methods that violate both
the row-wise constant entropy and stochasticity properties. In this work, we
uncover a novel characterization of EA as an optimal transport problem,
allowing a natural symmetrization that can be computed efficiently using dual
ascent. The corresponding novel affinity matrix derives advantages from
symmetric doubly stochastic normalization in terms of clustering performance,
while also effectively controlling the entropy of each row thus making it
particularly robust to varying noise levels. Building on this, we present a new DR
algorithm, SNEkhorn, that leverages this new affinity matrix. We show its clear
superiority to state-of-the-art approaches with several indicators on both
synthetic and real-world datasets.
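The abstract's notion of entropic affinities can be made concrete: for each sample, a bandwidth is calibrated so that the entropy of that row of the affinity matrix equals the log of the chosen perplexity. Below is a minimal sketch of this per-row calibration via bisection, as done in classical (asymmetric) t-SNE-style EAs; it illustrates the construction the paper starts from, not the paper's symmetrized optimal-transport formulation, and the function name is ours.

```python
import numpy as np

def entropic_affinity_row(sq_dists, perplexity, tol=1e-5, max_iter=100):
    """Calibrate one row's bandwidth so that its entropy equals log(perplexity).

    sq_dists: squared distances from one sample to the others (self excluded).
    Returns the row-stochastic affinity vector for that sample.
    """
    target = np.log(perplexity)
    lo, hi = 1e-10, 1e10  # bisection bounds on the precision beta = 1/(2*sigma^2)
    for _ in range(max_iter):
        beta = np.sqrt(lo * hi)  # geometric midpoint suits the wide search range
        # Shift by the minimum distance for numerical stability; the
        # normalized distribution is unchanged.
        p = np.exp(-beta * (sq_dists - sq_dists.min()))
        p /= p.sum()
        entropy = -np.sum(p * np.log(p + 1e-12))
        if abs(entropy - target) < tol:
            break
        if entropy > target:
            lo = beta  # distribution too spread out: increase precision
        else:
            hi = beta  # distribution too peaked: decrease precision
    return p
```

Stacking these rows yields the asymmetric, row-stochastic affinity matrix whose heuristic symmetrization (e.g. averaging with its transpose) is exactly what the paper argues breaks the constant-entropy and stochasticity properties.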
Related papers
- Accelerated Discovery of Machine-Learned Symmetries: Deriving the
Exceptional Lie Groups G2, F4 and E6 [55.41644538483948]
This letter introduces two improved algorithms that significantly speed up the discovery of symmetry transformations.
Given the significant complexity of the exceptional Lie groups, our results demonstrate that this machine-learning method for discovering symmetries is completely general and can be applied to a wide variety of labeled datasets.
arXiv Detail & Related papers (2023-07-10T20:25:44Z) - Nonlinear SVD with Asymmetric Kernels: feature learning and asymmetric
Nyström method [14.470859959783995]
Asymmetric data naturally exist in real life, such as directed graphs.
This paper tackles the asymmetric kernel-based learning problem.
Experiments show that asymmetric KSVD learns features that outperform those from Mercer kernels.
arXiv Detail & Related papers (2023-06-12T11:39:34Z) - Stochastic parameter optimization analysis of dynamical quantum critical phenomena in long-range transverse-field Ising chain [0.0]
We explore the quantum phase transition of the one-dimensional long-range transverse-field Ising model.
In our simulations, the simulator automatically determines the parameters to sample from, even without prior knowledge of the critical point and universality class.
We successfully obtained numerical evidence supporting $\sigma = 7/4$ as the universality boundary between the latter two.
arXiv Detail & Related papers (2023-05-23T14:46:16Z) - Oracle-Preserving Latent Flows [58.720142291102135]
We develop a methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset.
The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function.
The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to transformations invariant with respect to high-dimensional oracles.
arXiv Detail & Related papers (2023-02-02T00:13:32Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used stochastic estimator based on moving averages (SEMA) for non-convex optimization problems.
We also present state-of-the-art results for a range of such problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Graph Gamma Process Generalized Linear Dynamical Systems [60.467040479276704]
We introduce graph gamma process (GGP) linear dynamical systems to model real multivariate time series.
For temporal pattern discovery, the latent representation under the model is used to decompose the time series into a parsimonious set of multivariate sub-sequences.
We use the generated random graph, whose number of nonzero-degree nodes is finite, to define both the sparsity pattern and dimension of the latent state transition matrix.
arXiv Detail & Related papers (2020-07-25T04:16:34Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - The Heavy-Tail Phenomenon in SGD [7.366405857677226]
We show that depending on the structure of the Hessian of the loss at the minimum, the SGD iterates will converge to a heavy-tailed stationary distribution.
We translate our results into insights about the behavior of SGD in deep learning.
arXiv Detail & Related papers (2020-06-08T16:43:56Z) - Doubly-Stochastic Normalization of the Gaussian Kernel is Robust to
Heteroskedastic Noise [3.5429774642987915]
We show that the doubly-stochastic normalization of the Gaussian kernel with zero main diagonal is robust to heteroskedastic noise.
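The doubly stochastic normalization referenced here, and which SNEkhorn's affinities benefit from, rescales a symmetric kernel $K$ by a diagonal matrix $D$ so that $DKD$ has unit row and column sums. A minimal sketch of the symmetric Sinkhorn fixed-point iteration, applied to a Gaussian kernel with zero main diagonal as in this paper's setting (function name and iteration count are our own choices):

```python
import numpy as np

def sinkhorn_symmetric(K, n_iter=5000):
    """Scale a symmetric nonnegative kernel K into a doubly stochastic matrix
    diag(d) @ K @ diag(d) via the damped symmetric Sinkhorn fixed point."""
    d = np.ones(K.shape[0])
    for _ in range(n_iter):
        # Damped update: at the fixed point, d_i * (K d)_i = 1 for every row,
        # i.e. every row (and by symmetry every column) of diag(d) K diag(d)
        # sums to one.
        d = np.sqrt(d / (K @ d))
    return K * np.outer(d, d)

# Gaussian kernel with zero main diagonal on a small random point cloud.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq)
np.fill_diagonal(K, 0.0)
W = sinkhorn_symmetric(K)
```

The square root in the update is the standard damping that keeps the symmetric iteration from oscillating between two alternating scalings.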
We provide examples of simulated and experimental single-cell RNA sequence data with intrinsic heteroskedasticity.
arXiv Detail & Related papers (2020-05-31T01:31:10Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.