Theoretical Foundations of t-SNE for Visualizing High-Dimensional
Clustered Data
- URL: http://arxiv.org/abs/2105.07536v2
- Date: Tue, 18 May 2021 01:04:30 GMT
- Title: Theoretical Foundations of t-SNE for Visualizing High-Dimensional
Clustered Data
- Authors: T. Tony Cai and Rong Ma
- Abstract summary: This study investigates the theoretical foundations of t-distributed dimension neighbor embedding (t-SNE)
A novel theoretical framework for the analysis of t-SNE based on gradient descent approach is presented.
The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data.
- Score: 1.9199742103141069
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This study investigates the theoretical foundations of t-distributed
stochastic neighbor embedding (t-SNE), a popular nonlinear dimension reduction
and data visualization method. A novel theoretical framework for the analysis
of t-SNE based on the gradient descent approach is presented. For the early
exaggeration stage of t-SNE, we show its asymptotic equivalence to a power
iteration based on the underlying graph Laplacian, characterize its limiting
behavior, and uncover its deep connection to Laplacian spectral clustering, and
fundamental principles including early stopping as implicit regularization. The
results explain the intrinsic mechanism and the empirical benefits of such a
computational strategy. For the embedding stage of t-SNE, we characterize the
kinematics of the low-dimensional map throughout the iterations, and identify
an amplification phase, featuring the intercluster repulsion and the expansive
behavior of the low-dimensional map. The general theory explains the fast
convergence rate and the exceptional empirical performance of t-SNE for
visualizing clustered data, brings forth the interpretations of the t-SNE
output, and provides theoretical guidance for selecting tuning parameters in
various applications.
Related papers
- Hyperspectral Image Spectral-Spatial Feature Extraction via Tensor Principal Component Analysis [41.71615165526371]
This paper addresses the challenge of spectral-spatial feature extraction for hyperspectral image classification.
The proposed approach incorporates circular convolution into a tensor structure to effectively capture and integrate both spectral and spatial information.
Building upon this framework, the traditional Principal Component Analysis (PCA) technique is extended to its tensor-based counterpart, referred to as Principal Component Analysis (TPCA)
arXiv Detail & Related papers (2024-12-08T21:52:09Z) - Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks [12.001676605529626]
We explore the approximation theory of functions defined on graphs.
We establish a framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs)
We show how the high-frequency energy of the output decays, an indicator of over-smoothing, in GCNs.
arXiv Detail & Related papers (2024-07-01T13:35:53Z) - On the Generalization Capability of Temporal Graph Learning Algorithms:
Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications.
This paper investigates the generalization ability of different TGL algorithms.
We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - PAC-Chernoff Bounds: Understanding Generalization in the Interpolation Regime [6.645111950779666]
This paper introduces a distribution-dependent PAC-Chernoff bound that exhibits perfect tightness for interpolators.
We present a unified theoretical framework revealing why certain interpolators show an exceptional generalization, while others falter.
arXiv Detail & Related papers (2023-06-19T14:07:10Z) - Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering based on semi-non-negative tensor factorization (Semi-NTF)
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z) - Generalization Guarantee of Training Graph Convolutional Networks with
Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graphstructured data.
To address its scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training Gs.
This paper provides first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z) - Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z) - Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation [105.33409035876691]
This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling.
We design a novel structured tensor low-rank norm tailored to MVSC.
We show that the proposed method outperforms state-of-the-art methods to a significant extent.
arXiv Detail & Related papers (2020-04-30T11:52:12Z) - An Optimal Statistical and Computational Framework for Generalized
Tensor Estimation [10.899518267165666]
This paper describes a flexible framework for low-rank tensor estimation problems.
It includes many important instances from applications in computational imaging, genomics, and network analysis.
arXiv Detail & Related papers (2020-02-26T01:54:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.