Related papers: Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data

Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data

URL: http://arxiv.org/abs/2105.07536v2
Date: Tue, 18 May 2021 01:04:30 GMT
Title: Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data
Authors: T. Tony Cai and Rong Ma
Abstract summary: This study investigates the theoretical foundations of t-distributed dimension neighbor embedding (t-SNE) A novel theoretical framework for the analysis of t-SNE based on gradient descent approach is presented. The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data.
Score: 1.9199742103141069
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This study investigates the theoretical foundations of t-distributed stochastic neighbor embedding (t-SNE), a popular nonlinear dimension reduction and data visualization method. A novel theoretical framework for the analysis of t-SNE based on the gradient descent approach is presented. For the early exaggeration stage of t-SNE, we show its asymptotic equivalence to a power iteration based on the underlying graph Laplacian, characterize its limiting behavior, and uncover its deep connection to Laplacian spectral clustering, and fundamental principles including early stopping as implicit regularization. The results explain the intrinsic mechanism and the empirical benefits of such a computational strategy. For the embedding stage of t-SNE, we characterize the kinematics of the low-dimensional map throughout the iterations, and identify an amplification phase, featuring the intercluster repulsion and the expansive behavior of the low-dimensional map. The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data, brings forth the interpretations of the t-SNE output, and provides theoretical guidance for selecting tuning parameters in various applications.

Related papers

Low-Rank Implicit Neural Representation via Schatten-p Quasi-Norm and Jacobian Regularization [49.158601255093416]
We propose a CP-based low-rank tensor function parameterized by neural networks for implicit neural representation.<n>For smoothness, we propose a regularization term based on the spectral norm of the Jacobian and Hutchinson's trace estimator.<n>Our proposed smoothness regularization is SVD-free and avoids explicit chain rule derivations.
arXiv Detail & Related papers (2025-06-27T11:23:10Z)
Hyperspectral Image Spectral-Spatial Feature Extraction via Tensor Principal Component Analysis [41.71615165526371]
This paper addresses the challenge of spectral-spatial feature extraction for hyperspectral image classification. The proposed approach incorporates circular convolution into a tensor structure to effectively capture and integrate both spectral and spatial information. Building upon this framework, the traditional Principal Component Analysis (PCA) technique is extended to its tensor-based counterpart, referred to as Principal Component Analysis (TPCA)
arXiv Detail & Related papers (2024-12-08T21:52:09Z)
Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks [12.001676605529626]
We explore the approximation theory of functions defined on graphs. We establish a framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs) We show how the high-frequency energy of the output decays, an indicator of over-smoothing, in GCNs.
arXiv Detail & Related papers (2024-07-01T13:35:53Z)
Spectral Clustering for Directed Graphs via Likelihood Estimation on Stochastic Block Models [22.421702511126373]
We leverage statistical inference on block models to guide the development of a spectral clustering algorithm for directed graphs.<n>We establish a theoretical upper bound on the misclustering error of its spectral relaxation, and based on this relaxation, introduce a novel, self-adaptive spectral clustering method for directed graphs.
arXiv Detail & Related papers (2024-03-28T15:47:13Z)
On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications. This paper investigates the generalization ability of different TGL algorithms. We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z)
Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining em textbfgeneralized propagation with directed and weighted graphs. We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
PAC-Chernoff Bounds: Understanding Generalization in the Interpolation Regime [6.645111950779666]
This paper introduces a distribution-dependent PAC-Chernoff bound that exhibits perfect tightness for interpolators. We present a unified theoretical framework revealing why certain interpolators show an exceptional generalization, while others falter.
arXiv Detail & Related papers (2023-06-19T14:07:10Z)
Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering based on semi-non-negative tensor factorization (Semi-NTF) Our model directly considers the between-view relationship and exploits the between-view complementary information. In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graphstructured data. To address its scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training Gs. This paper provides first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals. We propose a general graph estimator based on a novel structured fusion regularization. We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation [105.33409035876691]
This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling. We design a novel structured tensor low-rank norm tailored to MVSC. We show that the proposed method outperforms state-of-the-art methods to a significant extent.
arXiv Detail & Related papers (2020-04-30T11:52:12Z)
An Optimal Statistical and Computational Framework for Generalized Tensor Estimation [10.899518267165666]
This paper describes a flexible framework for low-rank tensor estimation problems. It includes many important instances from applications in computational imaging, genomics, and network analysis.
arXiv Detail & Related papers (2020-02-26T01:54:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.