Understanding High Dimensional Spaces through Visual Means Employing
Multidimensional Projections
- URL: http://arxiv.org/abs/2207.10800v1
- Date: Tue, 12 Jul 2022 20:30:33 GMT
- Title: Understanding High Dimensional Spaces through Visual Means Employing
Multidimensional Projections
- Authors: Haseeb Younis, Paul Trust, Rosane Minghim
- Abstract summary: Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP).
These algorithms can be used to understand a range of mathematical functions and their impact on datasets.
We illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data visualisation helps in understanding data represented by multiple
variables, also called features, stored in a large matrix where individuals are
stored in rows and variable values in columns. These data structures are
frequently called multidimensional spaces. In this paper, we illustrate ways of
employing the visual results of multidimensional projection algorithms to
understand and fine-tune the parameters of their mathematical framework. Some
of the mathematical tools common to these approaches are Laplacian matrices,
Euclidean distance, cosine distance, and statistical methods such as
Kullback-Leibler divergence, employed to fit probability distributions and
reduce dimensions. Two of the relevant algorithms in the data visualisation
field are t-distributed stochastic neighbour embedding (t-SNE) and
Least-Square Projection (LSP). These algorithms can be used to understand a
range of mathematical functions and their impact on datasets. In
this article, mathematical parameters of underlying techniques such as
Principal Component Analysis (PCA) behind t-SNE and mesh reconstruction methods
behind LSP are adjusted to reflect the properties afforded by the mathematical
formulation. The results, supported by illustrations of the LSP and t-SNE
processes, are meant to help students understand the mathematics behind such
methods so that they can apply them effectively in data analysis tasks across
multiple applications.
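For readers who want to experiment, the following is a minimal scikit-learn sketch of the parameter exploration described above; the dataset, perplexity values, and PCA initialisation are illustrative assumptions rather than the authors' exact setup.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data  # 1797 samples x 64 features: a multidimensional space

# t-SNE fits neighbourhood probability distributions in the original and the
# projected space and minimises the Kullback-Leibler divergence between them.
# Perplexity (the effective neighbourhood size) and the PCA initialisation
# are two of the tunable mathematical parameters discussed in the paper.
for perplexity in (5, 30, 50):
    Y = TSNE(n_components=2, perplexity=perplexity,
             init='pca', random_state=0).fit_transform(X)
    # Inspect each 2D layout Y visually to see the parameter's effect.
```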
Related papers
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
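As a rough illustration of the Gromov-Wasserstein coupling at the heart of this framework, here is a minimal sketch using the POT library; the data and sizes are synthetic, and the paper's distributional reduction is a generalisation rather than this single call.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))  # source samples in R^10
Y = rng.normal(size=(30, 3))   # target samples in R^3

# Gromov-Wasserstein compares spaces through their internal distance
# matrices, so the two datasets may have different dimensionalities.
C1, C2 = ot.dist(X, X), ot.dist(Y, Y)
p, q = np.full(40, 1 / 40), np.full(30, 1 / 30)

T = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun='square_loss')
# T is a 40x30 coupling: a soft assignment of source points to target
# points, which the distributional-reduction view reads as DR + clustering.
```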
- Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework [0.0]
This paper explores the steps involved in working with multidimensional data sources, the various multiway analysis methods employed, and the benefits of these approaches.
A small example of Blind Source Separation (BSS) is presented comparing 2-dimensional algorithms and a multiway algorithm in Python.
Results indicate that multiway analysis is more expressive.
arXiv Detail & Related papers (2023-09-05T17:56:22Z)
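For orientation, a conventional two-dimensional (matrix-based) BSS baseline of the kind the survey compares against can be sketched with scikit-learn's FastICA; the signals are synthetic stand-ins, and the multiway methods surveyed would instead operate on the tensor structure directly.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]  # two independent sources
A = np.array([[1.0, 0.5], [0.5, 1.0]])            # unknown mixing matrix
X = S @ A.T                                       # observed mixed signals

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)  # recovered sources (up to order and scale)
```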
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
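A hedged sketch of the linearized optimal transport idea behind LOT Wassmap, using the POT library; the reference measure, sizes, and the final PCA step are illustrative choices, not the authors' algorithm.

```python
import numpy as np
import ot
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Each "data point" is itself a point cloud (an empirical measure in R^2).
clouds = [rng.normal(loc=rng.uniform(-5, 5, 2), size=(50, 2))
          for _ in range(20)]
ref = rng.normal(size=(50, 2))  # shared reference measure
w = np.full(50, 1 / 50)         # uniform weights

embeddings = []
for Xc in clouds:
    T = ot.emd(w, w, ot.dist(ref, Xc))  # optimal transport plan
    monge = (T @ Xc) / w[:, None]       # barycentric projection of the plan
    embeddings.append(monge.ravel())    # linearized (LOT) embedding

# One embedding per cloud replaces O(n^2) pairwise OT distance computations.
Y = PCA(n_components=2).fit_transform(np.array(embeddings))
```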
- Geometry of EM and related iterative algorithms [8.228889210180268]
The Expectation-Maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a methodology for statistical inference.
In this paper, we introduce the $em$ algorithm, an information geometric formulation of the EM algorithm, and its extensions and applications to various problems.
arXiv Detail & Related papers (2022-09-03T00:23:23Z)
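To make the EM recipe concrete, here is a minimal sketch for a two-component Gaussian mixture with known unit variances; this is the classical EM special case, not the information-geometric $em$ formulation developed in the paper.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.r_[rng.normal(-2, 1, 300), rng.normal(3, 1, 200)]  # 1D mixture data

pi, mu = 0.5, np.array([-1.0, 1.0])  # initial mixing weight and means
for _ in range(100):
    # E-step: posterior responsibility of component 0 for each point.
    p0 = pi * norm.pdf(x, mu[0], 1.0)
    p1 = (1 - pi) * norm.pdf(x, mu[1], 1.0)
    r = p0 / (p0 + p1)
    # M-step: re-estimate the parameters from the responsibilities.
    pi = r.mean()
    mu = np.array([np.average(x, weights=r), np.average(x, weights=1 - r)])
```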
- Laplacian-based Cluster-Contractive t-SNE for High Dimensional Data Visualization [20.43471678277403]
We propose LaptSNE, a new graph-based dimensionality reduction method based on t-SNE.
Specifically, LaptSNE leverages the eigenvalue information of the graph Laplacian to shrink the potential clusters in the low-dimensional embedding.
We show how to calculate the gradient analytically, which may be of broad interest when considering optimization with Laplacian-composited objective.
arXiv Detail & Related papers (2022-07-25T14:10:24Z)
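The Laplacian spectrum that LaptSNE leverages can be computed as below; this sketch shows only the eigenvalue computation with scipy and scikit-learn, not how LaptSNE folds it into the t-SNE objective.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import laplacian

rng = np.random.default_rng(0)
X = np.r_[rng.normal(0, 1, (100, 5)), rng.normal(6, 1, (100, 5))]

A = kneighbors_graph(X, n_neighbors=10, mode='connectivity')
A = 0.5 * (A + A.T)            # symmetrise the kNN graph
L = laplacian(A, normed=True)

# Small eigenvalues of the normalised Laplacian signal cluster structure:
# roughly one near-zero eigenvalue per well-separated cluster.
vals = np.linalg.eigvalsh(L.toarray())[:4]
```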
- CCP: Correlated Clustering and Projection for Dimensionality Reduction [5.992724190105578]
Correlated Clustering and Projection offers a novel data domain strategy that does not need to solve any matrix.
CCP partitions high-dimensional features into correlated clusters and then projects correlated features in each cluster into a one-dimensional representation.
Proposed methods are validated with benchmark datasets associated with various machine learning algorithms.
arXiv Detail & Related papers (2022-06-08T23:14:44Z)
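A simplified stand-in for the CCP pipeline: cluster features by correlation, then project each cluster to one dimension. The hierarchical clustering, cluster count, and per-cluster PCA are assumptions for illustration; CCP defines its own partitioning and projection.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))  # 300 samples, 40 features

# Group features whose absolute correlation is high (distance = 1 - |corr|).
corr = np.corrcoef(X, rowvar=False)
D = squareform(1 - np.abs(corr), checks=False)
labels = fcluster(linkage(D, method='average'), t=5, criterion='maxclust')

# Project each correlated feature cluster onto a single dimension.
reduced = np.column_stack([
    PCA(n_components=1).fit_transform(X[:, labels == c]).ravel()
    for c in np.unique(labels)
])  # 300 x 5 reduced representation
```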
- UnProjection: Leveraging Inverse-Projections for Visual Analytics of High-Dimensional Data [63.74032987144699]
We present NNInv, a deep learning technique with the ability to approximate the inverse of any projection or mapping.
NNInv learns to reconstruct high-dimensional data from any arbitrary point on a 2D projection space, giving users the ability to interact with the learned high-dimensional representation in a visual analytics system.
arXiv Detail & Related papers (2021-11-02T17:11:57Z)
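The inverse-projection idea can be imitated with a generic multilayer perceptron from scikit-learn; NNInv itself is a dedicated deep network, and the dataset, architecture, and t-SNE projection here are illustrative assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

X = load_digits().data  # 1797 x 64 data matrix
P = TSNE(n_components=2, random_state=0).fit_transform(X)

# Learn the inverse mapping: 2D projection coordinates -> 64D data space.
inv = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500,
                   random_state=0)
inv.fit(P, X)

# Any point picked on the 2D scatterplot can now be mapped back to a
# plausible high-dimensional datum for visual-analytics interaction.
x_hat = inv.predict(P[:1])
```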
- Learning Log-Determinant Divergences for Positive Definite Matrices [47.61701711840848]
In this paper, we propose to learn similarity measures in a data-driven manner.
We capitalize on the alpha-beta log-det divergence, which is a meta-divergence parametrized by the scalars alpha and beta.
Our key idea is to cast these parameters in a continuum and learn them from data.
arXiv Detail & Related papers (2021-04-13T19:09:43Z)
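For a feel of the divergence family involved, here is the symmetric Stein (S) divergence, a classical log-det divergence that the alpha-beta parametrisation generalises; the SPD test matrices are synthetic.

```python
import numpy as np

def stein_divergence(P, Q):
    # Symmetric Stein (S) divergence between SPD matrices:
    # logdet((P + Q) / 2) - 0.5 * (logdet(P) + logdet(Q)); zero iff P == Q.
    _, ld_mid = np.linalg.slogdet((P + Q) / 2)
    _, ld_p = np.linalg.slogdet(P)
    _, ld_q = np.linalg.slogdet(Q)
    return ld_mid - 0.5 * (ld_p + ld_q)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)); P = A @ A.T + 5 * np.eye(5)  # SPD matrix
B = rng.normal(size=(5, 5)); Q = B @ B.T + 5 * np.eye(5)  # SPD matrix
print(stein_divergence(P, Q))
```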
- Probabilistic Learning Vector Quantization on Manifold of Symmetric Positive Definite Matrices [3.727361969017079]
We develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization.
In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices.
Empirical investigations on synthetic data, image data, and motor imagery EEG data demonstrate the superior performance of the proposed method.
arXiv Detail & Related papers (2021-02-01T06:58:39Z)
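A sketch of the manifold geometry at play: the affine-invariant Riemannian distance between SPD matrices, used here for a toy nearest-prototype (LVQ-style) decision. The prototypes and sample are synthetic; the paper's method learns prototypes probabilistically rather than using this hard assignment.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power, logm

def airm_distance(P, Q):
    # Affine-invariant Riemannian distance: ||log(P^-1/2 Q P^-1/2)||_F.
    P_inv_sqrt = fractional_matrix_power(P, -0.5)
    return np.linalg.norm(logm(P_inv_sqrt @ Q @ P_inv_sqrt), 'fro')

rng = np.random.default_rng(0)
def spd(d):
    A = rng.normal(size=(d, d))
    return A @ A.T + d * np.eye(d)  # random SPD matrix

prototypes = [spd(4), spd(4)]       # one prototype per class
sample = spd(4)
label = int(np.argmin([airm_distance(sample, W) for W in prototypes]))
```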
- Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It overcomes the drawback of existing methods that seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step.
arXiv Detail & Related papers (2020-05-19T05:54:14Z)
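Classical semi-NMF (in the sense of Ding et al.), which TS-NMF extends to 2D data without vectorisation, can be sketched as follows; the data, rank, and iteration count are arbitrary, and this flattened formulation is exactly the preprocessing step TS-NMF is designed to avoid.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 200))  # mixed-sign data matrix
k = 4

# Semi-NMF: X ~ F @ G.T with G >= 0 and F unconstrained in sign.
G = np.abs(rng.normal(size=(200, k)))
pos = lambda M: (np.abs(M) + M) / 2  # elementwise positive part
neg = lambda M: (np.abs(M) - M) / 2  # elementwise negative part
for _ in range(200):
    F = X @ G @ np.linalg.pinv(G.T @ G)  # least-squares update for F
    XtF, FtF = X.T @ F, F.T @ F
    G *= np.sqrt((pos(XtF) + G @ neg(FtF)) /
                 (neg(XtF) + G @ pos(FtF) + 1e-12))  # multiplicative update
```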
- Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)
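As background for the graph reasoning step, the standard symmetrically normalised graph convolution can be written in a few lines of numpy; the paper proposes an improved Laplacian and applies the operation inside a spatial pyramid, which this toy layer does not capture.

```python
import numpy as np

rng = np.random.default_rng(0)
n, f_in, f_out = 6, 8, 4
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)              # undirected adjacency matrix
X = rng.normal(size=(n, f_in))      # node features
W = rng.normal(size=(f_in, f_out))  # learnable weights

# One graph-convolution layer: X' = relu(D^-1/2 (A + I) D^-1/2 X W),
# the Laplacian-based smoothing that graph reasoning builds on.
A_hat = A + np.eye(n)
d = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(d, d))
X_out = np.maximum(A_norm @ X @ W, 0.0)
```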
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.