Dimension Reduction with Locally Adjusted Graphs
- URL: http://arxiv.org/abs/2412.15426v1
- Date: Thu, 19 Dec 2024 22:21:39 GMT
- Title: Dimension Reduction with Locally Adjusted Graphs
- Authors: Yingfan Wang, Yiyang Sun, Haiyang Huang, Cynthia Rudin
- Abstract summary: LocalMAP is a dimensionality reduction algorithm that dynamically and locally adjusts the similarity graph to compensate for unreliable high-dimensional distances.
We demonstrate the benefits of LocalMAP through a case study on biological datasets.
- Score: 20.06041784058165
- License:
- Abstract: Dimension reduction (DR) algorithms have proven to be extremely useful for gaining insight into large-scale high-dimensional datasets, particularly for finding clusters in transcriptomic data. The initial phase of these DR methods often involves converting the original high-dimensional data into a graph, in which each edge represents the similarity or dissimilarity between a pair of data points. However, this graph is frequently suboptimal due to unreliable high-dimensional distances and the limited information extracted from the high-dimensional data, a problem that is exacerbated as the dataset size increases. If we reduce the size of the dataset by selecting points from a specific section of the embedding, the clusters observed through DR are more separable, since the extracted subgraphs are more reliable. In this paper, we introduce LocalMAP, a new dimensionality reduction algorithm that dynamically and locally adjusts the graph to address this challenge. By dynamically extracting subgraphs and updating the graph on the fly, LocalMAP is able to identify and separate real clusters within the data that other DR methods may overlook or combine. We demonstrate the benefits of LocalMAP through a case study on biological datasets, highlighting its utility in helping users more accurately identify clusters in real-world problems.
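The graph-construction step the abstract refers to can be illustrated with a short sketch. This is not the authors' LocalMAP implementation; it is a minimal example, assuming a standard k-nearest-neighbor similarity graph built with scikit-learn on toy data, plus a one-off extraction of a subgraph for a local subset of points (the subset chosen here is purely hypothetical).

```python
# Minimal sketch of the graph step common to graph-based DR methods
# (UMAP, PaCMAP, LocalMAP, ...). This is NOT the LocalMAP implementation;
# it only illustrates (1) building a kNN graph from high-dimensional data and
# (2) restricting the graph to a local subset of points, which the paper argues
# yields more reliable neighborhood information.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph

# Toy high-dimensional data with cluster structure.
X, _ = make_blobs(n_samples=1000, n_features=50, centers=5, random_state=0)

# Step 1: sparse kNN graph; entry (i, j) holds the distance from point i to
# one of its 15 nearest neighbors (zero elsewhere).
knn_graph = kneighbors_graph(X, n_neighbors=15, mode="distance", include_self=False)

# Step 2: pick a local subset of points (an arbitrary index range, purely
# illustrative) and take the induced subgraph. In LocalMAP this extraction and
# the graph updates happen dynamically during optimization; here it is done
# once, statically.
subset = np.arange(200)                      # hypothetical "local" subset
local_graph = knn_graph[subset][:, subset]   # induced subgraph on the subset

print(f"full graph: {knn_graph.shape}, nonzeros={knn_graph.nnz}")
print(f"local subgraph: {local_graph.shape}, nonzeros={local_graph.nnz}")
```

In an actual DR run the (sub)graph would then be handed to the embedding optimizer; the point of the sketch is only that an induced subgraph over a local region contains fewer of the long-range edges the abstract flags as unreliable.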
Related papers
- Noncommutative Model Selection for Data Clustering and Dimension Reduction Using Relative von Neumann Entropy [0.0]
We propose a pair of data-driven algorithms for unsupervised classification and dimension reduction.
In our experiments, our clustering algorithm outperforms $k$-means clustering on data sets with non-trivial geometry and topology.
arXiv Detail & Related papers (2024-11-29T18:04:11Z) - Locally Regularized Sparse Graph by Fast Proximal Gradient Descent [6.882546996728011]
We propose a novel regularized sparse graph method, abbreviated SRSG.
Sparse graphs have been shown to be effective in clustering high-dimensional data.
We show that SRSG is superior to other clustering methods.
arXiv Detail & Related papers (2024-09-25T16:57:47Z) - ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist graph anomaly detection (GAD) approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z) - Visual Cluster Separation Using High-Dimensional Sharpened Dimensionality Reduction [65.80631307271705]
High-Dimensional Sharpened DR (HD-SDR) is tested on both synthetic and real-world data sets.
Our method achieves good quality (measured by quality metrics) and scales computationally well with large high-dimensional data.
To illustrate its concrete applications, we further apply HD-SDR on a recent astronomical catalog.
arXiv Detail & Related papers (2021-10-01T11:13:51Z) - Measuring inter-cluster similarities with Alpha Shape TRIangulation in loCal Subspaces (ASTRICS) facilitates visualization and clustering of high-dimensional data [0.0]
Clustering and visualizing high-dimensional (HD) data are important tasks in a variety of fields.
Some of the most effective algorithms for clustering HD data are based on representing the data by nodes in a graph.
I propose a new method called ASTRICS to measure similarity between clusters of HD data points.
arXiv Detail & Related papers (2021-07-15T20:51:06Z) - Spatial-Spectral Clustering with Anchor Graph for Hyperspectral Image [88.60285937702304]
This paper proposes a novel unsupervised approach called spatial-spectral clustering with anchor graph (SSCAG) for HSI data clustering.
The proposed SSCAG is competitive against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-04-24T08:09:27Z) - A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points.
Experiments on data classification and clustering across eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis testing.
arXiv Detail & Related papers (2021-03-10T23:10:47Z) - Dimensionality Reduction via Diffusion Map Improved with Supervised Linear Projection [1.7513645771137178]
In this paper, we assume the data samples lie on a single underlying smooth manifold.
We define intra-class and inter-class similarities using pairwise local kernel distances.
We aim to find a linear projection to maximize the intra-class similarities and minimize the inter-class similarities simultaneously.
arXiv Detail & Related papers (2020-08-08T04:26:07Z)
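For the last entry above (diffusion map with a supervised linear projection), the stated goal of simultaneously maximizing intra-class and minimizing inter-class similarity can be written, in one plausible formalization, as the objective below. This is a hedged sketch in my own notation, not the paper's: $P$ is the sought linear projection, $\kappa$ a local kernel (e.g., Gaussian), $y_i$ the class labels, and $\lambda > 0$ a trade-off weight.

```latex
% Sketch only: one way to formalize "maximize intra-class similarity while
% minimizing inter-class similarity" under a linear projection P.
% The symbols (P, kappa, lambda, y_i) are illustrative, not taken from the paper.
\[
\max_{P}\;
  \sum_{i,j:\, y_i = y_j} \kappa\!\bigl(P^{\top} x_i,\; P^{\top} x_j\bigr)
  \;-\;
  \lambda \sum_{i,j:\, y_i \neq y_j} \kappa\!\bigl(P^{\top} x_i,\; P^{\top} x_j\bigr)
\]
```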
This list is automatically generated from the titles and abstracts of the papers on this site.