Cluster and then Embed: A Modular Approach for Visualization
- URL: http://arxiv.org/abs/2509.03373v1
- Date: Wed, 27 Aug 2025 00:27:30 GMT
- Title: Cluster and then Embed: A Modular Approach for Visualization
- Authors: Elizabeth Coda, Ery Arias-Castro, Gal Mishne,
- Abstract summary: Dimensionality reduction methods such as t-SNE and UMAP are popular methods for visualizing data with a potential (latent) clustered structure.<n>We propose a more transparent, modular approach consisting of first clustering the data, then embedding each cluster, and finally aligning the clusters to obtain a global embedding.
- Score: 10.3849658049128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dimensionality reduction methods such as t-SNE and UMAP are popular methods for visualizing data with a potential (latent) clustered structure. They are known to group data points at the same time as they embed them, resulting in visualizations with well-separated clusters that preserve local information well. However, t-SNE and UMAP also tend to distort the global geometry of the underlying data. We propose a more transparent, modular approach consisting of first clustering the data, then embedding each cluster, and finally aligning the clusters to obtain a global embedding. We demonstrate this approach on several synthetic and real-world datasets and show that it is competitive with existing methods, while being much more transparent.
Related papers
- Wasserstein-Aligned Hyperbolic Multi-View Clustering [58.29261653100388]
This paper proposes a novel Wasserstein-Aligned Hyperbolic (WAH) framework for multi-view clustering.<n>Our method exploits a view-specific hyperbolic encoder for each view to embed features into the Lorentz manifold for hierarchical semantic modeling.
arXiv Detail & Related papers (2025-12-10T07:56:19Z) - Scalable Context-Preserving Model-Aware Deep Clustering for Hyperspectral Images [51.95768218975529]
Subspace clustering has become widely adopted for the unsupervised analysis of hyperspectral images (HSIs)<n>Recent model-aware deep subspace clustering methods often use a two-stage framework, involving the calculation of a self-representation matrix with complexity of O(n2), followed by spectral clustering.<n>We propose a scalable, context-preserving deep clustering method based on basis representation, which jointly captures local and non-local structures for efficient HSI clustering.
arXiv Detail & Related papers (2025-06-12T16:43:09Z) - Hierarchical clustering with maximum density paths and mixture models [44.443538161979056]
t-NEB is a probabilistically grounded hierarchical clustering method.<n>It yields state-of-the-art clustering performance on naturalistic high-dimensional data.
arXiv Detail & Related papers (2025-03-19T15:37:51Z) - ClusterGraph: a new tool for visualization and compression of multidimensional data [0.0]
This paper provides an additional layer on the output of any clustering algorithm.
It provides information about the global layout of clusters, obtained from the considered clustering algorithm.
arXiv Detail & Related papers (2024-11-08T09:40:54Z) - AugDMC: Data Augmentation Guided Deep Multiple Clustering [2.479720095773358]
AugDMC is a novel data Augmentation guided Deep Multiple Clustering method.
It exploits data augmentations to automatically extract features related to a certain aspect of the data.
A stable optimization strategy is proposed to alleviate the unstable problem from different augmentations.
arXiv Detail & Related papers (2023-06-22T16:31:46Z) - A Generalized Framework for Predictive Clustering and Optimization [18.06697544912383]
Clustering is a powerful and extensively used data science tool.
In this article, we define a generalized optimization framework for predictive clustering.
We also present a joint optimization strategy that exploits mixed-integer linear programming (MILP) for global optimization.
arXiv Detail & Related papers (2023-05-07T19:56:51Z) - Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering based on semi-non-negative tensor factorization (Semi-NTF)
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z) - Unified Multi-View Orthonormal Non-Negative Graph Based Clustering
Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z) - Deep Clustering: A Comprehensive Survey [53.387957674512585]
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z) - Fine-grained Graph Learning for Multi-view Subspace Clustering [2.4094285826152593]
We propose a fine-grained graph learning framework for multi-view subspace clustering (FGL-MSC)
The main challenge is how to optimize the fine-grained fusion weights while generating the learned graph that fits the clustering task.
Experiments on eight real-world datasets show that the proposed framework has comparable performance to the state-of-the-art methods.
arXiv Detail & Related papers (2022-01-12T18:00:29Z) - Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data.
We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - Multi-view Subspace Clustering Networks with Local and Global Graph
Information [19.64977233324484]
The goal of this study is to explore the underlying grouping structure of data collected from different fields or measurements.
We propose the novel multi-view subspace clustering networks with local and global graph information, termed MSCNLG.
As an end-to-end trainable framework, the proposed method fully investigates the valuable information of multiple views.
arXiv Detail & Related papers (2020-10-19T09:00:19Z) - Unsupervised Multi-view Clustering by Squeezing Hybrid Knowledge from
Cross View and Each View [68.88732535086338]
This paper proposes a new multi-view clustering method, low-rank subspace multi-view clustering based on adaptive graph regularization.
Experimental results for five widely used multi-view benchmarks show that our proposed algorithm surpasses other state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2020-08-23T08:25:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.