Related papers: Enhanced High-Dimensional Data Visualization through Adaptive Multi-Scale Manifold Embedding

URL: http://arxiv.org/abs/2503.13954v2
Date: Wed, 19 Mar 2025 05:21:06 GMT
Title: Enhanced High-Dimensional Data Visualization through Adaptive Multi-Scale Manifold Embedding
Authors: Tianhao Ni, Bingjie Li, Zhigang Yao
Abstract summary: We propose an Adaptive Multi-Scale Manifold Embedding (AMSME) algorithm. By introducing ordinal distance, we demonstrate that ordinal distance overcomes the constraints of the curse of dimensionality in high-dimensional spaces. Experimental results demonstrate that AMSME significantly preserves intra-cluster topological structures and improves inter-cluster separation on real-world datasets.
Score: 0.7705234721762716
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To address the dual challenges of the curse of dimensionality and the difficulty in separating intra-cluster and inter-cluster structures in high-dimensional manifold embedding, we propose an Adaptive Multi-Scale Manifold Embedding (AMSME) algorithm. By introducing ordinal distance to replace traditional Euclidean distances, we theoretically demonstrate that ordinal distance overcomes the constraints of the curse of dimensionality in high-dimensional spaces, effectively distinguishing heterogeneous samples. We design an adaptive neighborhood adjustment method to construct similarity graphs that simultaneously balance intra-cluster compactness and inter-cluster separability. Furthermore, we develop a two-stage embedding framework: the first stage achieves preliminary cluster separation while preserving connectivity between structurally similar clusters via the similarity graph, and the second stage enhances inter-cluster separation through a label-driven distance reweighting. Experimental results demonstrate that AMSME significantly preserves intra-cluster topological structures and improves inter-cluster separation on real-world datasets. Additionally, leveraging its multi-resolution analysis capability, AMSME discovers novel neuronal subtypes in the mouse lumbar dorsal root ganglion scRNA-seq dataset, with marker gene analysis revealing their distinct biological roles.
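Illustrative sketch (not from the paper): the abstract does not define ordinal distance, so the snippet below assumes one common rank-based reading, in which the dissimilarity from point i to point j is the position of j in i's list of neighbors sorted by Euclidean distance. The function names `ordinal_distance` and `symmetrized_ordinal_distance` are hypothetical, and the actual AMSME definition may differ.

```python
import numpy as np


def ordinal_distance(X):
    """Rank-based dissimilarity (assumed definition, not taken from the paper).

    Entry (i, j) is the rank of point j when all points are sorted by
    Euclidean distance from point i. With distinct points, i itself has rank 0.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # order[i, k] is the index of the k-th closest point to i.
    order = np.argsort(d, axis=1, kind="stable")
    # Invert the permutation row by row: ranks[i, order[i, k]] = k.
    ranks = np.empty_like(order)
    rows = np.arange(n)[:, None]
    ranks[rows, order] = np.arange(n)[None, :]
    return ranks


def symmetrized_ordinal_distance(X):
    """Average of the two directed ranks, since ranks are not symmetric."""
    r = ordinal_distance(X)
    return (r + r.T) / 2.0


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
    print(ordinal_distance(pts))
    print(symmetrized_ordinal_distance(pts))
```

Because ranks depend only on the ordering of distances, this quantity is unchanged when all distances are rescaled by a monotone transformation, which is one intuition for why rank-based measures are less affected by distance concentration in high dimensions.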

Related papers

Controllable diffusion-based generation for multi-channel biological data [66.44042377817074]
This work proposes a unified diffusion framework for controllable generation over structured and spatial biological data. We show state-of-the-art performance across both spatial and non-spatial prediction tasks, including protein imputation in IMC and gene-to-protein prediction in single-cell datasets.
arXiv Detail & Related papers (2025-06-24T00:56:21Z)
Scalable Robust Bayesian Co-Clustering with Compositional ELBOs [2.6756996523251964]
Co-clustering exploits the duality of instances and features to uncover meaningful groups in both dimensions. We present the first fully variational Co-clustering framework that directly learns row and column clusters in the latent space. Our method not only preserves the advantages of prior Co-clustering approaches but also exceeds them in accuracy and robustness.
arXiv Detail & Related papers (2025-04-05T06:48:05Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters. Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level. Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF). Our model directly considers the between-view relationship and exploits the between-view complementary information. In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
Subspace-Contrastive Multi-View Clustering [0.0]
We propose a novel Subspace-Contrastive Multi-View Clustering (SCMC) approach. We employ view-specific auto-encoders to map the original multi-view data into compact features perceiving its nonlinear structures. To demonstrate the effectiveness of the proposed model, we conduct a large number of comparative experiments on eight challenge datasets.
arXiv Detail & Related papers (2022-10-13T07:19:37Z)
Mixed Graph Contrastive Network for Semi-Supervised Node Classification [63.924129159538076]
We propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN). In our method, we improve the discriminative capability of the latent embeddings by an unperturbed augmentation strategy and a correlation reduction mechanism. By combining the two settings, we extract rich supervision information from both the abundant nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
Improved Dual Correlation Reduction Network [40.792587861237166]
We propose a novel deep graph clustering algorithm termed Improved Dual Correlation Reduction Network (IDCRN). By approximating the cross-view feature correlation matrix to an identity matrix, we reduce the redundancy between different dimensions of features. We also avoid the collapsed representation caused by the over-smoothing issue in Graph Convolutional Networks (GCNs) through an introduced propagation regularization term.
arXiv Detail & Related papers (2022-02-25T07:48:32Z)
Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains. We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
Neural Distance Embeddings for Biological Sequences [43.07977514121458]
We present NeuroSEED, a framework to embed sequences in geometric vector spaces. We show the effectiveness of the hyperbolic space that captures the hierarchical structure and provides an average 22% reduction in embedding RMSE. The proposed approaches display significant accuracy and/or runtime improvements on real-world datasets.
arXiv Detail & Related papers (2021-09-20T17:30:58Z)
LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset. Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation. We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.