Related papers: Benchmarking Dimensionality Reduction Techniques for Spatial Transcriptomics

Benchmarking Dimensionality Reduction Techniques for Spatial Transcriptomics

URL: http://arxiv.org/abs/2509.13344v1
Date: Fri, 12 Sep 2025 17:27:34 GMT
Title: Benchmarking Dimensionality Reduction Techniques for Spatial Transcriptomics
Authors: Md Ishtyaq Mahmud, Veena Kochat, Suresh Satpati, Jagan Mohan Reddy Dwarampudi, Kunal Rai, Tania Banerjee,
Abstract summary: We introduce a unified framework for evaluating dimensionality reduction techniques in spatial transcriptomics.<n>We benchmark six methods PCA, NMF, autoencoder, VAE, and two hybrid embeddings on a cholangiocarcinoma Xenium dataset.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a unified framework for evaluating dimensionality reduction techniques in spatial transcriptomics beyond standard PCA approaches. We benchmark six methods PCA, NMF, autoencoder, VAE, and two hybrid embeddings on a cholangiocarcinoma Xenium dataset, systematically varying latent dimensions ($k$=5-40) and clustering resolutions ($\rho$=0.1-1.2). Each configuration is evaluated using complementary metrics including reconstruction error, explained variance, cluster cohesion, and two novel biologically-motivated measures: Cluster Marker Coherence (CMC) and Marker Exclusion Rate (MER). Our results demonstrate distinct performance profiles: PCA provides a fast baseline, NMF maximizes marker enrichment, VAE balances reconstruction and interpretability, while autoencoders occupy a middle ground. We provide systematic hyperparameter selection using Pareto optimal analysis and demonstrate how MER-guided reassignment improves biological fidelity across all methods, with CMC scores improving by up to 12\% on average. This framework enables principled selection of dimensionality reduction methods tailored to specific spatial transcriptomics analyses.

Related papers

MS-ISSM: Objective Quality Assessment of Point Clouds Using Multi-scale Implicit Structural Similarity [65.85858856481131]
unstructured and irregular nature of point clouds poses a significant challenge for objective quality assessment (PCQA)<n>We propose the Multi-scale Implicit Structural Similarity Measurement (MS-ISSM)
arXiv Detail & Related papers (2026-01-03T14:58:52Z)
Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage [52.914168158222765]
We detail a comprehensive Bayesian framework for learning DBNs.<n>We give a novel Markov chain Monte Carlo (MCMC) algorithm utilizing parallel Langevin proposals to generate exact posterior samples.<n>We apply our methodology to uncover prognostic network structure from primary breast cancer samples.
arXiv Detail & Related papers (2025-09-16T17:24:35Z)
An Interpretable Ensemble Framework for Multi-Omics Dementia Biomarker Discovery Under HDLSS Conditions [0.0]
We propose a novel ensemble approach combining Graph Attention Networks (GAT), MultiOmics Variational AutoEncoder (MOVE), Elastic-net sparse regression, and Storey's False Discovery Rate (FDR)<n>We evaluate performance using both simulated multi-omics data and the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset.<n>Our method demonstrates superior predictive accuracy, feature selection precision, and biological relevance.
arXiv Detail & Related papers (2025-09-04T15:20:13Z)
Parameter-free entropy-regularized multi-view clustering with hierarchical feature selection [3.8015092217142237]
This work introduces two complementary algorithms: AMVFCM-U and AAMVFCM-U, providing a unified parameter-free framework.<n>AAMVFCM-U achieves up to 97% computational efficiency gains, reduces dimensionality to 0.45% of original size, and automatically identifies critical view combinations.
arXiv Detail & Related papers (2025-08-07T15:36:59Z)
M-learner:A Flexible And Powerful Framework To Study Heterogeneous Treatment Effect In Mediation Model [11.977166290154125]
We propose a novel method, termed the M-learner, for estimating heterogeneous indirect and total treatment effects.<n>To the best of our knowledge, this is the first approach specifically designed to capture treatment effect heterogeneity in the presence of mediation.
arXiv Detail & Related papers (2025-05-23T13:57:23Z)
Enhanced High-Dimensional Data Visualization through Adaptive Multi-Scale Manifold Embedding [0.7705234721762716]
We propose an Adaptive Multi-Scale Manifold Embedding (AMSME) algorithm.<n>By introducing ordinal distance, we demonstrate that ordinal distance overcomes the constraints of the curse of dimensionality in high-dimensional spaces.<n> Experimental results demonstrate that AMSME significantly preserves intra-cluster topological structures and improves inter-cluster separation on real-world datasets.
arXiv Detail & Related papers (2025-03-18T06:46:53Z)
Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model. Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
Unsupervised machine learning framework for discriminating major variants of concern during COVID-19 [1.5346017713894948]
The COVID-19 pandemic evolved rapidly due to the high mutation rate of the virus. Certain variants of the virus, such as Delta and Omicron, emerged with altered viral properties leading to severe transmission and death rates. Unsupervised machine learning methods have the ability to compress, characterize, and visualize unlabelled data.
arXiv Detail & Related papers (2022-08-01T13:02:28Z)
Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations. Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity. We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class. We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z)
Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network, for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E) We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC)
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
Exact and Approximation Algorithms for Sparse PCA [1.7640556247739623]
This paper proposes two exact mixed-integer SDPs (MISDPs) We then analyze the theoretical optimality gaps of their continuous relaxation values and prove that they are stronger than that of the state-of-art one. Since off-the-shelf solvers, in general, have difficulty in solving MISDPs, we approximate SPCA with arbitrary accuracy by a mixed-integer linear program (MILP) of a similar size as MISDPs.
arXiv Detail & Related papers (2020-08-28T02:07:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.