Sparser2Sparse: Single-shot Sparser-to-Sparse Learning for Spatial Transcriptomics Imputation with Natural Image Co-learning
- URL: http://arxiv.org/abs/2507.16886v1
- Date: Tue, 22 Jul 2025 17:58:38 GMT
- Title: Sparser2Sparse: Single-shot Sparser-to-Sparse Learning for Spatial Transcriptomics Imputation with Natural Image Co-learning
- Authors: Yaoyu Fang, Jiahe Qian, Xinkun Wang, Lee A. Cooper, Bo Zhou
- Abstract summary: We present Single-shot Sparser-to-Sparse (S2S-ST), a novel framework for accurate spatial transcriptomics imputation. Our framework requires only a single, low-cost sparsely sampled ST dataset alongside widely available natural images for co-training.
- Score: 1.7603474309877931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatial transcriptomics (ST) has revolutionized biomedical research by enabling high-resolution gene expression profiling within tissues. However, the high cost and scarcity of high-resolution ST data remain significant challenges. We present Single-shot Sparser-to-Sparse (S2S-ST), a novel framework for accurate ST imputation that requires only a single, low-cost sparsely sampled ST dataset alongside widely available natural images for co-training. Our approach integrates three key innovations: (1) a sparser-to-sparse self-supervised learning strategy that leverages intrinsic spatial patterns in ST data, (2) cross-domain co-learning with natural images to enhance feature representation, and (3) a Cascaded Data Consistent Imputation Network (CDCIN) that iteratively refines predictions while preserving the fidelity of sampled gene data. Extensive experiments on diverse tissue types, including breast cancer, liver, and lymphoid tissue, demonstrate that our method outperforms state-of-the-art approaches in imputation accuracy. By enabling robust ST reconstruction from sparse inputs, our framework significantly reduces reliance on costly high-resolution data, facilitating potential broader adoption in biomedical research and clinical applications.
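The sparser-to-sparse idea in innovations (1) and (3) can be illustrated with a minimal numpy sketch: further subsample an already-sparse grid, supervise only on the held-out measured spots, and re-impose the sampled values after each refinement step. This is an illustrative toy, not the authors' CDCIN; the neighbour-averaging "network", grid size, and sampling rates are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one gene's expression on a 16x16 spot lattice.
# In real ST data only a subset of spots is measured ("sparse").
H = W = 16
dense = rng.random((H, W)).astype(np.float32)   # unknown in practice
sparse_mask = rng.random((H, W)) < 0.5          # measured spots
sparse = np.where(sparse_mask, dense, 0.0)

# Sparser-to-sparse self-supervision: drop a further subset of the
# measured spots; the held-out measured spots become training targets,
# so no dense ground truth is ever required.
keep = rng.random((H, W)) < 0.7
sparser_mask = sparse_mask & keep
target_mask = sparse_mask & ~sparser_mask

def impute(x, mask, iters=50):
    """Stand-in for the imputation network: iterative 4-neighbour
    averaging with a data-consistency step that re-imposes the
    sampled values after every refinement."""
    out = x.copy()
    for _ in range(iters):
        pad = np.pad(out, 1, mode="edge")
        avg = (pad[:-2, 1:-1] + pad[2:, 1:-1] +
               pad[1:-1, :-2] + pad[1:-1, 2:]) / 4.0
        out = np.where(mask, x, avg)  # keep measured spots fixed
    return out

pred = impute(np.where(sparser_mask, sparse, 0.0), sparser_mask)

# Self-supervised loss: error only on measured-but-held-out spots.
loss = float(np.mean((pred[target_mask] - sparse[target_mask]) ** 2))
print(f"held-out MSE: {loss:.4f}")
```

In the paper this smoother would be a learned cascaded network co-trained with natural images; the data-consistency step (`np.where(mask, x, avg)`) is the part that preserves sampled gene data fidelity.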
Related papers
- CoCoLIT: ControlNet-Conditioned Latent Image Translation for MRI to Amyloid PET Synthesis [2.333160549379721]
High dimensionality and structural complexity of 3D neuroimaging data pose challenges for MRI-to-PET translation. We present CoCoLIT, a diffusion-based latent generative framework that incorporates three main innovations. We evaluate CoCoLIT's performance on publicly available datasets and find that our model significantly outperforms state-of-the-art methods on both image-based and amyloid-related metrics.
arXiv Detail & Related papers (2025-08-02T09:58:30Z) - SPADE: Spatial Transcriptomics and Pathology Alignment Using a Mixture of Data Experts for an Expressive Latent Space [8.576634724266228]
We introduce SPADE, a foundation model that integrates histopathology with spatial transcriptomics data to guide image representation learning. SPADE is evaluated on 14 downstream tasks, demonstrating significantly superior few-shot performance compared to baseline models.
arXiv Detail & Related papers (2025-06-27T02:20:51Z) - SemanticST: Spatially Informed Semantic Graph Learning for Clustering, Integration, and Scalable Analysis of Spatial Transcriptomics [3.1403380447856426]
We present SemanticST, a graph-based deep learning framework for spatial transcriptomics analysis. It supports mini-batch training, making it the first graph neural network scalable to large-scale datasets such as Xenium (500,000 cells).
arXiv Detail & Related papers (2025-06-13T06:30:48Z) - Improving Heart Rejection Detection in XPCI Images Using Synthetic Data Augmentation [0.0]
StyleGAN was trained on available 3R biopsy patches and subsequently used to generate 10,000 realistic synthetic images. These were combined with real 0R (rejection-free) samples in various configurations to train ResNet-18 classifiers for binary rejection classification. Results demonstrate that synthetic data improves classification performance, particularly when used in combination with real samples.
arXiv Detail & Related papers (2025-05-26T09:26:36Z) - PH2ST:ST-Prompt Guided Histological Hypergraph Learning for Spatial Gene Expression Prediction [9.420121324844066]
We propose PH2ST, a prompt-guided hypergraph learning framework, to guide multi-scale histological representation learning for spatial gene expression prediction. PH2ST not only outperforms existing state-of-the-art methods, but also shows strong potential for practical applications such as imputing missing spots, ST super-resolution, and local-to-global prediction.
arXiv Detail & Related papers (2025-03-21T03:10:43Z) - HCS-TNAS: Hybrid Constraint-driven Semi-supervised Transformer-NAS for Ultrasound Image Segmentation [0.34089646689382486]
We introduce a hybrid constraint-driven semi-supervised Transformer-NAS (HCS-TNAS) for ultrasound segmentation.
HCS-TNAS includes an Efficient NAS-ViT module for multi-scale token search before ViT's attention calculation, effectively capturing contextual and local information with lower computational costs.
Experiments on public datasets show that HCS-TNAS achieves state-of-the-art performance, pushing the limit of ultrasound segmentation.
arXiv Detail & Related papers (2024-07-05T01:02:12Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - Leveraging Neural Radiance Fields for Uncertainty-Aware Visual Localization [56.95046107046027]
We propose to leverage Neural Radiance Fields (NeRF) to generate training samples for scene coordinate regression.
Despite NeRF's efficiency in rendering, many of the rendered data are polluted by artifacts or only contain minimal information gain.
arXiv Detail & Related papers (2023-10-10T20:11:13Z) - 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model outperforms domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation respectively, and achieves similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z) - Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction at the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.