PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer
- URL: http://arxiv.org/abs/2512.03111v1
- Date: Tue, 02 Dec 2025 08:31:31 GMT
- Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer
- Authors: Xiaoshui Huang, Tianlin Zhu, Yifan Zuo, Xue Xia, Zonghan Wu, Jiebin Yan, Dingli Hua, Zongyi Xu, Yuming Fang, Jian Zhang,
- Abstract summary: We introduce PanFoMa, a lightweight hybrid neural network that combines the strengths of Transformers and state-space models. PanFoMa consists of a front-end local-context encoder with shared self-attention layers to capture complex, order-independent gene interactions. We also construct a large-scale pan-cancer single-cell benchmark, PanFoMaBench, containing over 3.5 million high-quality cells.
- Score: 54.958921946378304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-cell RNA sequencing (scRNA-seq) is essential for decoding tumor heterogeneity. However, pan-cancer research still faces two key challenges: learning discriminative and efficient single-cell representations, and establishing a comprehensive evaluation benchmark. In this paper, we introduce PanFoMa, a lightweight hybrid neural network that combines the strengths of Transformers and state-space models to achieve a balance between performance and efficiency. PanFoMa consists of a front-end local-context encoder with shared self-attention layers to capture complex, order-independent gene interactions; and a back-end global sequential feature decoder that efficiently integrates global context using a linear-time state-space model. This modular design preserves the expressive power of Transformers while leveraging the scalability of Mamba to enable transcriptome modeling, effectively capturing both local and global regulatory signals. To enable robust evaluation, we also construct a large-scale pan-cancer single-cell benchmark, PanFoMaBench, containing over 3.5 million high-quality cells across 33 cancer subtypes, curated through a rigorous preprocessing pipeline. Experimental results show that PanFoMa outperforms state-of-the-art models on our pan-cancer benchmark (+4.0%) and across multiple public tasks, including cell type annotation (+7.4%), batch integration (+4.0%) and multi-omics integration (+3.1%). The code is available at https://github.com/Xiaoshui-Huang/PanFoMa.
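The two-stage design described in the abstract, order-independent self-attention over gene tokens followed by a linear-time state-space scan for global context, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: learned Q/K/V projections, gating, and the selective Mamba mechanism are omitted, and the token and feature sizes are arbitrary toy values.

```python
import math
import random

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    # Front end: order-independent pairwise interactions among gene tokens.
    # Each token attends to every other via scaled dot products
    # (identity Q/K/V projections, for brevity).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens)) for j in range(d)])
    return out

def ssm_scan(tokens, decay=0.9):
    # Back end: linear-time recurrence h_t = decay * h_{t-1} + x_t,
    # the essence of a (much simplified, non-selective) state-space layer
    # that integrates global context in O(n) rather than O(n^2).
    d = len(tokens[0])
    h = [0.0] * d
    out = []
    for x in tokens:
        h = [decay * hi + xi for hi, xi in zip(h, x)]
        out.append(list(h))
    return out

random.seed(0)
n_genes, d = 16, 8  # toy sizes, not the paper's configuration
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_genes)]

local = self_attention(x)   # local gene-gene interactions
y = ssm_scan(local)         # global sequential integration
print(len(y), len(y[0]))    # 16 8
```

The point of the split is visible even in this toy form: the attention stage is quadratic in the number of tokens but permutation-aware, while the scan stage touches each token once, which is what makes the back end scale to long transcriptome sequences.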
Related papers
- PanopMamba: Vision State Space Modeling for Nuclei Panoptic Segmentation [20.689908446030856]
PanopMamba is a novel hybrid encoder-decoder architecture that integrates Mamba and Transformer. To the best of our knowledge, this is the first Mamba-based approach for panoptic segmentation. We introduce alternative evaluation metrics, including image-level Panoptic Quality ($i$PQ), boundary-weighted PQ ($w$PQ), and frequency-weighted PQ ($fw$PQ).
arXiv Detail & Related papers (2026-01-23T10:33:15Z) - VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation. VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts. VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration [53.683726781791385]
We introduce a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. Our method achieves excellent performance on benchmark datasets in terms of batch correction, modality alignment, and biological signal preservation.
arXiv Detail & Related papers (2025-10-28T21:28:39Z) - Multi-Scale Deep Learning for Colon Histopathology: A Hybrid Graph-Transformer Approach [0.0]
Colon cancer, also known as colorectal cancer, is one of the most malignant types of cancer worldwide. This research presents a hybrid multi-scale deep learning architecture that synergizes capsule networks, graph attention mechanisms, transformer modules, and residual learning to advance colon cancer classification.
arXiv Detail & Related papers (2025-09-02T21:40:57Z) - HybridTM: Combining Transformer and Mamba for 3D Semantic Segmentation [7.663855540620183]
We propose HybridTM, the first hybrid architecture that integrates Transformer and Mamba for 3D semantic segmentation. In addition, we propose the Inner Layer Hybrid Strategy, which combines attention and Mamba at a finer granularity. Our HybridTM achieves state-of-the-art performance on ScanNet, ScanNet200, and nuScenes benchmarks.
arXiv Detail & Related papers (2025-07-24T16:48:50Z) - Multi-branch of Attention Yields Accurate Results for Tabular Data [8.017123125747258]
We propose MAYA, an encoder-decoder transformer-based framework. In the encoder, we design a Multi-Branch of Attention (MBA) that constructs multiple parallel attention branches. We employ collaborative learning with a dynamic consistency weight constraint to produce more robust representations.
arXiv Detail & Related papers (2025-02-18T03:43:42Z) - ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands. We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z) - MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation [0.12499537119440242]
Skin cancer segmentation poses a significant challenge in medical image analysis.
MobileUNETR aims to overcome the performance constraints associated with both CNNs and Transformers.
MobileUNETR achieves superior performance with 3 million parameters and a computational complexity of 1.3 GFLOP.
arXiv Detail & Related papers (2024-09-04T20:23:37Z) - Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers are better at capturing global contextual information but may distort local image patterns due to their tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
arXiv Detail & Related papers (2021-07-01T17:58:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.