Multimodal Slice Interaction Network Enhanced by Transfer Learning for Precise Segmentation of Internal Gross Tumor Volume in Lung Cancer PET/CT Imaging
- URL: http://arxiv.org/abs/2509.22841v1
- Date: Fri, 26 Sep 2025 18:48:08 GMT
- Title: Multimodal Slice Interaction Network Enhanced by Transfer Learning for Precise Segmentation of Internal Gross Tumor Volume in Lung Cancer PET/CT Imaging
- Authors: Yi Luo, Yike Guo, Hamed Hooshangnejad, Rui Zhang, Xue Feng, Quan Chen, Wil Ngwa, Kai Ding,
- Abstract summary: Delineating the internal gross tumor volume (IGTV) in PET/CT imaging is pivotal for optimal radiation therapy of lung cancer. We present a transfer learning-based methodology utilizing a multimodal interactive perception network with Mamba. We introduce a slice interaction module (SIM) within a 2.5D segmentation framework to effectively model inter-slice relationships.
- Score: 34.37798183254656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lung cancer remains the leading cause of cancer-related deaths globally. Accurate delineation of the internal gross tumor volume (IGTV) in PET/CT imaging, which accounts for tumor motion in mobile tumors such as lung cancer, is pivotal for optimal radiation therapy, yet is hindered by the limited availability of annotated IGTV datasets and attenuated PET signal intensity at tumor boundaries. In this study, we present a transfer learning-based methodology utilizing a multimodal interactive perception network with Mamba, pre-trained on extensive gross tumor volume (GTV) datasets and subsequently fine-tuned on a private IGTV cohort. This cohort constitutes the PET/CT subset of the Lung-cancer Unified Cross-modal Imaging Dataset (LUCID). To further address the challenge of weak PET intensities in IGTV peripheral slices, we introduce a slice interaction module (SIM) within a 2.5D segmentation framework to effectively model inter-slice relationships. The proposed module integrates channel and spatial attention branches with depthwise convolutions, enabling more robust learning of slice-to-slice dependencies and thereby improving overall segmentation performance. A comprehensive experimental evaluation demonstrates that our approach achieves a Dice score of 0.609 on the private IGTV dataset, substantially surpassing the conventional baseline score of 0.385. This work highlights the potential of transfer learning, coupled with advanced multimodal techniques and a SIM, to enhance the reliability and clinical relevance of IGTV segmentation for lung cancer radiation therapy planning.
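The abstract describes the SIM as channel and spatial attention over a 2.5D stack of slices. The combination can be illustrated with a minimal NumPy sketch; the function name, gating forms, and shapes below are assumptions for illustration, not the authors' code (the actual module also uses depthwise convolutions, omitted here):

```python
import numpy as np

def slice_interaction(stack):
    """Toy sketch of a slice interaction module for a 2.5D input.

    `stack` has shape (S, H, W): S adjacent slices treated as channels.
    Channel attention reweights whole slices by their global content;
    spatial attention reweights pixel locations shared across slices.
    """
    # Channel (slice) attention: squeeze each slice to a scalar, gate with sigmoid.
    squeeze = stack.mean(axis=(1, 2))                 # (S,)
    gate = 1.0 / (1.0 + np.exp(-squeeze))             # per-slice weight in (0, 1)
    channel_out = stack * gate[:, None, None]
    # Spatial attention: pool across slices, gate each pixel location.
    pooled = channel_out.mean(axis=0)                 # (H, W)
    spatial_gate = 1.0 / (1.0 + np.exp(-pooled))      # per-pixel weight in (0, 1)
    return channel_out * spatial_gate[None, :, :]

stack = np.random.rand(3, 8, 8)   # 3 neighbouring PET/CT slices (hypothetical data)
out = slice_interaction(stack)
print(out.shape)  # (3, 8, 8)
```

Gating a few neighbouring slices jointly is what lets a 2.5D network borrow context from central slices when the PET signal fades at the tumor's peripheral slices, which is the failure mode the SIM targets.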
Related papers
- OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis [53.01523944168442]
Clinical interpretation relies on both slice-driven local features and volume-driven spatial representations. Existing Large Vision-Language Models (LVLMs) remain fragmented in CT slice versus volumetric understanding. We present OmniCT, a powerful unified slice-volume LVLM for CT scenarios.
arXiv Detail & Related papers (2026-02-18T00:42:41Z) - Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation [37.40806731129113]
vMambaX is a lightweight framework integrating PET and CT scan images through a Context-Gated Cross-Modal Perception Module. Evaluated on the PCLT20K dataset, the model outperforms baseline models while maintaining lower computational complexity.
arXiv Detail & Related papers (2025-10-31T14:29:52Z) - DRBD-Mamba for Robust and Efficient Brain Tumor Segmentation with Analytical Insights [54.87947751720332]
Accurate brain tumor segmentation is critical for clinical diagnosis and treatment. Mamba-based State Space Models have demonstrated promising performance. We propose a dual-resolution bi-directional Mamba that captures multi-scale long-range dependencies with minimal computational overhead.
arXiv Detail & Related papers (2025-10-16T07:31:21Z) - Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images [29.523577037519985]
Deep learning models are expected to address problems such as poor image quality, motion artifacts, and complex tumor morphology. We introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. We propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images.
arXiv Detail & Related papers (2025-03-21T16:04:11Z) - MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts [54.915060471994686]
We propose MAST-Pro, a novel framework that integrates dynamic Mixture-of-Experts (D-MoE) and knowledge-driven prompts for pan-tumor segmentation. Specifically, text and anatomical prompts provide domain-specific priors guiding tumor representation learning, while D-MoE dynamically selects experts to balance generic and tumor-specific feature learning. Experiments on multi-anatomical tumor datasets demonstrate that MAST-Pro outperforms state-of-the-art approaches, achieving up to a 5.20% average improvement while reducing trainable parameters by 91.04%, without compromising accuracy.
arXiv Detail & Related papers (2025-03-18T15:39:44Z) - Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging [39.59895695500171]
We introduce the Cross-Fraternal Twin Masked Autoencoder (FratMAE), a novel framework that effectively integrates whole-body anatomical and functional information. FratMAE captures intricate cross-modal relationships and global uptake patterns, achieving superior performance on downstream tasks.
arXiv Detail & Related papers (2025-03-04T17:49:07Z) - Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumors are among the most fatal cancers worldwide and are especially common in children and the elderly.
We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z) - MEDPSeg: Hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography [37.119000111386924]
MEDPSeg learns from heterogeneous chest CT targets through hierarchical polymorphic multitask learning (HPML).
We show PML enabling new state-of-the-art performance for GGO and consolidation segmentation tasks.
In addition, MEDPSeg simultaneously performs segmentation of the lung parenchyma, airways, pulmonary artery, and lung lesions, all in a single forward prediction.
arXiv Detail & Related papers (2023-12-04T21:46:39Z) - A Localization-to-Segmentation Framework for Automatic Tumor Segmentation in Whole-Body PET/CT Images [8.0523823243864]
This paper proposes a localization-to-segmentation framework (L2SNet) for precise tumor segmentation.
L2SNet first localizes the possible lesions in the lesion localization phase and then uses the location cues to shape the segmentation results in the lesion segmentation phase.
Experiments with the MII Automated Lesion in Whole-Body FDG-PET/CT challenge dataset show that our method achieved a competitive result.
arXiv Detail & Related papers (2023-09-11T13:39:15Z) - Improved Prognostic Prediction of Pancreatic Cancer Using Multi-Phase CT by Integrating Neural Distance and Texture-Aware Transformer [37.55853672333369]
This paper proposes a novel learnable neural distance that describes the precise relationship between the tumor and vessels in CT images of different patients.
The developed risk marker was the strongest predictor of overall survival among preoperative factors.
arXiv Detail & Related papers (2023-08-01T12:46:02Z) - ISA-Net: Improved spatial attention network for PET-CT tumor segmentation [22.48294544919023]
We propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT) images.
We design an improved spatial attention network (ISA-Net) to increase the accuracy of PET or CT in detecting tumors.
We validated the proposed ISA-Net method on two clinical datasets: a soft tissue sarcoma (STS) dataset and a head and neck tumor (HECKTOR) dataset.
arXiv Detail & Related papers (2022-11-04T04:15:13Z) - Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest federated ML study to date, involving data from 71 healthcare institutions across 6 continents.
We generate an automatic tumor boundary detector for the rare disease of glioblastoma.
We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z) - Segmentation of Lung Tumor from CT Images using Deep Supervision [0.8733639720576208]
Lung cancer is a leading cause of death in most countries of the world.
This paper approaches lung tumor segmentation by applying two-dimensional discrete wavelet transform (DWT) on the LOTUS dataset.
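As a rough illustration of the two-dimensional DWT step this paper applies, here is a minimal one-level 2D Haar transform in NumPy. The paper does not specify its wavelet or implementation, so the Haar kernel and function below are assumptions for illustration only:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar DWT on an image with even height and width.

    Returns the approximation (LL) and detail (LH, HL, HH) sub-bands,
    each at half the input resolution.
    """
    # Rows: average / difference of adjacent column pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Columns: repeat on adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh

img = np.arange(16.0).reshape(4, 4)   # stand-in for a CT slice
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2)
```

Feeding the low-frequency LL band (or the sub-bands jointly) to a segmentation network is one common way a DWT front-end is used; a constant image yields all-zero detail bands, which is a quick sanity check.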
arXiv Detail & Related papers (2021-11-17T17:50:18Z) - Multimodal Spatial Attention Module for Targeting Multimodal PET-CT Lung Tumor Segmentation [11.622615048002567]
The multimodal spatial attention module (MSAM) learns to emphasize tumor-related regions.
MSAM can be applied to common backbone architectures and trained end-to-end.
arXiv Detail & Related papers (2020-07-29T10:27:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.