Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
- URL: http://arxiv.org/abs/2503.17261v1
- Date: Fri, 21 Mar 2025 16:04:11 GMT
- Title: Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
- Authors: Jie Mei, Chenyu Lin, Yu Qiu, Yaonan Wang, Hui Zhang, Ziyang Wang, Dong Dai
- Abstract summary: Deep learning models are expected to address problems such as poor image quality, motion artifacts, and complex tumor morphology. We introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. We propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images.
- Score: 29.523577037519985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lung cancer is a leading cause of cancer-related deaths globally. PET-CT is crucial for imaging lung tumors, providing essential metabolic and anatomical information, while it faces challenges such as poor image quality, motion artifacts, and complex tumor morphology. Deep learning-based models are expected to address these problems, however, existing small-scale and private datasets limit significant performance improvements for these methods. Hence, we introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. Furthermore, we propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images. Specifically, we design a channel-wise rectification module (CRM) that implements a channel state space block across multi-modal features to learn correlated representations and helps filter out modality-specific noise. A dynamic cross-modality interaction module (DCIM) is designed to effectively integrate position and context information, which employs PET images to learn regional position information and serves as a bridge to assist in modeling the relationships between local features of CT images. Extensive experiments on a comprehensive benchmark demonstrate the effectiveness of our CIPA compared to the current state-of-the-art segmentation methods. We hope our research can provide more exploration opportunities for medical image segmentation. The dataset and code are available at https://github.com/mj129/CIPA.
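The abstract's channel-wise rectification idea (gating each modality's feature channels using cross-modal channel statistics) can be illustrated with a minimal numpy sketch. This is a hypothetical simplification for intuition only: the function name, shapes, and sigmoid gating are assumptions, and it does not reproduce the paper's CRM state space block or DCIM.

```python
import numpy as np

def channel_rectify(pet_feat, ct_feat):
    """Illustrative channel-wise rectification: gate each modality's
    channels by channel statistics of the other modality (a sketch of
    the idea, not the paper's CRM implementation).

    pet_feat, ct_feat: arrays of shape (C, H, W).
    """
    # Global average pooling over spatial dims -> per-channel descriptors.
    pet_desc = pet_feat.mean(axis=(1, 2))          # (C,)
    ct_desc = ct_feat.mean(axis=(1, 2))            # (C,)
    # Sigmoid gates derived from the *other* modality's descriptor,
    # so each modality is rectified by cross-modal channel statistics.
    gate_for_pet = 1.0 / (1.0 + np.exp(-ct_desc))  # (C,)
    gate_for_ct = 1.0 / (1.0 + np.exp(-pet_desc))  # (C,)
    pet_out = pet_feat * gate_for_pet[:, None, None]
    ct_out = ct_feat * gate_for_ct[:, None, None]
    return pet_out, ct_out

pet_r, ct_r = channel_rectify(np.random.randn(4, 8, 8),
                              np.random.randn(4, 8, 8))
print(pet_r.shape, ct_r.shape)  # (4, 8, 8) (4, 8, 8)
```

In the actual CIPA, the cross-channel interaction is modeled with a channel state space block rather than a simple sigmoid gate; the sketch only shows where cross-modal channel statistics enter.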
Related papers
- Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging [39.59895695500171]
We introduce the Cross-Fraternal Twin Masked Autoencoder (FratMAE), a novel framework that effectively integrates whole-body anatomical and functional information. FratMAE captures intricate cross-modal relationships and global uptake patterns, achieving superior performance on downstream tasks.
arXiv Detail & Related papers (2025-03-04T17:49:07Z) - From FDG to PSMA: A Hitchhiker's Guide to Multitracer, Multicenter Lesion Segmentation in PET/CT Imaging [0.9384264274298444]
We present our solution for the autoPET III challenge, targeting multitracer, multicenter generalization using the nnU-Net framework with the ResEncL architecture.
Key techniques include misalignment data augmentation and multi-modal pretraining across CT, MR, and PET datasets.
Compared to the default nnU-Net, which achieved a Dice score of 57.61, our model significantly improved performance with a Dice score of 68.40, alongside a reduction in false positive (FPvol: 7.82) and false negative (FNvol: 10.35) volumes.
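The Dice score and false-positive/false-negative volumes cited above follow standard definitions, sketched here in numpy (the voxel-volume parameter is an assumption for illustration; the challenge evaluation uses the scan's actual voxel spacing).

```python
import numpy as np

def dice_score(pred, gt):
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def error_volumes(pred, gt, voxel_volume_ml=1.0):
    """False-positive and false-negative volumes, scaled by voxel size."""
    fp = np.logical_and(pred == 1, gt == 0).sum() * voxel_volume_ml
    fn = np.logical_and(pred == 0, gt == 1).sum() * voxel_volume_ml
    return fp, fn

gt = np.zeros((4, 4), dtype=int); gt[1:3, 1:3] = 1      # 4 voxels
pred = np.zeros((4, 4), dtype=int); pred[1:3, 1:4] = 1  # 6 voxels, 4 overlap
print(round(dice_score(pred, gt), 2))  # 0.8
print(error_volumes(pred, gt))         # (2.0, 0.0)
```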
arXiv Detail & Related papers (2024-09-14T16:39:17Z) - Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction [62.29541106695824]
This paper presents a coarse-to-fine PET reconstruction framework that consists of a coarse prediction module (CPM) and an iterative refinement module (IRM)
By delegating most of the computational overhead to the CPM, the overall sampling speed of our method can be significantly improved.
Two additional strategies, i.e., an auxiliary guidance strategy and a contrastive diffusion strategy, are proposed and integrated into the reconstruction process.
arXiv Detail & Related papers (2023-08-20T04:10:36Z) - Classification of lung cancer subtypes on CT images with synthetic pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images.
We propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z) - ISA-Net: Improved spatial attention network for PET-CT tumor segmentation [22.48294544919023]
We propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT) images.
We design an improved spatial attention network (ISA-Net) to increase the accuracy of PET or CT in detecting tumors.
We validated the proposed ISA-Net method on two clinical datasets: a soft tissue sarcoma (STS) dataset and a head and neck tumor (HECKTOR) dataset.
arXiv Detail & Related papers (2022-11-04T04:15:13Z) - Segmentation of Lung Tumor from CT Images using Deep Supervision [0.8733639720576208]
Lung cancer is a leading cause of death in most countries of the world.
This paper approaches lung tumor segmentation by applying two-dimensional discrete wavelet transform (DWT) on the LOTUS dataset.
arXiv Detail & Related papers (2021-11-17T17:50:18Z) - Predicting Distant Metastases in Soft-Tissue Sarcomas from PET-CT scans using Constrained Hierarchical Multi-Modality Feature Learning [14.60163613315816]
Distant metastases (DM) are the leading cause of death in patients with soft-tissue sarcomas (STSs).
It is difficult to determine from imaging studies which STS patients will develop metastases.
We outline a new 3D CNN to help predict DM in patients from PET-CT data.
arXiv Detail & Related papers (2021-04-23T05:12:02Z) - A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z) - M3Lung-Sys: A Deep Learning System for Multi-Class Lung Pneumonia Screening from CT Imaging [85.00066186644466]
We propose a Multi-task Multi-slice Deep Learning System (M3Lung-Sys) for multi-class lung pneumonia screening from CT imaging.
In addition to distinguishing COVID-19 from Healthy, H1N1, and CAP cases, our M3Lung-Sys is also able to locate the areas of relevant lesions.
arXiv Detail & Related papers (2020-10-07T06:22:24Z) - Multimodal Spatial Attention Module for Targeting Multimodal PET-CT Lung Tumor Segmentation [11.622615048002567]
Multimodal spatial attention module (MSAM) learns to emphasize regions related to tumors.
MSAM can be applied to common backbone architectures and trained end-to-end.
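The spatial-attention idea summarized above (deriving a tumor-emphasizing spatial map from PET and applying it to backbone features) can be sketched in a few lines of numpy. This is an assumed simplification for intuition, not the authors' MSAM code: in the paper the attention map is learned, whereas here it is just a sigmoid of the PET channel mean.

```python
import numpy as np

def spatial_attention(pet_feat, ct_feat):
    """Illustrative multimodal spatial attention: build a spatial map
    from PET features and reweight CT features with it (a sketch of
    the MSAM idea, not the published implementation).

    pet_feat, ct_feat: arrays of shape (C, H, W).
    """
    # Collapse PET channels into one spatial map, squash to (0, 1).
    attn = 1.0 / (1.0 + np.exp(-pet_feat.mean(axis=0)))  # (H, W)
    # Emphasize CT features at PET-active (tumor-related) locations.
    return ct_feat * attn[None, :, :]

out = spatial_attention(np.random.randn(8, 16, 16),
                        np.random.randn(8, 16, 16))
print(out.shape)  # (8, 16, 16)
```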
arXiv Detail & Related papers (2020-07-29T10:27:22Z) - Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation [48.504790189796836]
We present a novel segmentation strategy, co-heterogeneous and adaptive segmentation (CHASe).
We propose a versatile framework that fuses appearance based semi-supervision, mask based adversarial domain adaptation, and pseudo-labeling.
CHASe can further improve pathological liver mask Dice-Sorensen coefficients by 4.2% to 9.4%.
arXiv Detail & Related papers (2020-05-27T06:58:39Z) - Synergistic Learning of Lung Lobe Segmentation and Hierarchical Multi-Instance Classification for Automated Severity Assessment of COVID-19 in CT Images [61.862364277007934]
We propose a synergistic learning framework for automated severity assessment of COVID-19 in 3D CT images.
A multi-task deep network (called M2UNet) is then developed to assess the severity of COVID-19 patients.
Our M2UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment.
arXiv Detail & Related papers (2020-05-08T03:16:15Z) - Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images [46.844349956057776]
Coronavirus disease 2019 (COVID-19) has been spreading rapidly around the world, with significant impact on public health and the economy.
There is still a lack of studies that effectively quantify the lung infection caused by COVID-19.
We propose a novel deep learning algorithm for automated segmentation of multiple COVID-19 infection regions.
arXiv Detail & Related papers (2020-04-12T16:24:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.