Related papers: H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images

URL: http://arxiv.org/abs/2403.18339v2
Date: Thu, 28 Mar 2024 11:46:25 GMT
Title: H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images
Authors: Jinpeng Lu, Jingyun Chen, Linghan Cai, Songhan Jiang, Yongbing Zhang
Abstract summary: Positron emission tomography (PET) combined with computed tomography (CT) imaging is routinely used in cancer diagnosis and prognosis. Traditional multi-modal segmentation solutions rely on concatenation operations for modality fusion. We propose a Hierarchical Adaptive Interaction and Weighting Network termed H2ASeg to explore intrinsic cross-modal correlations.
Score: 6.753315684414596
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Positron emission tomography (PET) combined with computed tomography (CT) imaging is routinely used in cancer diagnosis and prognosis by providing complementary information. Automatically segmenting tumors in PET/CT images can significantly improve examination efficiency. Traditional multi-modal segmentation solutions mainly rely on concatenation operations for modality fusion, which fail to effectively model the non-linear dependencies between PET and CT modalities. Recent studies have investigated various approaches to optimize the fusion of modality-specific features for enhancing joint representations. However, modality-specific encoders used in these methods operate independently, inadequately leveraging the synergistic relationships inherent in PET and CT modalities, for example, the complementarity between semantics and structure. To address these issues, we propose a Hierarchical Adaptive Interaction and Weighting Network termed H2ASeg to explore the intrinsic cross-modal correlations and transfer potential complementary information. Specifically, we design a Modality-Cooperative Spatial Attention (MCSA) module that performs intra- and inter-modal interactions globally and locally. Additionally, a Target-Aware Modality Weighting (TAMW) module is developed to highlight tumor-related features within multi-modal features, thereby refining tumor segmentation. By embedding these modules across different layers, H2ASeg can hierarchically model cross-modal correlations, enabling a nuanced understanding of both semantic and structural tumor features. Extensive experiments demonstrate the superiority of H2ASeg, outperforming state-of-the-art methods on AutoPet-II and Hecktor2022 benchmarks. The code is released at https://github.com/JinPLu/H2ASeg.

Related papers

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images [29.523577037519985]
Deep learning models are expected to address problems such as poor image quality, motion artifacts, and complex tumor morphology. We introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. We propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images.
arXiv Detail & Related papers (2025-03-21T16:04:11Z)
Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology [6.634579989129392]
We propose a dual fusion framework that integrates omic data at both early and late stages. In the early fusion stage, omic embeddings are projected into a patch-wise latent space, generating omic-WSI embeddings. In the late fusion stage, we reintroduce the omic data by fusing it with slide-level omic-WSI embeddings.
arXiv Detail & Related papers (2024-11-26T13:25:53Z)
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation. Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process. Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network that combines convolutional neural network (CNN) and transformer layers. Experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared with state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z)
Multi-modal Evidential Fusion Network for Trusted PET/CT Tumor Segmentation [5.839660501978193]
The quality of PET and CT images varies widely in clinical settings, which leads to uncertainty in the modality information extracted by networks. This paper proposes a novel Multi-modal Evidential Fusion Network (MEFN) comprising a Cross-Modal Feature Learning (CFL) module and a Multi-modal Trusted Fusion (MTF) module. Our model can provide radiologists with credible uncertainty of the segmentation results for their decision in accepting or rejecting the automatic segmentation results.
arXiv Detail & Related papers (2024-06-26T13:14:24Z)
Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems. This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
ISA-Net: Improved spatial attention network for PET-CT tumor segmentation [22.48294544919023]
We propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT). We design an improved spatial attention network (ISA-Net) to increase the accuracy of PET or CT in detecting tumors. We validated the proposed ISA-Net method on two clinical datasets, a soft tissue sarcoma (STS) dataset and a head and neck tumor (HECKTOR) dataset.
arXiv Detail & Related papers (2022-11-04T04:15:13Z)
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded. We analyze the CT values among different tissues and merge the prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z)
Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme. Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor. The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z)
Soft Tissue Sarcoma Co-Segmentation in Combined MRI and PET/CT Data [2.2515303891664358]
Tumor segmentation in multimodal medical images has seen a growing trend towards deep learning based methods. We propose a simultaneous co-segmentation method, which enables multimodal feature learning through modality-specific encoder and decoder branches. We demonstrate the effectiveness of our approach on public soft tissue sarcoma data, which comprises MRI (T1 and T2 sequence) and PET/CT scans.
arXiv Detail & Related papers (2020-08-28T09:15:42Z)
Multimodal Spatial Attention Module for Targeting Multimodal PET-CT Lung Tumor Segmentation [11.622615048002567]
Multimodal spatial attention module (MSAM) learns to emphasize regions related to tumors. MSAM can be applied to common backbone architectures and trained end-to-end.
arXiv Detail & Related papers (2020-07-29T10:27:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.