H-CNN-ViT: A Hierarchical Gated Attention Multi-Branch Model for Bladder Cancer Recurrence Prediction
- URL: http://arxiv.org/abs/2511.13869v2
- Date: Wed, 19 Nov 2025 04:47:54 GMT
- Title: H-CNN-ViT: A Hierarchical Gated Attention Multi-Branch Model for Bladder Cancer Recurrence Prediction
- Authors: Xueyang Li, Zongren Wang, Yuliang Zhang, Zixuan Pan, Yu-Jen Chen, Nishchal Sapkota, Gelei Xu, Danny Z. Chen, Yiyu Shi
- Abstract summary: We introduce a curated multi-sequence, multi-modal MRI dataset specifically designed for bladder cancer recurrence prediction. We then propose H-CNN-ViT, a new Hierarchical Gated Attention Multi-Branch model that enables selective weighting of features from the global (ViT) and local (CNN) paths. Our multi-branch architecture processes each modality independently, ensuring that the unique properties of each imaging channel are optimally captured and integrated.
- Score: 17.67324034146389
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bladder cancer is one of the most prevalent malignancies worldwide, with a recurrence rate of up to 78%, necessitating accurate post-operative monitoring for effective patient management. Multi-sequence contrast-enhanced MRI is commonly used for recurrence detection; however, interpreting these scans remains challenging, even for experienced radiologists, due to post-surgical alterations such as scarring, swelling, and tissue remodeling. AI-assisted diagnostic tools have shown promise in improving bladder cancer recurrence prediction, yet progress in this field is hindered by the lack of dedicated multi-sequence MRI datasets for recurrence assessment study. In this work, we first introduce a curated multi-sequence, multi-modal MRI dataset specifically designed for bladder cancer recurrence prediction, establishing a valuable benchmark for future research. We then propose H-CNN-ViT, a new Hierarchical Gated Attention Multi-Branch model that enables selective weighting of features from the global (ViT) and local (CNN) paths based on contextual demands, achieving a balanced and targeted feature fusion. Our multi-branch architecture processes each modality independently, ensuring that the unique properties of each imaging channel are optimally captured and integrated. Evaluated on our dataset, H-CNN-ViT achieves an AUC of 78.6%, surpassing state-of-the-art models. Our model is publicly available at https://github.com/XLIAaron/H-CNN-ViT.
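The abstract describes selective weighting of local (CNN) and global (ViT) features, with each modality handled by its own branch. As a rough illustration only, the sketch below shows one common way such gating can be written: a sigmoid gate blends the two feature vectors element-wise within a branch, and a softmax over per-branch scores combines the branches. This is not the authors' implementation (see their repository for that); the function names, the gate parameterisation, and the two-level layout are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, global_feat, weights, bias):
    """Blend a local (CNN-like) and a global (ViT-like) feature vector.

    Hypothetical gate, assumed for illustration:
        gate_i  = sigmoid(weights[i] . [local; global] + bias[i])
        fused_i = gate_i * local_i + (1 - gate_i) * global_i
    """
    concat = list(local_feat) + list(global_feat)
    fused = []
    for i, (loc, glo) in enumerate(zip(local_feat, global_feat)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(z)
        fused.append(gate * loc + (1.0 - gate) * glo)
    return fused


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(branch_feats, branch_scores):
    """Second level (also an assumption): softmax-weighted sum of branches."""
    w = softmax(branch_scores)
    dim = len(branch_feats[0])
    return [
        sum(w[k] * branch_feats[k][i] for k in range(len(branch_feats)))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two toy "modalities", each with a 2-dimensional local and global feature.
    zero_w = [[0.0] * 4, [0.0] * 4]  # zero weights give a gate of exactly 0.5
    branch_a = gated_fusion([1.0, 2.0], [3.0, 0.0], zero_w, [0.0, 0.0])
    branch_b = gated_fusion([0.0, 4.0], [2.0, 2.0], zero_w, [0.0, 0.0])
    print(fuse_modalities([branch_a, branch_b], [0.0, 0.0]))
```

In a trained model the gate weights and branch scores would be learned; here they are fixed by hand so the blending arithmetic is easy to follow.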
Related papers
- A Contrastive Variational AutoEncoder for NSCLC Survival Prediction with Missing Modalities [41.8469011437549]
Predicting survival outcomes for non-small cell lung cancer (NSCLC) patients is challenging due to the different individual prognostic features. State-of-the-art models rely on available data to create patient-level representations or use generative models to infer missing modalities. We propose a Multimodal Contrastive Variational AutoEncoder (MCVAE) to address this issue.
arXiv Detail & Related papers (2026-02-19T14:29:34Z)
- Towards Label-Free Brain Tumor Segmentation: Unsupervised Learning with Multimodal MRI [7.144319861722029]
Unsupervised anomaly detection (UAD) presents a complementary alternative to supervised learning for brain tumor segmentation in MRI. We propose a novel Multimodal Vision Transformer Autoencoder (MViT-AE) trained exclusively on healthy brain MRIs to detect and localize tumors. Our method achieves clinically meaningful tumor localization, with lesion-wise Dice Similarity Coefficient of 0.437 (Whole Tumor), 0.316 (Tumor Core), and 0.350 (Enhancing Tumor) on the test set, and an anomaly Detection Rate of 89.4% on the validation set.
arXiv Detail & Related papers (2025-10-17T14:26:30Z)
- Automatic and standardized surgical reporting for central nervous system tumors [0.2634932446012777]
The pipeline presented in this study enables robust, automated segmentation, MR sequence classification, and standardized report generation. The proposed models and methods were integrated into Raidionics, an open-source software platform for CNS tumor analysis, which now includes a dedicated module for postsurgical analysis.
arXiv Detail & Related papers (2025-08-12T13:08:49Z)
- impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy. It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches. Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
arXiv Detail & Related papers (2025-08-08T10:01:16Z)
- Unified HT-CNNs Architecture: Transfer Learning for Segmenting Diverse Brain Tumors in MRI from Gliomas to Pediatric Tumors [2.104687387907779]
We introduce HT-CNNs, an ensemble of Hybrid Transformers and Convolutional Neural Networks optimized through transfer learning for varied brain tumor segmentation. This method captures spatial and contextual details from MRI data, fine-tuned on diverse datasets representing common tumor types. Our findings underscore the potential of transfer learning and ensemble approaches in medical image segmentation, indicating a substantial enhancement in clinical decision-making and patient care.
arXiv Detail & Related papers (2024-12-11T09:52:01Z)
- Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumor represents one of the most fatal cancers around the world, and is very common in children and the elderly.
We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z)
- Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs [35.46541584018842]
Unsupervised Anomaly Detection (UAD) aims to identify any anomaly as an outlier from a healthy training distribution. Generative models are used to learn the reconstruction of healthy brain anatomy for a given input image. We propose conditioning the denoising process of diffusion models with additional information derived from a latent representation of the input image.
arXiv Detail & Related papers (2023-12-07T11:03:42Z)
- Automated ensemble method for pediatric brain tumor segmentation [0.0]
This study introduces a novel ensemble approach using ONet and modified versions of UNet.
Data augmentation ensures robustness and accuracy across different scanning protocols.
Results indicate that this advanced ensemble approach offers promising prospects for enhanced diagnostic accuracy.
arXiv Detail & Related papers (2023-08-14T15:29:32Z)
- Segmentation of glioblastomas in early post-operative multi-modal MRI with deep neural networks [33.51490233427579]
Two state-of-the-art neural network architectures for pre-operative segmentation were trained for the task.
The best performance achieved was a 61% Dice score, and the best classification performance was about 80% balanced accuracy.
The predicted segmentations can be used to accurately classify the patients into those with residual tumor, and those with gross total resection.
arXiv Detail & Related papers (2023-04-18T10:14:45Z)
- Automated Segmentation and Recurrence Risk Prediction of Surgically Resected Lung Tumors with Adaptive Convolutional Neural Networks [3.5413688566798096]
Lung cancer is the leading cause of cancer related mortality by a significant margin.
In this paper, we explore the use of convolutional neural networks (CNNs) for the segmentation and recurrence risk prediction of lung tumors.
To the best of our knowledge, it is the first fully automated segmentation and recurrence risk prediction system.
arXiv Detail & Related papers (2022-09-17T23:06:22Z)
- Implementation of Convolutional Neural Network Architecture on 3D Multiparametric Magnetic Resonance Imaging for Prostate Cancer Diagnosis [0.0]
We propose a novel deep learning approach for automatic classification of prostate lesions in magnetic resonance images.
Our framework achieved the classification performance with the area under a Receiver Operating Characteristic curve value of 0.87.
Our proposed framework reflects the potential of assisting medical image interpretation in prostate cancer and reducing unnecessary biopsies.
arXiv Detail & Related papers (2021-12-29T16:47:52Z)
- The Brain Tumor Sequence Registration (BraTS-Reg) Challenge: Establishing Correspondence Between Pre-Operative and Follow-up MRI Scans of Diffuse Glioma Patients [31.567542945171834]
We describe the Brain Tumor Sequence Registration (BraTS-Reg) challenge.
BraTS-Reg is the first public benchmark environment for deformable registration algorithms.
The aim of BraTS-Reg is to continue to serve as an active resource for research.
arXiv Detail & Related papers (2021-12-13T19:25:16Z)
- M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.