F3-Net: Foundation Model for Full Abnormality Segmentation of Medical Images with Flexible Input Modality Requirement
- URL: http://arxiv.org/abs/2507.08460v1
- Date: Fri, 11 Jul 2025 10:03:23 GMT
- Title: F3-Net: Foundation Model for Full Abnormality Segmentation of Medical Images with Flexible Input Modality Requirement
- Authors: Seyedeh Sahar Taheri Otaghsara, Reza Rahmanzadeh
- Abstract summary: F3-Net is a foundation model designed to overcome persistent challenges in clinical medical image segmentation. Its unified architecture supports multi-pathology segmentation across glioma, metastasis, stroke, and white matter lesions without retraining. On the whole pathology, F3-Net achieves average Dice Similarity Coefficients (DSCs) of 0.94 for BraTS-GLI 2024, 0.82 for BraTS-MET 2024, 0.94 for BraTS 2021, and 0.79 for ISLES 2022.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: F3-Net is a foundation model designed to overcome persistent challenges in clinical medical image segmentation, including reliance on complete multimodal inputs, limited generalizability, and narrow task specificity. Through flexible synthetic modality training, F3-Net maintains robust performance even in the presence of missing MRI sequences, leveraging a zero-image strategy to substitute absent modalities without relying on explicit synthesis networks, thereby enhancing real-world applicability. Its unified architecture supports multi-pathology segmentation across glioma, metastasis, stroke, and white matter lesions without retraining, outperforming CNN-based and transformer-based models that typically require disease-specific fine-tuning. Evaluated on diverse datasets such as BraTS 2021, BraTS 2024, and ISLES 2022, F3-Net demonstrates strong resilience to domain shifts and clinical heterogeneity. On the whole pathology dataset, F3-Net achieves average Dice Similarity Coefficients (DSCs) of 0.94 for BraTS-GLI 2024, 0.82 for BraTS-MET 2024, 0.94 for BraTS 2021, and 0.79 for ISLES 2022. This positions it as a versatile, scalable solution bridging the gap between deep learning research and practical clinical deployment.
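The zero-image strategy and the Dice Similarity Coefficient mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the sequence names and channel order in `MODALITIES`, the helper names `stack_with_zero_images` and `dice_coefficient`, and the convention of returning 1.0 when both masks are empty are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

# Hypothetical channel order; the abstract does not name the MRI sequences or their order.
MODALITIES = ("t1", "t1ce", "t2", "flair")


def stack_with_zero_images(available, shape, order=MODALITIES):
    """Stack MRI sequences into a (C, *shape) array.

    Any sequence missing from `available` is replaced by an all-zero volume,
    so the network always receives the same number of input channels.
    """
    channels = []
    for name in order:
        volume = available.get(name)
        if volume is None:
            # Zero-image substitution: no synthesis network is involved.
            volume = np.zeros(shape, dtype=np.float32)
        else:
            volume = np.asarray(volume, dtype=np.float32)
            if volume.shape != tuple(shape):
                raise ValueError(f"{name}: expected shape {tuple(shape)}, got {volume.shape}")
        channels.append(volume)
    return np.stack(channels, axis=0)


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)
```

For example, calling `stack_with_zero_images({"t1": t1_volume}, t1_volume.shape)` would yield a four-channel input in which only the first channel carries image data. Zero-filling at training time is what lets a single model see many modality combinations without a separate imputation stage.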
Related papers
- Federated Learning for Cross-Modality Medical Image Segmentation via Augmentation-Driven Generalization [0.0]
In this work, we consider a realistic FL scenario where each client holds single-modality data (CT or MRI). We evaluate convolution-based spatial augmentation, frequency-domain manipulation, domain-specific normalization, and global intensity nonlinear (GIN) augmentation. Our federated approach achieves 93-98% of centralized training accuracy, demonstrating strong cross-modality generalization without compromising data privacy.
arXiv Detail & Related papers (2026-02-24T11:13:01Z) - A Contrastive Variational AutoEncoder for NSCLC Survival Prediction with Missing Modalities [41.8469011437549]
Predicting survival outcomes for non-small cell lung cancer (NSCLC) patients is challenging due to the different individual prognostic features. State-of-the-art models rely on available data to create patient-level representations or use generative models to infer missing modalities. We propose a Multimodal Contrastive Variational AutoEncoder (MCVAE) to address this issue.
arXiv Detail & Related papers (2026-02-19T14:29:34Z) - AMGFormer: Adaptive Multi-Granular Transformer for Brain Tumor Segmentation with Missing Modalities [6.461582089537306]
We propose AMGFormer, achieving significantly improved stability through three synergistic modules. On BraTS 2018, our method achieves 89.33% WT, 82.70% TC, 67.23% ET Dice scores with 0.5% variance across 15 modality combinations. Single-modality ET segmentation shows 40-81% relative improvements over state-of-the-art methods.
arXiv Detail & Related papers (2026-01-27T08:29:02Z) - Transparent Early ICU Mortality Prediction with Clinical Transformer and Per-Case Modality Attribution [42.85462513661566]
We present a lightweight, transparent multimodal ensemble that fuses physiological time-series measurements with unstructured clinical notes from the first 48 hours of an ICU stay. A logistic regression model combines predictions from two modality-specific models: a bidirectional LSTM for vitals and a finetuned ClinicalModernBERT transformer for notes. On the MIMIC-III benchmark, our late-fusion ensemble improves discrimination over the best single model while maintaining well-calibrated predictions.
arXiv Detail & Related papers (2025-11-19T20:11:49Z) - SYNAPSE-Net: A Unified Framework with Lesion-Aware Hierarchical Gating for Robust Segmentation of Heterogeneous Brain Lesions [0.23332469289621785]
We propose the Unified Multi-Stream SYNAPSE-Net, an adaptive framework designed for both generalization and robustness. The framework is built on a novel hybrid architecture integrating multi-stream CNN encoders, a Swin Transformer bottleneck for global context, and a dynamic cross-modal attention fusion mechanism. The model was evaluated on three different challenging public datasets.
arXiv Detail & Related papers (2025-10-30T19:40:42Z) - MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis. It adapts pre-trained models to new tasks while refining their representational capacity. It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z) - FedGIN: Federated Learning with Dynamic Global Intensity Non-linear Augmentation for Organ Segmentation using Multi-modal Images [0.0]
Medical image segmentation plays a crucial role in AI-assisted diagnostics, surgical planning, and treatment monitoring. We propose FedGIN, a Federated Learning framework that enables multimodal organ segmentation without sharing raw patient data.
arXiv Detail & Related papers (2025-08-07T08:16:35Z) - FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation [0.0]
Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. Traditional convolutional neural networks (CNNs) provide strong localization capabilities but struggle to model long-range spatial dependencies. We propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework.
arXiv Detail & Related papers (2025-08-04T11:05:14Z) - HANS-Net: Hyperbolic Convolution and Adaptive Temporal Attention for Accurate and Generalizable Liver and Tumor Segmentation in CT Imaging [1.3149714289117207]
Accurate liver and tumor segmentation on abdominal CT images is critical for reliable diagnosis and treatment planning. We introduce Hyperbolic-convolutions Adaptive-temporal-attention with Neural-representation and Synaptic-plasticity Network (HANS-Net). HANS-Net combines hyperbolic convolutions for hierarchical geometric representation, a wavelet-inspired decomposition module for multi-scale texture learning, and an implicit neural representation branch.
arXiv Detail & Related papers (2025-07-15T13:56:37Z) - AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation [2.8498944632323755]
In clinical practice, full imaging is not always feasible, often due to complex acquisition protocols, stringent privacy regulations, or specific clinical needs. A promising solution is missing data imputation, where absent modalities are generated from available ones. We propose an Adaptive Multi-Modality Diffusion Network (AMM-Diff), a novel diffusion-based generative model capable of handling any number of input modalities and generating the missing ones.
arXiv Detail & Related papers (2025-01-22T12:29:33Z) - A Feature-Level Ensemble Model for COVID-19 Identification in CXR Images using Choquet Integral and Differential Evolution Optimization [0.7510165488300369]
An effective strategy to mitigate the COVID-19 pandemic involves integrating testing to identify infected individuals. While RT-PCR is considered the gold standard for diagnosing COVID-19, it has some limitations such as the risk of false negatives. This paper introduces a novel Deep Learning Diagnosis System that integrates pre-trained Deep Convolutional Neural Networks (DCNNs) within an ensemble learning framework.
arXiv Detail & Related papers (2025-01-14T16:28:02Z) - ICFNet: Integrated Cross-modal Fusion Network for Survival Prediction [24.328576712419814]
We propose an Integrated Cross-modal Fusion Network (ICFNet) that integrates histopathology whole slide images, genomic expression profiles, patient demographics, and treatment protocols. ICFNet outperforms state-of-the-art algorithms on five public TCGA datasets.
arXiv Detail & Related papers (2025-01-06T05:49:08Z) - KaLDeX: Kalman Filter based Linear Deformable Cross Attention for Retina Vessel Segmentation [46.57880203321858]
We propose a novel network (KaLDeX) for vascular segmentation leveraging a Kalman filter based linear deformable cross attention (LDCA) module.
Our approach is based on two key components: Kalman filter (KF) based linear deformable convolution (LD) and cross-attention (CA) modules.
The proposed method is evaluated on retinal fundus image datasets (DRIVE, CHASE_DB1, and STARE) as well as the 3 mm and 6 mm subsets of the OCTA-500 dataset.
arXiv Detail & Related papers (2024-10-28T16:00:42Z) - TotalSegmentator MRI: Robust Sequence-independent Segmentation of Multiple Anatomic Structures in MRI [59.86827659781022]
A nnU-Net model (TotalSegmentator) was trained on MRI to segment 80 anatomic structures. Dice scores were calculated between the predicted segmentations and expert reference standard segmentations to evaluate model performance. The open-source, easy-to-use model allows for automatic, robust segmentation of 80 structures.
arXiv Detail & Related papers (2024-05-29T20:15:54Z) - Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge [44.76736949127792]
We describe the design and results from the BraTS 2023 Intracranial Meningioma Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas. The top-ranked team had lesion-wise median Dice similarity coefficients (DSCs) of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor, respectively.
arXiv Detail & Related papers (2024-05-16T03:23:57Z) - Leveraging Frequency Domain Learning in 3D Vessel Segmentation [50.54833091336862]
In this study, we leverage Fourier domain learning as a substitute for multi-scale convolutional kernels in 3D hierarchical segmentation models.
We show that our novel network achieves remarkable Dice performance (84.37% on ASACA500 and 80.32% on ImageCAS) in tubular vessel segmentation tasks.
arXiv Detail & Related papers (2024-01-11T19:07:58Z) - Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion [56.38386580040991]
Consistency Trajectory Model (CTM) is a generalization of Consistency Models (CM).
CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance.
Unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods.
arXiv Detail & Related papers (2023-10-01T05:07:17Z) - UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation [29.287521185541298]
We show that UNesT consistently achieves state-of-the-art performance and evaluate its generalizability and data efficiency.
arXiv Detail & Related papers (2022-09-28T19:14:38Z) - Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis [47.29528724322795]
Multimodal Sentiment Analysis (MSA) has attracted increasing attention recently.
Despite significant progress, there are still two major challenges on the way towards robust MSA.
We propose a generic and unified framework to address them, named Efficient Multimodal Transformer with Dual-Level Feature Restoration (EMT-DLFR).
arXiv Detail & Related papers (2022-08-16T08:02:30Z) - A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion [54.512440195060584]
We propose the Unified Multi-Modal Conditional Score-based Generative Model (UMM-CSGM) to take advantage of the Score-based Generative Model (SGM).
UMM-CSGM employs a novel multi-in multi-out Conditional Score Network (mm-CSN) to learn a comprehensive set of cross-modal conditional distributions.
Experiments on the BraTS19 dataset show that UMM-CSGM can more reliably synthesize the heterogeneous enhancement and irregular areas in tumor-induced lesions.
arXiv Detail & Related papers (2022-07-07T16:57:21Z) - Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were obtained in sub-fractures with the largest and richest dataset ever.
arXiv Detail & Related papers (2021-08-07T10:12:42Z) - Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images [10.01138352319106]
Five different deep learning models (ResNet18, ResNet34, InceptionV3, InceptionResNetV2, and DenseNet161) and their Ensemble have been used in this paper to classify COVID-19, pneumonia, and healthy subjects using Chest X-Ray images.
The mean Micro-F1 score of the models for COVID-19 classifications ranges from 0.66 to 0.875, and is 0.89 for the Ensemble of the network models.
arXiv Detail & Related papers (2020-06-03T22:55:53Z)