PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation
- URL: http://arxiv.org/abs/2512.03848v1
- Date: Wed, 03 Dec 2025 14:49:01 GMT
- Title: PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation
- Authors: Hania Ghouse, Maryam Alsharqi, Farhad R. Nezami, Muzammil Behzad
- Abstract summary: We introduce PULSE, a multi-task vision-language framework built on self-supervised representations and optimized through a composite supervision strategy. A multi-scale token reconstruction decoder enables anatomical segmentation, while shared global representations support disease classification and clinically grounded text output. Unlike prior task-specific pipelines, PULSE learns task-invariant cardiac priors, generalizes robustly across datasets, and can be adapted to new imaging modalities with minimal supervision.
- Score: 0.27998963147546135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cardiac image analysis remains fragmented across tasks: anatomical segmentation, disease classification, and grounded clinical report generation are typically handled by separate networks trained under different data regimes. No existing framework unifies these objectives within a single architecture while retaining generalization across imaging modalities and datasets. We introduce PULSE, a multi-task vision-language framework built on self-supervised representations and optimized through a composite supervision strategy that balances region-overlap learning, pixel-wise classification fidelity, and boundary-aware IoU refinement. A multi-scale token reconstruction decoder enables anatomical segmentation, while shared global representations support disease classification and clinically grounded text output, allowing the model to move from pixels to structures and finally to clinical reasoning within one architecture. Unlike prior task-specific pipelines, PULSE learns task-invariant cardiac priors, generalizes robustly across datasets, and can be adapted to new imaging modalities with minimal supervision. This moves the field closer to a scalable, foundation-style cardiac analysis framework.
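The composite supervision strategy named in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulas or weights, so everything below is an assumption: a soft Dice term stands in for region-overlap learning, binary cross-entropy for pixel-wise classification fidelity, and a plain soft IoU term for the boundary-aware IoU refinement (whose exact definition is not stated). The function names and the weights `w_dice`, `w_ce`, and `w_iou` are hypothetical and are not taken from the paper.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss (region overlap) on flat foreground probabilities."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy (pixel-wise classification fidelity)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


def soft_iou_loss(probs, targets, eps=1e-6):
    """Soft IoU loss; an assumed stand-in for the boundary-aware IoU term."""
    inter = sum(p * t for p, t in zip(probs, targets))
    union = sum(probs) + sum(targets) - inter
    return 1.0 - (inter + eps) / (union + eps)


def composite_loss(probs, targets, w_dice=1.0, w_ce=1.0, w_iou=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (
        w_dice * dice_loss(probs, targets)
        + w_ce * bce_loss(probs, targets)
        + w_iou * soft_iou_loss(probs, targets)
    )


if __name__ == "__main__":
    mask = [1, 0, 1, 0]  # toy ground-truth mask, flattened
    good = [0.9, 0.1, 0.8, 0.2]  # prediction close to the mask
    poor = [0.2, 0.8, 0.3, 0.7]  # prediction far from the mask
    print("good:", composite_loss(good, mask))
    print("poor:", composite_loss(poor, mask))
```

Under these assumptions, a prediction that matches the mask drives all three terms toward zero, while a mismatched prediction is penalized by each term; in practice the relative weights would be tuned per dataset.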
Related papers
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations. TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z) - Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification [20.820287362872975]
Deep neural networks have shown remarkable performance in medical image classification. We propose a new topology-guided classification framework that extracts multi-scale and multi-filtration persistent topological features. Our approach enhances the model's capacity to recognize complex anatomical structures.
arXiv Detail & Related papers (2025-12-08T06:02:02Z) - DTEA: Dynamic Topology Weaving and Instability-Driven Entropic Attenuation for Medical Image Segmentation [31.50032207382483]
Skip connections are used to merge global context and reduce the semantic gap between encoder and decoder. We propose the DTEA model, featuring a new skip connection framework with the Semantic Topology Reconfiguration (STR) and Entropic Perturbation Gating (EPG) modules.
arXiv Detail & Related papers (2025-10-13T10:50:41Z) - Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation [61.350584471060756]
Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images. We propose Self-Supervised Anatomical Consistency Learning (SS-ACL) to align generated reports with corresponding anatomical regions. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy.
arXiv Detail & Related papers (2025-09-30T08:59:06Z) - CLAPS: A CLIP-Unified Auto-Prompt Segmentation for Multi-Modal Retinal Imaging [47.04292769940597]
We propose CLIP-unified Auto-Prompt (CLAPS), a novel method for unified segmentation across diverse tasks and modalities in retinal imaging. Our approach begins by pre-training a CLIP-based image encoder on a large, multi-modal retinal dataset. To unify tasks and resolve ambiguity, we use text prompts enhanced with a unique "modality signature" for each imaging modality.
arXiv Detail & Related papers (2025-09-10T14:14:49Z) - Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation [56.52520416420957]
We propose Multimodal Causal-Driven Representation Learning (MCDRL) to tackle domain generalization in medical image segmentation. MCDRL consistently outperforms competing methods, yielding superior segmentation accuracy and exhibiting robust generalizability.
arXiv Detail & Related papers (2025-08-07T03:41:41Z) - Foundation Model for Whole-Heart Segmentation: Leveraging Student-Teacher Learning in Multi-Modal Medical Imaging [0.510750648708198]
Whole-heart segmentation from CT and MRI scans is crucial for cardiovascular disease analysis. Existing methods struggle with modality-specific biases and the need for extensive labeled datasets. We propose a foundation model for whole-heart segmentation using a self-supervised learning framework based on a student-teacher architecture.
arXiv Detail & Related papers (2025-03-24T14:47:54Z) - Language Guided Domain Generalized Medical Image Segmentation [68.93124785575739]
Single source domain generalization holds promise for more reliable and consistent image segmentation across real-world clinical settings.
We propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features.
Our approach achieves favorable performance against existing methods in the literature.
arXiv Detail & Related papers (2024-04-01T17:48:15Z) - Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior [34.54360931760496]
Key anatomical features, such as the number of organs, their shapes and relative positions, are crucial for building a robust multi-organ segmentation model.
We introduce a novel architecture called the Anatomy-Informed Network (AIC-Net).
AIC-Net incorporates a learnable input termed "Anatomical Prior", which can be adapted to patient-specific anatomy.
arXiv Detail & Related papers (2024-03-27T10:46:24Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Segmentation [46.678279106837294]
We propose a cross-level contrastive learning scheme to enhance representation capacity for local features in semi-supervised medical image segmentation.
With the help of the cross-level contrastive learning and consistency constraint, the unlabelled data can be effectively explored to improve segmentation performance.
arXiv Detail & Related papers (2022-02-08T15:12:11Z) - Spatially Dependent U-Nets: Highly Accurate Architectures for Medical Imaging Segmentation [10.77039660100327]
We introduce a novel deep neural network architecture that exploits the inherent spatial coherence of anatomical structures.
Our approach is well equipped to capture long-range spatial dependencies in the segmented pixel/voxel space.
Our method performs favourably compared to the commonly used U-Net and U-Net++ architectures.
arXiv Detail & Related papers (2021-03-22T10:37:20Z) - Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z) - Studying Robustness of Semantic Segmentation under Domain Shift in cardiac MRI [0.8858288982748155]
We study challenges and opportunities of domain transfer across images from multiple clinical centres and scanner vendors.
In this work, we build upon a fixed U-Net architecture configured by the nnU-Net framework to investigate various data augmentation techniques and batch normalization layers.
arXiv Detail & Related papers (2020-11-15T17:50:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.