Related papers: Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion

Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion

URL: http://arxiv.org/abs/2507.12938v1
Date: Thu, 17 Jul 2025 09:25:00 GMT
Title: Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion
Authors: Caixia Dong, Duwei Dai, Xinyi Han, Fan Liu, Xu Yang, Zongfang Li, Songhua Xu,
Abstract summary: coronary artery segmentation is critical for computeraided diagnosis of coronary artery disease (CAD)<n>We propose a novel framework that leverages the power of vision foundation models (VFMs) through a parallel encoding architecture.<n>The proposed framework significantly outperforms state-of-the-art methods, achieving superior performance in accurate coronary artery segmentation.
Score: 12.839049648094893
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Accurate coronary artery segmentation is critical for computeraided diagnosis of coronary artery disease (CAD), yet it remains challenging due to the small size, complex morphology, and low contrast with surrounding tissues. To address these challenges, we propose a novel segmentation framework that leverages the power of vision foundation models (VFMs) through a parallel encoding architecture. Specifically, a vision transformer (ViT) encoder within the VFM captures global structural features, enhanced by the activation of the final two ViT blocks and the integration of an attention-guided enhancement (AGE) module, while a convolutional neural network (CNN) encoder extracts local details. These complementary features are adaptively fused using a cross-branch variational fusion (CVF) module, which models latent distributions and applies variational attention to assign modality-specific weights. Additionally, we introduce an evidential-learning uncertainty refinement (EUR) module, which quantifies uncertainty using evidence theory and refines uncertain regions by incorporating multi-scale feature aggregation and attention mechanisms, further enhancing segmentation accuracy. Extensive evaluations on one in-house and two public datasets demonstrate that the proposed framework significantly outperforms state-of-the-art methods, achieving superior performance in accurate coronary artery segmentation and showcasing strong generalization across multiple datasets. The code is available at https://github.com/d1c2x3/CAseg.

Related papers

TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations.<n>TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z)
DTEA: Dynamic Topology Weaving and Instability-Driven Entropic Attenuation for Medical Image Segmentation [31.50032207382483]
skip connections are used to merge global context and reduce the semantic gap between encoder and decoder.<n>We propose the DTEA model, featuring a new skip connection framework with the Semantic Topology Reconfiguration (STR) and Entropic Perturbation Gating (EPG) modules.
arXiv Detail & Related papers (2025-10-13T10:50:41Z)
Large Language Model Evaluated Stand-alone Attention-Assisted Graph Neural Network with Spatial and Structural Information Interaction for Precise Endoscopic Image Segmentation [16.773882069530426]
We propose FOCUS-Med, which stands for Fusion of spatial and structural graph with attentional context-aware polyp segmentation.<n> FOCUS-Med integrates a Dual Graph Convolutional Network (Dual-GCN) module to capture contextual spatial and topological structural dependencies.<n>Experiments on public benchmarks demonstrate that FOCUS-Med achieves state-of-the-art performance across five key metrics.
arXiv Detail & Related papers (2025-08-09T15:53:19Z)
CENet: Context Enhancement Network for Medical Image Segmentation [3.4690322157094573]
We propose the Context Enhancement Network (CENet), a novel segmentation framework featuring two key innovations.<n>First, the Dual Selective Enhancement Block (DSEB) integrated into skip connections enhances boundary details and improves the detection of smaller organs in a context-aware manner.<n>Second, the Context Feature Attention Module (CFAM) in the decoder employs a multi-scale design to maintain spatial integrity, reduce feature redundancy, and mitigate overly enhanced representations.
arXiv Detail & Related papers (2025-05-23T23:22:18Z)
Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach [65.47969413708344]
We introduce the concept of CF twins and design a conditional generative diffusion model (CGDM)<n>We employ a variational inference technique to derive the evidence lower bound (ELBO) for the log-marginal distribution of the observed fine-grained CF conditioned on the coarse-grained CF.<n>We show that the proposed approach exhibits significant improvement in reconstruction performance compared to the baselines.
arXiv Detail & Related papers (2025-05-12T01:36:06Z)
Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning [81.02648336552421]
We propose a Multi-Constraint Consistency Learning approach to facilitate the staged enhancement of the encoder and decoder.<n>Self-adaptive feature masking and noise injection are designed in an instance-specific manner to perturb the features for robust learning of the decoder.<n> Experimental results on Pascal VOC2012 and Cityscapes datasets demonstrate that our proposed MCCL achieves new state-of-the-art performance.
arXiv Detail & Related papers (2025-03-23T03:21:33Z)
MSV-Mamba: A Multiscale Vision Mamba Network for Echocardiography Segmentation [8.090155401012169]
Mamba, an emerging model, is one of the most cutting-edge approaches that is widely applied to diverse vision and language tasks.<n>This paper introduces a U-shaped deep learning model incorporating a large-window multiscale mamba module and a hierarchical feature fusion approach for echocardiographic segmentation.
arXiv Detail & Related papers (2025-01-13T08:22:10Z)
FIAS: Feature Imbalance-Aware Medical Image Segmentation with Dynamic Fusion and Mixing Attention [11.385231493066312]
hybrid architecture that combine convolutional neural networks (CNNs) and transformers demonstrates competitive ability in medical image segmentation.<n>We propose a Feaure Imbalance-Aware (FIAS) network, which incorporates a dual-path encoder and a novel Mixing Attention (MixAtt) decoder.
arXiv Detail & Related papers (2024-11-16T20:30:44Z)
Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers. The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network superior performance than the state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z)
Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation. Swin-Res-Net utilizes the Swin transformer which uses shifted windows with displacement for partitioning. Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z)
DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks. For better feature interaction between these two branches, we introduce two specialized modules. In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network. We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module. Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation [3.57686754209902]
Quantification of retinal fluids is necessary for OCT-guided treatment management. New convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation. Model benefits from hierarchical representation learning of textural, contextual, and edge features.
arXiv Detail & Related papers (2022-09-26T07:18:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.