SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation
- URL: http://arxiv.org/abs/2406.19749v1
- Date: Fri, 28 Jun 2024 08:48:14 GMT
- Title: SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation
- Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou,
- Abstract summary: A novel spatial-frequency learning and topological channel interaction network (SPIRONet) is proposed to address the above issues.
Dual encoders are utilized to comprehensively capture local spatial and global frequency vessel features.
Cross-attention fusion module is introduced to effectively fuse spatial and frequency features.
A topological channel interaction module is designed to filter out task-irrelevant responses based on graph neural networks.
- Score: 14.684277591969392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel interaction network (SPIRONet) is proposed to address the above issues. Specifically, dual encoders are utilized to comprehensively capture local spatial and global frequency vessel features. Then, a cross-attention fusion module is introduced to effectively fuse spatial and frequency features, thereby enhancing feature discriminability. Furthermore, a topological channel interaction module is designed to filter out task-irrelevant responses based on graph neural networks. Extensive experimental results on several challenging datasets (CADSA, CAXF, DCA1, and XCAD) demonstrate state-of-the-art performances of our method. Moreover, the inference speed of SPIRONet is 21 FPS with a 512x512 input size, surpassing clinical real-time requirements (6~12FPS). These promising outcomes indicate SPIRONet's potential for integration into vascular interventional navigation systems. Code is available at https://github.com/Dxhuang-CASIA/SPIRONet.
Related papers
- UniPhyNet: A Unified Network For Multimodal Physiological Raw Signal Classification [0.18416014644193066]
We present UniPhyNet, a novel neural network architecture to classify cognitive load using multimodal physiological data.<n>On the CL-Drive dataset, UniPhyNet improves raw signal classification accuracy from 70% to 80% (binary) and 62% to 74% (ternary)
arXiv Detail & Related papers (2025-07-08T07:13:45Z) - Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification [66.59320112015556]
Hyperspectral image (HSI) and LiDAR data joint classification is a challenging task.
We propose a novel Dynamic Cross-Modal Feature Interaction Network (DCMNet)
Our approach introduces three feature interaction blocks: Bilinear Spatial Attention Block (BSAB), Bilinear Channel Attention Block (BCAB), and Integration Convolutional Block (ICB)
arXiv Detail & Related papers (2025-03-10T05:50:13Z) - FE-UNet: Frequency Domain Enhanced U-Net for Low-Frequency Information-Rich Image Segmentation [48.034848981295525]
We address the differences in frequency band sensitivity between CNNs and the human visual system.<n>We propose a wavelet adaptive spectrum fusion (WASF) method inspired by biological vision mechanisms to balance cross-frequency image features.<n>We develop the FE-UNet model, which employs a SAM2 backbone network and incorporates fine-tuned Hiera-Large modules to ensure segmentation accuracy.
arXiv Detail & Related papers (2025-02-06T07:24:34Z) - QTSeg: A Query Token-Based Dual-Mix Attention Framework with Multi-Level Feature Distribution for Medical Image Segmentation [13.359001333361272]
Medical image segmentation plays a crucial role in assisting healthcare professionals with accurate diagnoses and enabling automated diagnostic processes.
Traditional convolutional neural networks (CNNs) often struggle with capturing long-range dependencies, while transformer-based architectures come with increased computational complexity.
Recent efforts have focused on combining CNNs and transformers to balance performance and efficiency, but existing approaches still face challenges in achieving high segmentation accuracy while maintaining low computational costs.
We propose QTSeg, a novel architecture for medical image segmentation that effectively integrates local and global information.
arXiv Detail & Related papers (2024-12-23T03:22:44Z) - Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - Signal-SGN: A Spiking Graph Convolutional Network for Skeletal Action Recognition via Learning Temporal-Frequency Dynamics [2.9578022754506605]
In skeletal-based action recognition, Graph Convolutional Networks (GCNs) face limitations due to their complexity and high energy consumption.
We propose a Signal-SGN(Spiking Graph Convolutional Network), which leverages the temporal dimension of skeletal sequences as the spiking timestep.
Our experiments show that the proposed models not only surpass existing SNN-based methods in accuracy but also reduce computational storage costs during training.
arXiv Detail & Related papers (2024-08-03T07:47:16Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z) - RFC-Net: Learning High Resolution Global Features for Medical Image
Segmentation on a Computational Budget [4.712700480142554]
We propose Receptive Field Chain Network (RFC-Net) that learns high resolution global features on a compressed computational space.
Our experiments demonstrate that RFC-Net achieves state-of-the-art performance on Kvasir and CVC-ClinicDB benchmarks for Polyp segmentation.
arXiv Detail & Related papers (2023-02-13T06:52:47Z) - Video-TransUNet: Temporally Blended Vision Transformer for CT VFSS
Instance Segmentation [11.575821326313607]
We propose Video-TransUNet, a deep architecture for segmentation in medical CT videos constructed by integrating temporal feature blending into the TransUNet deep learning framework.
In particular, our approach amalgamates strong frame representation via a ResNet CNN backbone, multi-frame feature blending via a Temporal Context Module, and reconstructive capabilities for multiple targets via a UNet-based convolutional-deconal architecture with multiple heads.
arXiv Detail & Related papers (2022-08-17T14:28:58Z) - Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially less trainable parameters when compared to comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral
Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z) - Instance Explainable Temporal Network For Multivariate Timeseries [0.0]
We propose a novel network (IETNet) that identifies the important channels in the classification decision for each instance of inference.
IETNet is an end-to-end network that combines temporal feature extraction, variable selection, and joint variable interaction into a single learning framework.
arXiv Detail & Related papers (2020-05-26T20:55:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.