CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric
Medical Image Segmentation
- URL: http://arxiv.org/abs/2311.04942v2
- Date: Mon, 27 Nov 2023 03:12:17 GMT
- Title: CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric
Medical Image Segmentation
- Authors: Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Xiaoxi Du, Kaifeng Pang, Qi
Miao, Steven S. Raman, Demetri Terzopoulos, Kyunghyun Sung
- Abstract summary: A large portion of volumetric medical data, especially magnetic resonance imaging (MRI) data, is anisotropic.
Both 3D and purely 2D deep learning-based segmentation methods are deficient in dealing with such volumetric data.
We offer a Cross-Slice Attention Module (CSAM) with minimal trainable parameters, which captures information across all the slices in the volume.
- Score: 8.507902378556981
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A large portion of volumetric medical data, especially magnetic resonance
imaging (MRI) data, is anisotropic, as the through-plane resolution is
typically much lower than the in-plane resolution. Both 3D and purely 2D deep
learning-based segmentation methods are deficient in dealing with such
volumetric data since the performance of 3D methods suffers when confronting
anisotropic data, and 2D methods disregard crucial volumetric information.
Insufficient work has been done on 2.5D methods, in which 2D convolution is
mainly used in concert with volumetric information. These models focus on
learning the relationship across slices, but typically have many parameters to
train. We offer a Cross-Slice Attention Module (CSAM) with minimal trainable
parameters, which captures information across all the slices in the volume by
applying semantic, positional, and slice attention on deep feature maps at
different scales. Our extensive experiments using different network
architectures and tasks demonstrate the usefulness and generalizability of
CSAM. Associated code is available at https://github.com/aL3x-O-o-Hung/CSAM.
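As a rough illustration of the parameter-light cross-slice idea, the sketch below implements only a slice-attention piece on a stack of 2D feature maps; the module name, shapes, and single-linear scoring are expository assumptions, not the authors' exact CSAM design (which also applies semantic and positional attention at multiple scales).

```python
# Hypothetical slice-attention sketch, NOT the authors' exact CSAM module.
import torch
import torch.nn as nn

class SliceAttention(nn.Module):
    """Reweights a stack of 2D feature maps with attention across slices.

    Assumes input of shape (S, C, H, W): S slices, C channels.
    """
    def __init__(self, channels: int):
        super().__init__()
        # A single linear scorer keeps trainable parameters minimal.
        self.score = nn.Linear(channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-slice descriptor via global average pooling: (S, C).
        desc = x.mean(dim=(2, 3))
        # Attention over the slice axis: (S, 1).
        weights = torch.softmax(self.score(desc), dim=0)
        # Broadcast slice weights over channels and spatial dims.
        return x * weights.view(-1, 1, 1, 1)

feats = torch.randn(24, 64, 32, 32)   # 24 slices of 64-channel features
out = SliceAttention(64)(feats)       # same shape, slice-reweighted
```

The single linear scorer is chosen here to mirror the abstract's "minimal trainable parameters" claim; the real module is richer.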
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features.
MVSC has two key components: a Volume Context module that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
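A minimal sketch of what patch-wise slice fusion could look like, assuming slice features of shape (S, C, H, W); the function name and the norm-based salience are illustrative assumptions, not the MVSC components, which are additionally text-guided.

```python
# Hypothetical patch-wise slice fusion, an assumption for illustration only.
import torch

def fuse_slices(feats: torch.Tensor) -> torch.Tensor:
    """Collapse (S, C, H, W) slice features into one (C, H, W) map
    with a per-location softmax over the slice axis."""
    # Per-slice, per-location salience: (S, 1, H, W).
    salience = feats.norm(dim=1, keepdim=True)
    weights = torch.softmax(salience, dim=0)
    return (feats * weights).sum(dim=0)
```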
arXiv Detail & Related papers (2026-01-29T13:05:46Z)
- Learning from spatially inhomogenous data: resolution-adaptive convolutions for multiple sclerosis lesion segmentation [32.93762295714261]
In MRI, differences between vendors, hospitals, and sequences can yield highly inhomogeneous imaging data.
For clinical applications, algorithms must be trained to handle data with various voxel resolutions.
We present a network architecture designed to learn directly from spatially heterogeneous data, without resampling.
arXiv Detail & Related papers (2025-03-26T14:07:52Z)
- Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation [68.60747298865394]
We propose a new cross-dimensional SSL framework based on a pseudo-3D transformation (CDSSL-P3D).
Specifically, we introduce an image transformation based on the im2col algorithm, which converts 2D images into a format consistent with 3D data.
This transformation enables seamless integration of 2D and 3D data, and facilitates cross-dimensional self-supervised learning for 3D medical image analysis.
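The im2col idea is concrete enough to sketch: unfold each k x k neighborhood of a 2D image and treat the patch positions as a synthetic depth axis. The helper below is an assumed rendering of that transform, not the CDSSL-P3D code.

```python
# Hypothetical im2col-style 2D -> pseudo-3D transform (details assumed).
import torch
import torch.nn.functional as F

def pseudo_3d(img: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Turn a 2D image (1, C, H, W) into a pseudo-3D volume (1, C, k*k, H, W)
    by unfolding k x k neighborhoods and stacking them along a depth axis."""
    _, c, h, w = img.shape
    # im2col: each column holds one k x k neighborhood -> (1, C*k*k, H*W).
    cols = F.unfold(img, kernel_size=k, padding=k // 2)
    # Reinterpret the k*k patch positions as a synthetic depth dimension.
    return cols.view(1, c, k * k, h, w)

vol = pseudo_3d(torch.randn(1, 1, 64, 64))  # shape (1, 1, 9, 64, 64)
```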
arXiv Detail & Related papers (2024-06-03T02:57:25Z)
- A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention [13.895277069418045]
We introduce CSA-Net, a flexible 2.5D segmentation model capable of processing 2.5D images with an arbitrary number of slices.
We evaluate CSA-Net on three 2.5D segmentation tasks: (1) brain MRI segmentation, (2) binary prostate MRI segmentation, and (3) multi-class prostate MRI segmentation.
arXiv Detail & Related papers (2024-04-30T18:28:09Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
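A hedged sketch of the general 3D-adapter recipe: a small bottleneck module with a depth-axis convolution, added residually so the frozen 2D weights stay intact. The exact layer choices here are assumptions, not MA-SAM's implementation.

```python
# Hypothetical 3D adapter sketch, an assumed recipe rather than MA-SAM's code.
import torch
import torch.nn as nn

class Adapter3D(nn.Module):
    """Bottleneck adapter that mixes information along the slice/depth axis.

    Expects features reshaped to (B, C, D, H, W); only the adapter trains,
    the frozen 2D backbone weights stay untouched.
    """
    def __init__(self, channels: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Conv3d(channels, bottleneck, kernel_size=1)
        # Depthwise conv along depth lets a 2D backbone see the third dimension.
        self.depth = nn.Conv3d(bottleneck, bottleneck, kernel_size=(3, 1, 1),
                               padding=(1, 0, 0), groups=bottleneck)
        self.up = nn.Conv3d(bottleneck, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update keeps the pre-trained features usable from the start.
        return x + self.up(self.act(self.depth(self.down(x))))

tokens = torch.randn(1, 256, 12, 16, 16)  # (B, C, D, H, W) encoder features
y = Adapter3D(256)(tokens)                # same shape, depth-mixed
```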
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- SAM3D: Segment Anything Model in Volumetric Medical Images [11.764867415789901]
We introduce SAM3D, an innovative adaptation tailored for 3D volumetric medical image analysis.
Unlike current SAM-based methods that segment volumetric data by converting the volume into separate 2D slices for individual analysis, our SAM3D model processes the entire 3D volume in a unified manner.
arXiv Detail & Related papers (2023-09-07T06:05:28Z)
- Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering [31.52291149830299]
Self-supervised learning (SSL) can overcome the lack of labeled training samples by learning feature representations from unlabeled data.
Most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes.
We propose a novel framework for unsupervised joint learning on 2D and 3D data modalities.
arXiv Detail & Related papers (2022-12-04T18:57:44Z)
- 3-Dimensional Deep Learning with Spatial Erasing for Unsupervised Anomaly Segmentation in Brain MRI [55.97060983868787]
We investigate whether increased spatial context, obtained by using MRI volumes combined with spatial erasing, leads to improved unsupervised anomaly segmentation performance.
We compare 2D variational autoencoders (VAEs) to their 3D counterparts, propose 3D input erasing, and systematically study the impact of the dataset size on performance.
Our best performing 3D VAE with input erasing leads to an average DICE score of 31.40% compared to 25.76% for the 2D VAE.
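As a sketch of what 3D input erasing might look like in practice, the helper below zeroes a random cuboid inside a volume; the size heuristics are assumptions, not the paper's exact augmentation.

```python
# Hypothetical random 3D input erasing; the cuboid sizing is an assumption.
import torch

def erase_3d(vol: torch.Tensor, max_frac: float = 0.25) -> torch.Tensor:
    """Zero out a random cuboid in a (D, H, W) volume as augmentation."""
    d, h, w = vol.shape
    # Sample an erased extent up to max_frac of each dimension.
    sd, sh, sw = (max(1, int(s * max_frac * torch.rand(1).item()))
                  for s in (d, h, w))
    z = torch.randint(0, d - sd + 1, (1,)).item()
    y = torch.randint(0, h - sh + 1, (1,)).item()
    x = torch.randint(0, w - sw + 1, (1,)).item()
    out = vol.clone()
    out[z:z + sd, y:y + sh, x:x + sw] = 0.0
    return out
```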
arXiv Detail & Related papers (2021-09-14T09:17:27Z)
- PAENet: A Progressive Attention-Enhanced Network for 3D to 2D Retinal Vessel Segmentation [0.0]
3D-to-2D retinal vessel segmentation is a challenging problem in Optical Coherence Tomography Angiography (OCTA) images.
We propose a Progressive Attention-Enhanced Network (PAENet) based on attention mechanisms to extract rich feature representations.
Our proposed algorithm achieves state-of-the-art performance compared with previous methods.
arXiv Detail & Related papers (2021-08-26T10:27:25Z)
- TSGCNet: Discriminative Geometric Feature Learning with Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
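The two-stream idea can be caricatured with per-point streams, one per geometric attribute, fused before classification; the real TSGCNet uses graph convolutions rather than the plain MLPs below, so treat this purely as a structural sketch.

```python
# Hypothetical two-stream sketch; TSGCNet's actual graph layers are replaced
# here by simple per-point MLPs for illustration only.
import torch
import torch.nn as nn

class TwoStream(nn.Module):
    def __init__(self, classes: int = 8):
        super().__init__()
        self.coord = nn.Sequential(nn.Linear(3, 64), nn.ReLU())   # xyz stream
        self.normal = nn.Sequential(nn.Linear(3, 64), nn.ReLU())  # normal stream
        self.head = nn.Linear(128, classes)

    def forward(self, xyz: torch.Tensor, nrm: torch.Tensor) -> torch.Tensor:
        # Fuse the two attribute streams before per-point classification.
        return self.head(torch.cat([self.coord(xyz), self.normal(nrm)], dim=-1))

logits = TwoStream()(torch.randn(1, 1024, 3), torch.randn(1, 1024, 3))
```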
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
- Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z)
- Uniformizing Techniques to Process CT scans with 3D CNNs for Tuberculosis Prediction [5.270882613122642]
A common approach to medical image analysis on volumetric data uses deep 2D convolutional neural networks (CNNs).
However, processing the individual slices independently in 2D CNNs deliberately discards the depth information, which results in poor performance on the intended task.
We evaluate a set of volume uniformizing methods to address the aforementioned issues.
We report a 73% area under the curve (AUC) and a binary classification accuracy (ACC) of 67.5% on the test set, beating all methods that leveraged only image information.
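One common uniformizing option, resampling every scan to a fixed slice count, can be sketched as follows; the summary does not say which variant performed best, so take this as an assumed example rather than the paper's chosen method.

```python
# Hypothetical depth-uniformizing helper; the target slice count is assumed.
import torch
import torch.nn.functional as F

def uniformize_depth(vol: torch.Tensor, target_slices: int = 64) -> torch.Tensor:
    """Resample a (D, H, W) volume to (target_slices, H, W) via trilinear
    interpolation so a fixed-input 3D CNN can consume any scan."""
    v = vol[None, None]  # add batch and channel dims: (1, 1, D, H, W)
    v = F.interpolate(v, size=(target_slices, *vol.shape[1:]),
                      mode="trilinear", align_corners=False)
    return v[0, 0]
```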
arXiv Detail & Related papers (2020-07-26T21:53:47Z)
- 2.75D: Boosting learning by representing 3D Medical imaging to 2D features for small data [54.223614679807994]
3D convolutional neural networks (CNNs) have started to show superior performance to 2D CNNs in numerous deep learning tasks.
Applying transfer learning to 3D CNNs is challenging due to a lack of publicly available pre-trained 3D models.
In this work, we propose a novel strategic 2D representation of volumetric data, namely 2.75D.
As a result, 2D CNNs can also be used to learn volumetric information.
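The summary does not spell out the 2.75D representation itself, so the sketch below shows only the broader 2D-reuse idea, stacking three orthogonal mid-planes as channels for a 2D CNN; it is a generic stand-in, not the paper's 2.75D scheme.

```python
# Generic orthogonal-midplane stand-in; NOT the paper's 2.75D representation.
import torch

def orthogonal_midplanes(vol: torch.Tensor) -> torch.Tensor:
    """Map a cubic (D, H, W) volume to a 3-channel 2D image (3, D, D)."""
    d, h, w = vol.shape
    assert d == h == w, "sketch assumes an isotropic cube"
    # Axial, coronal, and sagittal central slices stacked as channels.
    return torch.stack([vol[d // 2], vol[:, h // 2], vol[:, :, w // 2]])
```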
arXiv Detail & Related papers (2020-02-11T08:24:19Z)