Advancing Deformable Medical Image Registration with Multi-axis Cross-covariance Attention
- URL: http://arxiv.org/abs/2412.18545v1
- Date: Tue, 24 Dec 2024 16:52:21 GMT
- Title: Advancing Deformable Medical Image Registration with Multi-axis Cross-covariance Attention
- Authors: Mingyuan Meng, Michael Fulham, Lei Bi, Jinman Kim,
- Abstract summary: We propose an XCA-based transformer block optimized for deformable medical image registration, named Multi-Axis XCA (MAXCA)
MAXCA can capture both global and local long-range dependency among high-resolution image features by applying regional and dilated XCA in parallel via a multi-axis design.
- Score: 9.062010807451388
- License:
- Abstract: Deformable image registration is a fundamental requirement for medical image analysis. Recently, transformers have been widely used in deep learning-based registration methods for their ability to capture long-range dependency via self-attention (SA). However, the high computation and memory loads of SA (growing quadratically with the spatial resolution) hinder transformers from processing subtle textural information in high-resolution image features, e.g., at the full and half image resolutions. This limits deformable registration as the high-resolution textural information is crucial for finding precise pixel-wise correspondence between subtle anatomical structures. Cross-covariance Attention (XCA), as a "transposed" version of SA that operates across feature channels, has complexity growing linearly with the spatial resolution, providing the feasibility of capturing long-range dependency among high-resolution image features. However, existing XCA-based transformers merely capture coarse global long-range dependency, which are unsuitable for deformable image registration relying primarily on fine-grained local correspondence. In this study, we propose to improve existing deep learning-based registration methods by embedding a new XCA mechanism. To this end, we design an XCA-based transformer block optimized for deformable medical image registration, named Multi-Axis XCA (MAXCA). Our MAXCA serves as a general network block that can be embedded into various registration network architectures. It can capture both global and local long-range dependency among high-resolution image features by applying regional and dilated XCA in parallel via a multi-axis design. Extensive experiments on two well-benchmarked inter-/intra-patient registration tasks with seven public medical datasets demonstrate that our MAXCA block enables state-of-the-art registration performance.
Related papers
- TransResNet: Integrating the Strengths of ViTs and CNNs for High Resolution Medical Image Segmentation via Feature Grafting [6.987177704136503]
High-resolution images are preferable in medical imaging domain as they significantly improve the diagnostic capability of the underlying method.
Most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images.
We propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently.
arXiv Detail & Related papers (2024-10-01T18:22:34Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Scale-aware Super-resolution Network with Dual Affinity Learning for
Lesion Segmentation from Medical Images [50.76668288066681]
We present a scale-aware super-resolution network to adaptively segment lesions of various sizes from low-resolution medical images.
Our proposed network achieved consistent improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-30T14:25:55Z) - Deformable Cross-Attention Transformer for Medical Image Registration [11.498623409184225]
We propose a novel mechanism that computes windowed attention using deformable windows.
The proposed model was extensively evaluated on multi-modal, mono-modal, and atlas-to-patient registration tasks.
arXiv Detail & Related papers (2023-03-10T19:22:01Z) - Affinity Feature Strengthening for Accurate, Complete and Robust Vessel
Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z) - TransAttUnet: Multi-level Attention-guided U-Net with Transformer for
Medical Image Segmentation [33.45471457058221]
This paper proposes a novel Transformer based medical image semantic segmentation framework called TransAttUnet.
In particular, we establish additional multi-scale skip connections between decoder blocks to aggregate the different semantic-scale upsampling features.
Our method consistently outperforms the state-of-the-art baselines.
arXiv Detail & Related papers (2021-07-12T09:17:06Z) - XCiT: Cross-Covariance Image Transformers [73.33400159139708]
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens.
The resulting cross-covariance attention (XCA) has linear complexity in the number of tokens, and allows efficient processing of high-resolution images.
arXiv Detail & Related papers (2021-06-17T17:33:35Z) - LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution
Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglecting the explicit formulation of correspondences between them, which leads to degraded accuracy in cross-resolution challenges.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.