Vector Field Attention for Deformable Image Registration
- URL: http://arxiv.org/abs/2407.10209v1
- Date: Sun, 14 Jul 2024 14:06:58 GMT
- Title: Vector Field Attention for Deformable Image Registration
- Authors: Yihao Liu, Junyu Chen, Lianrui Zuo, Aaron Carass, Jerry L. Prince
- Abstract summary: Deformable image registration establishes non-linear spatial correspondences between fixed and moving images.
Most existing deep learning-based methods require neural networks to encode location information in their feature maps.
We present Vector Field Attention (VFA), a novel framework that enhances the efficiency of the existing network design by enabling direct retrieval of location correspondences.
- Score: 9.852055065890479
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deformable image registration establishes non-linear spatial correspondences between fixed and moving images. Deep learning-based deformable registration methods have been widely studied in recent years due to their speed advantage over traditional algorithms as well as their better accuracy. Most existing deep learning-based methods require neural networks to encode location information in their feature maps and predict displacement or deformation fields through convolutional or fully connected layers from these high-dimensional feature maps. In this work, we present Vector Field Attention (VFA), a novel framework that enhances the efficiency of the existing network design by enabling direct retrieval of location correspondences. VFA uses neural networks to extract multi-resolution feature maps from the fixed and moving images and then retrieves pixel-level correspondences based on feature similarity. The retrieval is achieved with a novel attention module without the need for learnable parameters. VFA is trained end-to-end in either a supervised or unsupervised manner. We evaluated VFA for intra- and inter-modality registration and for unsupervised and semi-supervised registration using public datasets, and we also evaluated it on the Learn2Reg challenge. Experimental results demonstrate the superior performance of VFA compared to existing methods. The source code of VFA is publicly available at https://github.com/yihao6/vfa/.
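The retrieval step described in the abstract can be illustrated with a small sketch. The code below is not taken from the VFA repository; it is a minimal single-resolution 2D illustration under stated assumptions: features are compared by dot product, candidates are restricted to a square window of integer offsets, and the attention values are the offsets themselves. The function name `vector_field_attention` and the parameters `radius` and `temperature` are hypothetical names chosen for this example.

```python
import numpy as np

def vector_field_attention(fixed_feat, moving_feat, radius=1, temperature=1.0):
    """Parameter-free attention sketch (illustrative, not the authors' code).

    fixed_feat, moving_feat: arrays of shape (C, H, W).
    Returns a displacement field of shape (2, H, W) holding (dy, dx) per pixel.
    """
    _, H, W = fixed_feat.shape
    # Candidate displacements: every integer offset in a (2r+1) x (2r+1) window.
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    # Edge-pad the moving features so every offset can be sampled.
    padded = np.pad(moving_feat,
                    ((0, 0), (radius, radius), (radius, radius)),
                    mode="edge")
    scores = np.empty((len(offsets), H, W))
    for k, (dy, dx) in enumerate(offsets):
        # shifted[:, y, x] equals moving_feat[:, y + dy, x + dx] (clamped at edges).
        shifted = padded[:, radius + dy:radius + dy + H,
                         radius + dx:radius + dx + W]
        # Query = fixed feature, key = shifted moving feature.
        scores[k] = (fixed_feat * shifted).sum(axis=0) / temperature
    # Numerically stable softmax over the candidate axis.
    scores -= scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)
    # Values are the fixed candidate offsets, so nothing here is learned.
    values = np.array(offsets, dtype=float)  # shape (K, 2)
    return np.einsum("khw,kc->chw", weights, values)
```

Calling the function on feature maps produced by any encoder returns a soft, sub-pixel displacement per location. A lower `temperature` makes the result approach the single best-matching offset, while a higher one blends neighbouring candidates.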
Related papers
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- Self-Supervised Place Recognition by Refining Temporal and Featural Pseudo Labels from Panoramic Data [16.540900776820084]
We propose a novel framework named TF-VPR that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods.
Our method outperforms self-supervised baselines in recall rate, robustness, and heading diversity.
arXiv Detail & Related papers (2022-08-19T12:59:46Z)
- Non-iterative Coarse-to-fine Registration based on Single-pass Deep Cumulative Learning [11.795108660250843]
We propose a Non-Iterative Coarse-to-finE registration network (NICE-Net) for deformable image registration.
NICE-Net can outperform state-of-the-art iterative deep registration methods while only requiring similar runtime to non-iterative methods.
arXiv Detail & Related papers (2022-06-25T08:34:59Z)
- BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning [88.82371069668147]
BatchFormerV2 is a more general batch Transformer module, which enables exploring sample relationships for dense representation learning.
BatchFormerV2 consistently improves current DETR-based detection methods by over 1.3%.
arXiv Detail & Related papers (2022-04-04T05:53:42Z)
- Affine Medical Image Registration with Coarse-to-Fine Vision Transformer [11.4219428942199]
We present a learning-based algorithm, Coarse-to-Fine Vision Transformer (C2FViT), for 3D affine medical image registration.
Our method is superior to the existing CNNs-based affine registration methods in terms of registration accuracy, robustness and generalizability.
arXiv Detail & Related papers (2022-03-29T03:18:43Z)
- CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z)
- STA-VPR: Spatio-temporal Alignment for Visual Place Recognition [17.212503755962757]
We propose an adaptive dynamic time warping algorithm to align local features from the spatial domain while measuring the distance between two images.
A local matching DTW algorithm is applied to perform image sequence matching based on temporal alignment.
The results show that the proposed method significantly improves on CNN-based methods.
arXiv Detail & Related papers (2021-03-25T03:27:42Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)