Cross-modal Fundus Image Registration under Large FoV Disparity
- URL: http://arxiv.org/abs/2512.12657v1
- Date: Sun, 14 Dec 2025 12:10:37 GMT
- Title: Cross-modal Fundus Image Registration under Large FoV Disparity
- Authors: Hongyang Li, Junyi Tao, Qijie Wei, Ningzhi Yang, Meng Wang, Weihong Yu, Xirong Li
- Abstract summary: Previous work on cross-modal fundus image registration (CMFIR) assumes small cross-modal Field-of-View (FoV) disparity. This paper is targeted at a more challenging scenario with large FoV disparity, where directly applying current methods fails. We propose Crop and Alignment for cross-modal fundus image Registration (CARe), a very simple yet effective method.
- Score: 18.69938492903591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work on cross-modal fundus image registration (CMFIR) assumes small cross-modal Field-of-View (FoV) disparity. By contrast, this paper is targeted at a more challenging scenario with large FoV disparity, where directly applying current methods fails. We propose Crop and Alignment for cross-modal fundus image Registration (CARe), a very simple yet effective method. Specifically, given an OCTA with a smaller FoV as the source image and a wide-field color fundus photograph (wfCFP) as the target image, our Crop operation exploits the physiological structure of the retina to crop from the target image a sub-image whose FoV is roughly aligned with that of the source. This operation allows us to re-purpose previous small-FoV-disparity oriented methods for the subsequent image registration. Moreover, we improve the spatial transformation with a double-fitting based Alignment module that applies the classical RANSAC algorithm and polynomial-based coordinate fitting in sequence. Extensive experiments on a newly developed test set of 60 OCTA-wfCFP pairs verify the viability of CARe for CMFIR.
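The double-fitting Alignment can be pictured as two sequential steps over matched keypoints: RANSAC prunes outlier correspondences, then a polynomial coordinate fit models the remaining warp. Below is a minimal sketch assuming matched keypoint arrays are already available; the function name and the degree-2 polynomial are our illustrative choices, not the authors' implementation.

```python
import cv2
import numpy as np

def double_fit_alignment(src_pts, dst_pts):
    """Sequential double fitting: RANSAC first to reject outlier matches,
    then a polynomial coordinate fit on the surviving inliers.
    src_pts, dst_pts: float32 arrays of shape (N, 2)."""
    # Step 1: classical RANSAC (affine model here) filters the matches.
    _, mask = cv2.estimateAffine2D(src_pts, dst_pts, method=cv2.RANSAC,
                                   ransacReprojThreshold=3.0)
    keep = mask.ravel().astype(bool)
    src, dst = src_pts[keep], dst_pts[keep]

    # Step 2: fit each target coordinate as a degree-2 polynomial of the
    # source coordinates (the degree is our illustrative choice).
    x, y = src[:, 0], src[:, 1]
    A = np.stack([np.ones_like(x), x, y, x * y, x**2, y**2], axis=1)
    coef_x, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)
    coef_y, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)
    return coef_x, coef_y  # evaluate A_new @ coef to warp new points
```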
Related papers
- Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment. This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain. Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
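As a toy illustration (not the paper's operator), a convolution can be made equivariant to 90-degree rotations by averaging over the rotation orbit, so that rotating the input rotates the output accordingly; a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

class Rot90EquivariantConv(nn.Module):
    """Naive rotation-equivariant convolution: run the same conv on the
    four 90-degree rotations and rotate each output back, then average.
    A toy stand-in for equivariant alignment convolutions."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):  # x: (B, C, H, W) with H == W
        outs = [torch.rot90(self.conv(torch.rot90(x, i, (2, 3))), -i, (2, 3))
                for i in range(4)]
        return torch.stack(outs).mean(0)  # average over the rotation orbit
```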
arXiv Detail & Related papers (2025-03-11T11:13:10Z)
- Mind the Gap Between Prototypes and Images in Cross-domain Finetuning [64.97317635355124]
We propose a contrastive prototype-image adaptation (CoPA) to adapt different transformations respectively for prototypes and images.
Experiments on Meta-Dataset demonstrate that CoPA achieves the state-of-the-art performance more efficiently.
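Schematically, this can be read as two separate learnable adapters trained with a contrastive objective that pulls each image toward its class prototype. A minimal sketch; the adapter modules, names, and temperature are placeholders, not the paper's code:

```python
import torch.nn.functional as F

def copa_style_loss(img_feats, proto_feats, labels,
                    img_adapter, proto_adapter, tau=0.07):
    """Contrastive prototype-image loss with separate adapters for the
    two sides. labels: (B,) class index of each image."""
    z = F.normalize(img_adapter(img_feats), dim=-1)      # images: (B, d)
    p = F.normalize(proto_adapter(proto_feats), dim=-1)  # prototypes: (C, d)
    logits = z @ p.t() / tau         # similarity of each image to each prototype
    return F.cross_entropy(logits, labels)  # pull images toward their prototypes
```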
arXiv Detail & Related papers (2024-10-16T11:42:11Z)
- FunOTTA: On-the-Fly Adaptation on Cross-Domain Fundus Image via Stable Test-time Training [40.728092407170756]
We propose a novel Fundus On-the-fly Test-Time Adaptation (FunOTTA) framework that effectively generalizes a fundus image diagnosis model to unseen environments. FunOTTA stands out for its stable adaptation process by performing dynamic disambiguation in the memory bank while minimizing harmful prior knowledge bias. Experiments on cross-domain fundus image benchmarks across two diseases demonstrate the superiority of the overall framework.
arXiv Detail & Related papers (2024-07-05T10:06:55Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
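The batch-level correlation can be sketched with standard multi-head self-attention over the per-image descriptors, treating the whole batch as one sequence; the descriptor dimension and head count below are illustrative, not the paper's configuration:

```python
import torch.nn as nn

class CrossImageCorrelation(nn.Module):
    """Correlate the descriptors of all images in a batch with
    multi-head self-attention (a sketch of the cross-image idea)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):       # feats: (B, dim), one descriptor per image
        x = feats.unsqueeze(0)      # treat the batch as one sequence: (1, B, dim)
        out, _ = self.attn(x, x, x) # every image attends to every other image
        return (x + out).squeeze(0) # residual connection, back to (B, dim)
```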
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Novel OCT mosaicking pipeline with Feature- and Pixel-based registration [8.22581088888652]
We propose a versatile pipeline for stitching multi-view OCT/OCTA en face projection images.
Our method combines the strengths of learning-based feature matching and robust pixel-based registration to align multiple images effectively.
The efficacy of our pipeline is validated using an in-house dataset and a large public dataset.
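A generic two-stage stand-in for this feature-plus-pixel combination, using classical SIFT matching in place of the learned matcher and OpenCV's ECC for the pixel-based refinement; assumes uint8 grayscale inputs:

```python
import cv2
import numpy as np

def register_pair(src, dst):
    """Two-stage registration: feature matching for a coarse warp,
    then intensity-based ECC refinement. src, dst: uint8 grayscale."""
    # Stage 1: classical feature matching (the paper uses a learned matcher).
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(src, None)
    k2, d2 = sift.detectAndCompute(dst, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
    p_src = np.float32([k1[m.queryIdx].pt for m in matches])
    p_dst = np.float32([k2[m.trainIdx].pt for m in matches])
    # ECC expects a warp mapping template (dst) coords to input (src) coords.
    warp, _ = cv2.estimateAffine2D(p_dst, p_src, method=cv2.RANSAC)

    # Stage 2: pixel-based ECC refinement, seeded by the feature warp.
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
    _, warp = cv2.findTransformECC(dst, src, warp.astype(np.float32),
                                   cv2.MOTION_AFFINE, criteria)
    # Align with: cv2.warpAffine(src, warp, dst.shape[::-1],
    #                            flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
    return warp
```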
arXiv Detail & Related papers (2023-11-21T23:25:04Z)
- Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration [67.23451452670282]
Misalignments between multi-modality images pose challenges in image fusion.
We propose a Cross-modality Multi-scale Progressive Dense Registration scheme.
This scheme accomplishes the coarse-to-fine registration exclusively using a one-stage optimization.
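The coarse-to-fine composition can be sketched as upsampling and summing residual flows across scales in a single pass; the learned modules that predict each residual are elided, and the dyadic scale assumption is ours:

```python
import torch.nn.functional as F

def compose_flows(residual_flows):
    """Compose per-scale residual flows coarse-to-fine in one pass
    (a schematic of progressive dense registration). residual_flows:
    list of (B, 2, h, w) tensors, coarsest first, each twice the
    previous spatial size."""
    flow = residual_flows[0]                    # coarsest estimate
    for res in residual_flows[1:]:              # progressively finer scales
        flow = 2.0 * F.interpolate(flow, scale_factor=2,
                                   mode='bilinear', align_corners=False)
        flow = flow + res                       # refine with the residual
    return flow
```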
arXiv Detail & Related papers (2023-08-22T03:46:24Z)
- Supervised Domain Adaptation for Recognizing Retinal Diseases from Wide-Field Fundus Images [23.503104144297684]
This paper addresses the emerging task of recognizing multiple retinal diseases from wide-field (WF) and ultra-wide-field (UWF) fundus images.
We propose a supervised domain adaptation method named Cross-domain Collaborative Learning (CdCL).
Inspired by the success of fixed-ratio based mixup in unsupervised domain adaptation, we re-purpose this strategy for the current task.
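A fixed-ratio mixup, as referenced here, simply blends source- and target-domain samples with a constant coefficient; a one-line sketch (the coefficient below is illustrative, and labels can be blended the same way):

```python
def fixed_ratio_mixup(x_src, x_tgt, lam=0.7):
    """Blend source and target images at a fixed, non-random ratio."""
    return lam * x_src + (1.0 - lam) * x_tgt
```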
arXiv Detail & Related papers (2023-05-14T05:57:11Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by the observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
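Loosely, position-dependent FoV distortion can be reduced by reprojecting the perspective image onto a spherical surface. The sketch below uses the standard spherical warp and is our approximation, not the paper's exact transform:

```python
import numpy as np
import cv2

def spherical_reproject(img, focal):
    """Reproject a perspective image onto a spherical surface so that
    object appearance depends less on image position (loose sketch)."""
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    # For each output (spherical) pixel, find the perspective pixel it samples.
    u, v = np.meshgrid(np.arange(w, dtype=np.float32),
                       np.arange(h, dtype=np.float32))
    lon = (u - cx) / focal                        # longitude angle
    lat = (v - cy) / focal                        # latitude angle
    map_x = focal * np.tan(lon) + cx
    map_y = focal * np.tan(lat) / np.cos(lon) + cy
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
```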
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping receptive fields (RF) for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
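The restrictive receptive field mentioned above can be sketched as a patch-embedding convolution whose kernel equals its stride, so each token sees only its own patch; the sizes below are illustrative, not the paper's configuration:

```python
import torch.nn as nn

class RestrictiveTokenizer(nn.Module):
    """Token embedding with a small, non-overlapping receptive field:
    kernel_size == stride, so tokens never see neighboring pixels."""
    def __init__(self, in_ch=3, dim=256, patch=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):                    # x: (B, 3, H, W)
        t = self.proj(x)                     # (B, dim, H/patch, W/patch)
        return t.flatten(2).transpose(1, 2)  # token sequence: (B, N, dim)
```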
arXiv Detail & Related papers (2021-04-02T01:42:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.