Cross-modal Fundus Image Registration under Large FoV Disparity
- URL: http://arxiv.org/abs/2512.12657v1
- Date: Sun, 14 Dec 2025 12:10:37 GMT
- Title: Cross-modal Fundus Image Registration under Large FoV Disparity
- Authors: Hongyang Li, Junyi Tao, Qijie Wei, Ningzhi Yang, Meng Wang, Weihong Yu, Xirong Li
- Abstract summary: Previous work on cross-modal fundus image registration (CMFIR) assumes small cross-modal Field-of-View (FoV) disparity. This paper is targeted at a more challenging scenario with large FoV disparity, where directly applying current methods fails. We propose Crop and Alignment for cross-modal fundus image Registration (CARe), a very simple yet effective method.
- Score: 18.69938492903591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work on cross-modal fundus image registration (CMFIR) assumes small cross-modal Field-of-View (FoV) disparity. By contrast, this paper is targeted at a more challenging scenario with large FoV disparity, where directly applying current methods fails. We propose Crop and Alignment for cross-modal fundus image Registration (CARe), a very simple yet effective method. Specifically, given an OCTA with a smaller FoV as the source image and a wide-field color fundus photograph (wfCFP) as the target image, our Crop operation exploits the physiological structure of the retina to crop from the target image a sub-image whose FoV is roughly aligned with that of the source. This operation allows us to re-purpose previous small-FoV-disparity oriented methods for the subsequent image registration. Moreover, we improve the spatial transformation with a double-fitting based Alignment module that applies the classical RANSAC algorithm and polynomial-based coordinate fitting in sequence. Extensive experiments on a newly developed test set of 60 OCTA-wfCFP pairs verify the viability of CARe for CMFIR.
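The double-fitting Alignment can be pictured as two sequential steps over matched keypoints: RANSAC prunes outlier correspondences, then a polynomial coordinate fit models the remaining warp. Below is a minimal sketch assuming matched keypoint arrays are already available; the function name and the degree-2 polynomial are our illustrative choices, not the authors' implementation.

```python
import cv2
import numpy as np

def double_fit_alignment(src_pts, dst_pts):
    """Sequential double fitting: RANSAC first to reject outlier matches,
    then a polynomial coordinate fit on the surviving inliers.
    src_pts, dst_pts: float32 arrays of shape (N, 2)."""
    # Step 1: classical RANSAC (affine model here) filters the matches.
    _, mask = cv2.estimateAffine2D(src_pts, dst_pts, method=cv2.RANSAC,
                                   ransacReprojThreshold=3.0)
    keep = mask.ravel().astype(bool)
    src, dst = src_pts[keep], dst_pts[keep]

    # Step 2: fit each target coordinate as a degree-2 polynomial of the
    # source coordinates (the degree is our illustrative choice).
    x, y = src[:, 0], src[:, 1]
    A = np.stack([np.ones_like(x), x, y, x * y, x**2, y**2], axis=1)
    coef_x, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)
    coef_y, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)
    return coef_x, coef_y  # evaluate A_new @ coef to warp new points
```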
Related papers
- Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment. This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain. Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
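As a toy illustration (not the paper's operator), a convolution can be made equivariant to 90-degree rotations by averaging over the rotation orbit, so that rotating the input rotates the output accordingly; a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

class Rot90EquivariantConv(nn.Module):
    """Naive rotation-equivariant convolution: run the same conv on the
    four 90-degree rotations and rotate each output back, then average.
    A toy stand-in for equivariant alignment convolutions."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):  # x: (B, C, H, W) with H == W
        outs = [torch.rot90(self.conv(torch.rot90(x, i, (2, 3))), -i, (2, 3))
                for i in range(4)]
        return torch.stack(outs).mean(0)  # average over the rotation orbit
```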
arXiv Detail & Related papers (2025-03-11T11:13:10Z)
- Mind the Gap Between Prototypes and Images in Cross-domain Finetuning [64.97317635355124]
We propose a contrastive prototype-image adaptation (CoPA) to adapt different transformations respectively for prototypes and images.
Experiments on Meta-Dataset demonstrate that CoPA achieves the state-of-the-art performance more efficiently.
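Schematically, this can be read as two separate learnable adapters trained with a contrastive objective that pulls each image toward its class prototype. A minimal sketch; the adapter modules, names, and temperature are placeholders, not the paper's code:

```python
import torch.nn.functional as F

def copa_style_loss(img_feats, proto_feats, labels,
                    img_adapter, proto_adapter, tau=0.07):
    """Contrastive prototype-image loss with separate adapters for the
    two sides. labels: (B,) class index of each image."""
    z = F.normalize(img_adapter(img_feats), dim=-1)      # images: (B, d)
    p = F.normalize(proto_adapter(proto_feats), dim=-1)  # prototypes: (C, d)
    logits = z @ p.t() / tau         # similarity of each image to each prototype
    return F.cross_entropy(logits, labels)  # pull images toward their prototypes
```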
arXiv Detail & Related papers (2024-10-16T11:42:11Z)
- FunOTTA: On-the-Fly Adaptation on Cross-Domain Fundus Image via Stable Test-time Training [40.728092407170756]
We propose a novel Fundus On-the-fly Test-Time Adaptation (FunOTTA) framework that effectively generalizes a fundus image diagnosis model to unseen environments. FunOTTA stands out for its stable adaptation process by performing dynamic disambiguation in the memory bank while minimizing harmful prior knowledge bias. Experiments on cross-domain fundus image benchmarks across two diseases demonstrate the superiority of the overall framework.
arXiv Detail & Related papers (2024-07-05T10:06:55Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
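The batch-level correlation can be sketched with standard multi-head self-attention over the per-image descriptors, treating the whole batch as one sequence; the descriptor dimension and head count below are illustrative, not the paper's configuration:

```python
import torch.nn as nn

class CrossImageCorrelation(nn.Module):
    """Correlate the descriptors of all images in a batch with
    multi-head self-attention (a sketch of the cross-image idea)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):       # feats: (B, dim), one descriptor per image
        x = feats.unsqueeze(0)      # treat the batch as one sequence: (1, B, dim)
        out, _ = self.attn(x, x, x) # every image attends to every other image
        return (x + out).squeeze(0) # residual connection, back to (B, dim)
```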
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Novel OCT mosaicking pipeline with Feature- and Pixel-based registration [8.22581088888652]
We propose a versatile pipeline for stitching multi-view OCT/OCTA en face projection images.
Our method combines the strengths of learning-based feature matching and robust pixel-based registration to align multiple images effectively.
The efficacy of our pipeline is validated using an in-house dataset and a large public dataset.
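A generic two-stage stand-in for this feature-plus-pixel combination, using classical SIFT matching in place of the learned matcher and OpenCV's ECC for the pixel-based refinement; assumes uint8 grayscale inputs:

```python
import cv2
import numpy as np

def register_pair(src, dst):
    """Two-stage registration: feature matching for a coarse warp,
    then intensity-based ECC refinement. src, dst: uint8 grayscale."""
    # Stage 1: classical feature matching (the paper uses a learned matcher).
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(src, None)
    k2, d2 = sift.detectAndCompute(dst, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
    p_src = np.float32([k1[m.queryIdx].pt for m in matches])
    p_dst = np.float32([k2[m.trainIdx].pt for m in matches])
    # ECC expects a warp mapping template (dst) coords to input (src) coords.
    warp, _ = cv2.estimateAffine2D(p_dst, p_src, method=cv2.RANSAC)

    # Stage 2: pixel-based ECC refinement, seeded by the feature warp.
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
    _, warp = cv2.findTransformECC(dst, src, warp.astype(np.float32),
                                   cv2.MOTION_AFFINE, criteria)
    # Align with: cv2.warpAffine(src, warp, dst.shape[::-1],
    #                            flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
    return warp
```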
arXiv Detail & Related papers (2023-11-21T23:25:04Z)
- Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration [67.23451452670282]
Misalignments between multi-modality images pose challenges in image fusion.
We propose a Cross-modality Multi-scale Progressive Dense Registration scheme.
This scheme accomplishes the coarse-to-fine registration exclusively using a one-stage optimization.
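The coarse-to-fine composition can be sketched as upsampling and summing residual flows across scales in a single pass; the learned modules that predict each residual are elided, and the dyadic scale assumption is ours:

```python
import torch.nn.functional as F

def compose_flows(residual_flows):
    """Compose per-scale residual flows coarse-to-fine in one pass
    (a schematic of progressive dense registration). residual_flows:
    list of (B, 2, h, w) tensors, coarsest first, each twice the
    previous spatial size."""
    flow = residual_flows[0]                    # coarsest estimate
    for res in residual_flows[1:]:              # progressively finer scales
        flow = 2.0 * F.interpolate(flow, scale_factor=2,
                                   mode='bilinear', align_corners=False)
        flow = flow + res                       # refine with the residual
    return flow
```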
arXiv Detail & Related papers (2023-08-22T03:46:24Z)
- Supervised Domain Adaptation for Recognizing Retinal Diseases from Wide-Field Fundus Images [23.503104144297684]
This paper addresses the emerging task of recognizing multiple retinal diseases from wide-field (WF) and ultra-wide-field (UWF) fundus images.
We propose a supervised domain adaptation method named Cross-domain Collaborative Learning (CdCL).
Inspired by the success of fixed-ratio based mixup in unsupervised domain adaptation, we re-purpose this strategy for the current task.
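A fixed-ratio mixup, as referenced here, simply blends source- and target-domain samples with a constant coefficient; a one-line sketch (the coefficient below is illustrative, and labels can be blended the same way):

```python
def fixed_ratio_mixup(x_src, x_tgt, lam=0.7):
    """Blend source and target images at a fixed, non-random ratio."""
    return lam * x_src + (1.0 - lam) * x_tgt
```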
arXiv Detail & Related papers (2023-05-14T05:57:11Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by the observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
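Loosely, position-dependent FoV distortion can be reduced by reprojecting the perspective image onto a spherical surface. The sketch below uses the standard spherical warp and is our approximation, not the paper's exact transform:

```python
import numpy as np
import cv2

def spherical_reproject(img, focal):
    """Reproject a perspective image onto a spherical surface so that
    object appearance depends less on image position (loose sketch)."""
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    # For each output (spherical) pixel, find the perspective pixel it samples.
    u, v = np.meshgrid(np.arange(w, dtype=np.float32),
                       np.arange(h, dtype=np.float32))
    lon = (u - cx) / focal                        # longitude angle
    lat = (v - cy) / focal                        # latitude angle
    map_x = focal * np.tan(lon) + cx
    map_y = focal * np.tan(lat) / np.cos(lon) + cy
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
```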
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping receptive fields (RF) for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
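The restrictive receptive field mentioned above can be sketched as a patch-embedding convolution whose kernel equals its stride, so each token sees only its own patch; the sizes below are illustrative, not the paper's configuration:

```python
import torch.nn as nn

class RestrictiveTokenizer(nn.Module):
    """Token embedding with a small, non-overlapping receptive field:
    kernel_size == stride, so tokens never see neighboring pixels."""
    def __init__(self, in_ch=3, dim=256, patch=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):                    # x: (B, 3, H, W)
        t = self.proj(x)                     # (B, dim, H/patch, W/patch)
        return t.flatten(2).transpose(1, 2)  # token sequence: (B, N, dim)
```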
arXiv Detail & Related papers (2021-04-02T01:42:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.