Learning multi-domain feature relation for visible and Long-wave
Infrared image patch matching
- URL: http://arxiv.org/abs/2308.04880v1
- Date: Wed, 9 Aug 2023 11:23:32 GMT
- Title: Learning multi-domain feature relation for visible and Long-wave
Infrared image patch matching
- Authors: Xiuwei Zhang, Yanping Li, Zhaoshuai Qi, Yi Sun, Yanning Zhang
- Abstract summary: We present the largest visible and Long-wave Infrared (LWIR) image patch matching dataset, termed VL-CMIM.
In addition, a multi-domain feature relation learning network (MD-FRN) is proposed.
- Score: 39.88037892637296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, learning-based algorithms have achieved promising performance on
cross-spectral image patch matching, which, however, is still far from
satisfactory for practical applications. On the one hand, the lack of a
large-scale dataset with diverse scenes hampers further improvement of
learning-based algorithms, whose performance and generalization rely heavily on
dataset size and diversity. On the other hand, existing work emphasizes feature
relations in the spatial domain while often ignoring the scale dependency
between features, leading to performance degradation, especially when
cross-spectral patches exhibit significant appearance variations. To
address these issues, we publish, to the best of our knowledge, the largest
visible and Long-wave Infrared (LWIR) image patch matching dataset, termed
VL-CMIM, which contains 1300 pairs of strictly aligned visible and LWIR images
and over 2 million patch pairs covering diverse scenes such as asteroid, field,
country, building, street and water. In addition, a multi-domain feature
relation learning network (MD-FRN) is proposed. Taking as input the features
extracted from a four-branch network, feature relations in both the spatial and
scale domains are learned via a spatial correlation module (SCM) and a
multi-scale adaptive aggregation module (MSAG), respectively. To further
aggregate the multi-domain relations, a deep domain interactive mechanism (DIM)
is applied, in which the learnt spatial-relation and scale-relation features
are exchanged and fed back into the MSAG and SCM. This mechanism allows the
model to learn interactive cross-domain feature relations, improving robustness
to the significant appearance changes caused by different modalities.
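The exchange described above can be sketched at toy scale. The functions below are illustrative stand-ins, not the paper's actual SCM and MSAG implementations: a Gram-matrix correlation plays the role of the spatial correlation module, and strided average pooling plays the role of multi-scale aggregation, with the DIM-style step routing each module's output through the other.

```python
import numpy as np

rng = np.random.default_rng(0)

def scm(feat):
    """Toy spatial-relation module: a Gram matrix over spatial positions,
    used to reweight the input features (hypothetical stand-in for SCM)."""
    rel = feat @ feat.T / feat.shape[0]  # (positions, positions) relation map
    return rel @ feat                    # relation-weighted features

def msag(feat, scales=(1, 2, 4)):
    """Toy scale-relation module: average-pool at several strides, upsample
    back, and average across scales (hypothetical stand-in for MSAG)."""
    outs = []
    for s in scales:
        pooled = feat.reshape(feat.shape[0] // s, s, -1).mean(axis=1)
        outs.append(np.repeat(pooled, s, axis=0))
    return np.mean(outs, axis=0)

# Features from one branch: 8 spatial positions x 4 channels.
feat = rng.standard_normal((8, 4))

spatial_rel = scm(feat)    # spatial-domain relation features
scale_rel = msag(feat)     # scale-domain relation features

# DIM-style exchange: each relation feature passes through the *other* module,
# so spatial and scale information interact before final aggregation.
interactive_spatial = msag(spatial_rel)
interactive_scale = scm(scale_rel)
```

The essential design point is the crossover in the last two lines: rather than fusing the two relation features directly, each is re-processed by the opposite-domain module, which is what the abstract calls interactive cross-domain feature relation learning.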
Related papers
- Frequency-Spatial Entanglement Learning for Camouflaged Object Detection [34.426297468968485]
Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features through complicated designs.
We propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method.
Our experiments demonstrate the superiority of FSEL over 21 state-of-the-art methods through comprehensive quantitative and qualitative comparisons on three widely used datasets.
arXiv Detail & Related papers (2024-09-03T07:58:47Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Improving Anomaly Segmentation with Multi-Granularity Cross-Domain
Alignment [17.086123737443714]
Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.
While existing methods demonstrate noteworthy results on synthetic data, they often fail to consider the disparity between synthetic and real-world data domains.
We introduce the Multi-Granularity Cross-Domain Alignment framework, tailored to harmonize features across domains at both the scene and individual sample levels.
arXiv Detail & Related papers (2023-08-16T22:54:49Z) - Multi-Spectral Image Stitching via Spatial Graph Reasoning [52.27796682972484]
We propose a spatial graph reasoning based multi-spectral image stitching method.
We embed multi-scale complementary features from the same view position into a set of nodes.
By introducing long-range coherence along spatial and channel dimensions, the complementarity of pixel relations and channel interdependencies aids in the reconstruction of aligned multi-view features.
arXiv Detail & Related papers (2023-07-31T15:04:52Z) - Aligning Correlation Information for Domain Adaptation in Action
Recognition [14.586677030468339]
We propose a novel Adversarial Correlation Adaptation Network (ACAN) to align action videos by aligning pixel correlations.
ACAN aims to minimize the discrepancy of correlation information across domains, termed the Pixel Correlation Discrepancy (PCD).
arXiv Detail & Related papers (2021-07-11T00:13:36Z) - Semantic Change Detection with Asymmetric Siamese Networks [71.28665116793138]
Given two aerial images, semantic change detection aims to locate the land-cover variations and identify their change types with pixel-wise boundaries.
This problem is vital in many earth vision related tasks, such as precise urban planning and natural resource management.
We present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures.
arXiv Detail & Related papers (2020-10-12T13:26:30Z) - Cross-Domain Facial Expression Recognition: A Unified Evaluation
Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z) - Learning to Combine: Knowledge Aggregation for Multi-Source Domain
Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize information propagation among semantically adjacent representations.
Our approach outperforms existing methods by a remarkable margin.
arXiv Detail & Related papers (2020-07-17T07:52:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.