UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation
- URL: http://arxiv.org/abs/2404.14241v1
- Date: Mon, 22 Apr 2024 14:53:27 GMT
- Title: UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation
- Authors: Siru Zhong, Xixuan Hao, Yibo Yan, Ying Zhang, Yangqiu Song, Yuxuan Liang,
- Abstract summary: UrbanCross is a new framework for cross-domain satellite image-text retrieval.
It exploits a high-quality, cross-domain dataset enriched with extensive geo-tags from three countries.
Experiments confirm UrbanCross's superior efficiency in retrieval and adaptation to new urban environments.
- Score: 44.14603461930746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Urbanization challenges underscore the necessity for effective satellite image-text retrieval methods to swiftly access specific information enriched with geographic semantics for urban applications. However, existing methods often overlook significant domain gaps across diverse urban landscapes, primarily focusing on enhancing retrieval performance within single domains. To tackle this issue, we present UrbanCross, a new framework for cross-domain satellite image-text retrieval. UrbanCross leverages a high-quality, cross-domain dataset enriched with extensive geo-tags from three countries to highlight domain diversity. It employs the Large Multimodal Model (LMM) for textual refinement and the Segment Anything Model (SAM) for visual augmentation, achieving a fine-grained alignment of images, segments and texts, yielding a 10% improvement in retrieval performance. Additionally, UrbanCross incorporates an adaptive curriculum-based source sampler and a weighted adversarial cross-domain fine-tuning module, progressively enhancing adaptability across various domains. Extensive experiments confirm UrbanCross's superior efficiency in retrieval and adaptation to new urban environments, demonstrating an average performance increase of 15% over its version without domain adaptation mechanisms, effectively bridging the domain gap.
Related papers
- Cross Pseudo Supervision Framework for Sparsely Labelled Geospatial Images [0.0]
Land Use Land Cover (LULC) mapping is a vital tool for urban and resource planning.
This study introduces a semi-supervised segmentation model for LULC prediction using high-resolution satellite images.
We propose a modified Cross Pseudo Supervision framework to train image segmentation models on sparsely labelled data.
arXiv Detail & Related papers (2024-08-05T11:14:23Z) - Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning [50.88504784466931]
Multi-task dense prediction involves semantic segmentation, depth estimation, and surface normal estimation.
Existing solutions typically rely on learning global image representations for global cross-task image matching.
Our proposal involves modeling region-wise representations using Gaussian Distributions.
arXiv Detail & Related papers (2024-03-15T12:41:30Z) - Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for
Cross-City Semantic Segmentation using High-Resolution Domain Adaptation
Networks [82.82866901799565]
We build a new set of multimodal remote sensing benchmark datasets (including hyperspectral, multispectral, SAR) for the study purpose of the cross-city semantic segmentation task.
Beyond the single city, we propose a high-resolution domain adaptation network, HighDAN, to promote the AI model's generalization ability from the multi-city environments.
HighDAN is capable of retaining the spatially topological structure of the studied urban scene well in a parallel high-to-low resolution fusion fashion.
arXiv Detail & Related papers (2023-09-26T23:55:39Z) - Segmenting across places: The need for fair transfer learning with
satellite imagery [24.087993065704527]
State-of-the-art models have better overall accuracy in rural areas compared to urban areas.
We show that raw satellite images are overall more dissimilar between source and target districts for rural than for urban locations.
arXiv Detail & Related papers (2022-04-09T02:14:56Z) - WEDGE: Web-Image Assisted Domain Generalization for Semantic
Segmentation [72.88657378658549]
We propose a WEb-image assisted Domain GEneralization scheme, which is the first to exploit the diversity of web-crawled images for generalizable semantic segmentation.
We also present a method which injects styles of the web-crawled images into training images on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training.
arXiv Detail & Related papers (2021-09-29T05:19:58Z) - Domain Adaptive Person Re-Identification via Coupling Optimization [58.567492812339566]
Domain adaptive person Re-Identification (ReID) is challenging owing to the domain gap and shortage of annotations on target scenarios.
This paper proposes a coupling optimization method including the Domain-Invariant Mapping (DIM) method and the Global-Local distance Optimization ( GLO)
GLO is designed to train the ReID model with unsupervised setting on the target domain.
arXiv Detail & Related papers (2020-11-06T14:01:03Z) - DASGIL: Domain Adaptation for Semantic and Geometric-aware Image-based
Localization [27.294822556484345]
Long-term visual localization under changing environments is a challenging problem in autonomous driving and mobile robotics.
We propose a novel multi-task architecture to fuse the geometric and semantic information into the multi-scale latent embedding representation for visual place recognition.
arXiv Detail & Related papers (2020-10-01T17:44:25Z) - Weakly Supervised Domain Adaptation for Built-up Region Segmentation in
Aerial and Satellite Imagery [3.8508264614798517]
Built-up area estimation is an important component in understanding the human impact on the environment, the effect of public policy, and general urban population analysis.
The diverse nature of aerial and satellite imagery and lack of labeled data covering this diversity makes machine learning algorithms difficult to generalize.
This paper proposes a novel domain adaptation algorithm to handle the challenges posed by the satellite and aerial imagery.
arXiv Detail & Related papers (2020-07-05T10:05:01Z) - CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.