Change Detection Between Optical Remote Sensing Imagery and Map Data via
Segment Anything Model (SAM)
- URL: http://arxiv.org/abs/2401.09019v1
- Date: Wed, 17 Jan 2024 07:30:52 GMT
- Title: Change Detection Between Optical Remote Sensing Imagery and Map Data via
Segment Anything Model (SAM)
- Authors: Hongruixuan Chen and Jian Song and Naoto Yokoya
- Abstract summary: We explore unsupervised multimodal change detection between two key remote sensing data sources: optical high-resolution imagery and OpenStreetMap (OSM) data.
We introduce two strategies for guiding SAM's segmentation process: the 'no-prompt' and 'box/mask prompt' methods.
Experimental results on three datasets show that the proposed approach achieves more competitive results than representative unsupervised methods.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised multimodal change detection is pivotal for time-sensitive tasks
and comprehensive multi-temporal Earth monitoring. In this study, we explore
unsupervised multimodal change detection between two key remote sensing data
sources: optical high-resolution imagery and OpenStreetMap (OSM) data.
Specifically, we propose to utilize the vision foundation model, the Segment
Anything Model (SAM), to address this task. Leveraging SAM's exceptional
zero-shot transfer capability, high-quality segmentation maps of optical images
can be obtained. Thus, we can directly compare these two heterogeneous data
forms in the so-called segmentation domain. We then introduce two strategies
for guiding SAM's segmentation process: the 'no-prompt' and 'box/mask prompt'
methods. The two strategies are designed to detect land-cover changes in
general scenarios and to identify new land-cover objects within existing
backgrounds, respectively. Experimental results on three datasets indicate that
the proposed approach achieves more competitive results than representative
unsupervised multimodal change detection methods.
Related papers
- DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion
Infrared-visible object detection aims to achieve robust, full-day object detection by fusing the complementary information of infrared and visible images.
We propose the Dynamic Adaptive Multispectral Detection Transformer (DAMSDet) to address the challenges of this fusion.
Experiments on four public datasets demonstrate significant improvements over other state-of-the-art methods.
arXiv Detail & Related papers (2024-03-01T07:03:27Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification
We build the first unified multi-modal 3D object detection benchmark, MM-Omni3D, and extend a monocular detector to its multi-modal version.
The designed monocular and multi-modal detectors are named UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- A Dual Attentive Generative Adversarial Network for Remote Sensing Image Change Detection
We propose a dual attentive generative adversarial network (DAGAN) for very high-resolution remote sensing image change detection.
The DAGAN framework outperforms advanced methods on the LEVIR dataset, achieving 85.01% mean IoU and a 91.48% mean F1 score.
arXiv Detail & Related papers (2023-10-03T08:26:27Z)
- Self-supervised Domain-agnostic Domain Adaptation for Satellite Images
We propose a self-supervised domain-agnostic domain adaptation (SS(DA)2) method to perform domain adaptation without a predefined domain definition.
We first design a contrastive generative adversarial loss to train a generative network to perform image-to-image translation between any two satellite image patches.
Then, we improve the generalizability of the downstream models by augmenting the training data with generated images that match different testing spectral characteristics.
arXiv Detail & Related papers (2023-09-20T07:37:23Z)
- Multimodal Across Domains Gaze Target Detection
This paper addresses the gaze target detection problem in single images captured from the third-person perspective.
We present a multimodal deep architecture to infer where a person in a scene is looking.
arXiv Detail & Related papers (2022-08-23T09:09:00Z)
- Supervising Remote Sensing Change Detection Models with 3D Surface Semantics
We propose Contrastive Surface-Image Pretraining (CSIP) for joint learning using optical RGB and above ground level (AGL) map pairs.
We then evaluate these pretrained models on several building segmentation and change detection datasets to show that our method does, in fact, extract features relevant to downstream applications.
arXiv Detail & Related papers (2022-02-26T23:35:43Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction
We introduce a novel neural network framework termed the Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT), which takes both 2D panorama images and 3D point clouds as input and infers target trajectories from the multimodal data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
- Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme.
Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor.
The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z)
- Semantic Change Detection with Asymmetric Siamese Networks
Given two aerial images, semantic change detection aims to locate the land-cover variations and identify their change types with pixel-wise boundaries.
This problem is vital in many Earth-vision tasks, such as precise urban planning and natural resource management.
We present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures.
arXiv Detail & Related papers (2020-10-12T13:26:30Z)