FMARS: Annotating Remote Sensing Images for Disaster Management using Foundation Models
- URL: http://arxiv.org/abs/2405.20109v2
- Date: Thu, 20 Jun 2024 09:12:37 GMT
- Title: FMARS: Annotating Remote Sensing Images for Disaster Management using Foundation Models
- Authors: Edoardo Arnaudo, Jacopo Lungo Vaschetti, Lorenzo Innocenti, Luca Barco, Davide Lisi, Vanina Fissore, Claudio Rossi
- Abstract summary: FMARS (Foundation Model Annotations in Remote Sensing) is a methodology leveraging VHR imagery and foundation models for fast and robust annotation.
We focus on disaster management and provide a large-scale dataset with labels obtained from pre-event imagery over 19 disaster events.
We train segmentation models on the generated labels, using Unsupervised Domain Adaptation (UDA) techniques to increase transferability to real-world scenarios.
- Score: 0.8795040582681392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Very-High Resolution (VHR) remote sensing imagery is increasingly accessible, but often lacks annotations for effective machine learning applications. Recent foundation models like GroundingDINO and Segment Anything (SAM) provide opportunities to automatically generate annotations. This study introduces FMARS (Foundation Model Annotations in Remote Sensing), a methodology leveraging VHR imagery and foundation models for fast and robust annotation. We focus on disaster management and provide a large-scale dataset with labels obtained from pre-event imagery over 19 disaster events, derived from the Maxar Open Data initiative. We train segmentation models on the generated labels, using Unsupervised Domain Adaptation (UDA) techniques to increase transferability to real-world scenarios. Our results demonstrate the effectiveness of leveraging foundation models to automatically annotate remote sensing data at scale, enabling robust downstream models for critical applications. Code and dataset are available at \url{https://github.com/links-ads/igarss-fmars}.
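The annotation step described above can be sketched minimally: once a promptable segmenter such as SAM returns per-instance binary masks, they must be merged into a single semantic label map for training. The helper below is a toy illustration of that merge, not the FMARS implementation:

```python
import numpy as np

def merge_instance_masks(masks, class_ids, shape, background=0):
    """Merge per-instance binary masks (e.g. from SAM) into one
    semantic label map; later masks overwrite earlier ones."""
    label_map = np.full(shape, background, dtype=np.int32)
    for mask, cls in zip(masks, class_ids):
        label_map[mask.astype(bool)] = cls
    return label_map

# Two toy 4x4 instance masks: class 1 (e.g. building), class 2 (e.g. road).
m1 = np.zeros((4, 4)); m1[:2, :2] = 1
m2 = np.zeros((4, 4)); m2[2:, 2:] = 1
labels = merge_instance_masks([m1, m2], [1, 2], (4, 4))
```

Overlap handling (here: last mask wins) is a design choice; a real pipeline might instead rank instances by detector confidence.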
Related papers
- DRAG: Data Reconstruction Attack using Guided Diffusion [20.2532929124365]
We propose a novel data reconstruction attack based on guided diffusion, which leverages the rich prior knowledge embedded in a latent diffusion model (LDM) pre-trained on a large-scale dataset.
Our approach significantly outperforms state-of-the-art methods, both qualitatively and quantitatively, in reconstructing data from deep-layer IRs of the vision foundation model.
arXiv Detail & Related papers (2025-09-15T09:26:19Z)
- Zero-Shot Image Anomaly Detection Using Generative Foundation Models [2.241618130319058]
This research explores the use of score-based generative models as foundational tools for semantic anomaly detection.
By analyzing Stein score errors, we introduce a novel method for identifying anomalous samples without requiring re-training on each target dataset.
Our approach improves over the state of the art and relies on training a single model on one dataset -- CelebA -- which we find to be an effective base distribution.
arXiv Detail & Related papers (2025-07-30T13:56:36Z)
- DUSE: A Data Expansion Framework for Low-resource Automatic Modulation Recognition based on Active Learning [17.651073556023167]
We introduce a data expansion framework called Dynamic Uncertainty-driven Sample Expansion (DUSE).
DUSE uses an uncertainty scoring function to select useful samples from relevant AMR datasets.
Experiments demonstrate that DUSE consistently outperforms 8 coreset selection baselines in both class-balance and class-imbalance settings.
arXiv Detail & Related papers (2025-07-16T08:09:41Z)
- Continuous Visual Autoregressive Generation via Score Maximization [69.67438563485887]
We introduce a Continuous VAR framework that enables direct visual autoregressive generation without vector quantization.
Within this framework, all we need is to select a strictly proper score and set it as the training objective to optimize.
arXiv Detail & Related papers (2025-05-12T17:58:14Z)
- SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation [1.7906674529114799]
This work proposes an Edge-enhanced Labeling Network, called SAM2-ELNet, which incorporates a labeling module and an edge attention mechanism.
Our method uses the Hiera backbone of the pre-trained, self-supervised Segment Anything Model 2 (SAM2) as the encoder.
Results showed that the model trained with enhanced labels performed better and had a lower final loss, indicating closer alignment with the real data distribution.
arXiv Detail & Related papers (2025-03-16T08:11:11Z)
- A Recipe for Improving Remote Sensing VLM Zero Shot Generalization [0.4427533728730559]
We present two novel image-caption datasets for training of remote sensing foundation models.
The first dataset pairs aerial and satellite imagery with captions generated by Gemini using landmarks extracted from Google Maps.
The second dataset utilizes public web images and their corresponding alt-text, filtered for the remote sensing domain.
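A toy illustration of the alt-text domain filtering described above (the actual filter used for the dataset is not specified here; the keyword list is an assumption for illustration only):

```python
# Hypothetical keyword list; the real filtering criteria are not given.
KEYWORDS = ("satellite", "aerial", "overhead", "orthophoto", "remote sensing")

def keep_for_remote_sensing(alt_text):
    """Toy keyword filter for web-image alt-text, sketching the kind of
    domain filtering the dataset construction describes."""
    text = alt_text.lower()
    return any(keyword in text for keyword in KEYWORDS)
```

A production filter would more likely combine such heuristics with a learned classifier or CLIP-style similarity scoring.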
arXiv Detail & Related papers (2025-03-10T21:09:02Z)
- Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models [7.898092154590899]
Salient Object Detection aims to identify and segment prominent regions within a scene.
Traditional models rely on manually annotated pseudo labels with precise pixel-level accuracy.
We develop a low-cost, high-precision annotation method to address the challenges.
arXiv Detail & Related papers (2025-01-08T15:56:21Z)
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
- InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images [11.916941756499435]
In this paper, we explore the intricate task of incremental few-shot object detection in remote sensing images.
We introduce a pioneering fine-tuning-based technique, termed InfRS, designed to facilitate the incremental learning of novel classes.
We develop a prototypical calibration strategy based on the Wasserstein distance to mitigate the catastrophic forgetting problem.
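The calibration strategy above operates on class prototypes; as an illustrative building block (a simplification, not InfRS itself), the 1-D Wasserstein-1 distance between two equal-size empirical samples reduces to the mean absolute difference of their sorted values:

```python
import numpy as np

def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D empirical
    samples: mean absolute difference of the sorted values."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    assert a.shape == b.shape, "sketch assumes equal-size samples"
    return float(np.abs(a - b).mean())

d = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])  # → 1.0
```

For unequal sample sizes or higher dimensions one would use a general formulation such as `scipy.stats.wasserstein_distance` or a Sinkhorn solver.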
arXiv Detail & Related papers (2024-05-18T13:39:50Z)
- Few-shot Online Anomaly Detection and Segmentation [29.693357653538474]
This paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task.
Under the FOADS framework, models are first trained on a few-shot normal dataset, then their capabilities are inspected and improved by leveraging unlabeled streaming data containing both normal and abnormal samples.
In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation.
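The multi-scale embedding idea can be sketched as pooling feature maps from several CNN stages and concatenating the results; the code below assumes channel-first `(C, H, W)` features and is only a simplification of the FOADS pipeline:

```python
import numpy as np

def multiscale_embedding(feature_maps):
    """Concatenate spatially averaged feature maps from several CNN
    stages into one multi-scale descriptor (a simplification of the
    feature-embedding step; real stages come from a pre-trained CNN)."""
    pooled = [f.mean(axis=(1, 2)) for f in feature_maps]  # each (C,)
    return np.concatenate(pooled)

f1 = np.ones((8, 16, 16))   # toy stage-1 features: 8 channels
f2 = np.zeros((16, 8, 8))   # toy stage-2 features: 16 channels
emb = multiscale_embedding([f1, f2])  # shape (24,)
```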
arXiv Detail & Related papers (2024-03-27T02:24:00Z)
- Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models [7.452422412106768]
We propose a novel method named Text2Seg for remote sensing semantic segmentation.
It overcomes the dependency on extensive annotations by employing an automatic prompt generation process.
We show that Text2Seg significantly improves zero-shot prediction performance compared to the vanilla SAM model.
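One common way to chain a text-grounded detector with a promptable segmenter (a sketch of the general pattern, not necessarily the exact Text2Seg prompt generation) is to convert detected boxes into centre-point prompts:

```python
import numpy as np

def boxes_to_point_prompts(boxes):
    """Turn detector boxes (x1, y1, x2, y2), e.g. from a text-grounded
    detector, into centre-point prompts for a promptable segmenter."""
    boxes = np.asarray(boxes, dtype=float)  # shape (N, 4)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    return np.stack([cx, cy], axis=1)       # shape (N, 2)

points = boxes_to_point_prompts([[0, 0, 4, 2]])  # → [[2.0, 1.0]]
```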
arXiv Detail & Related papers (2023-04-20T18:39:41Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may arise from biases in data acquisition rather than from genuine structure.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- SRoUDA: Meta Self-training for Robust Unsupervised Domain Adaptation [25.939292305808934]
Unsupervised domain adaptation (UDA) can transfer knowledge learned from rich-label dataset to unlabeled target dataset.
In this paper, we present a new meta self-training pipeline, named SRoUDA, for improving adversarial robustness of UDA models.
arXiv Detail & Related papers (2022-12-12T14:25:40Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data.
We use these labels to perturb the data by deleting non-causal agents from the scene.
Under non-causal perturbations, we observe a 25-38% relative change in minADE as compared to the original.
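minADE is a standard motion-forecasting metric: the minimum, over K predicted trajectories, of the average displacement to the ground-truth trajectory. A minimal sketch with toy numbers (not the benchmark's data) of the metric and the relative-change calculation:

```python
import numpy as np

def min_ade(preds, gt):
    """minADE: over K predicted trajectories (K, T, 2), the minimum of
    the average L2 displacement to the ground truth (T, 2)."""
    dists = np.linalg.norm(preds - gt[None], axis=-1)  # (K, T)
    return float(dists.mean(axis=1).min())

gt = np.zeros((3, 2))                              # 3-step ground truth
preds = np.stack([np.ones((3, 2)),                 # poor hypothesis
                  np.full((3, 2), 0.5)])           # better hypothesis
base = min_ade(preds, gt)                          # sqrt(0.5) ≈ 0.707
perturbed = base * 1.3                             # toy: metric 30% worse
rel_change = (perturbed - base) / base             # ≈ 0.30
```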
arXiv Detail & Related papers (2022-07-07T21:28:23Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
By exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery, this is done in a completely label-free manner.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much it can help address domain shifts if we further have a few target samples labeled.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
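Computing a per-class prototype from a handful of labeled landmarks can be sketched as a simple class-wise mean of feature vectors (a simplification of the PA module, not its full implementation):

```python
import numpy as np

def class_prototypes(features, labels):
    """Average the (few) labeled landmark features per class to get
    one prototype vector per class, as in prototypical alignment."""
    return {int(c): features[labels == c].mean(axis=0)
            for c in np.unique(labels)}

feats = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]])  # toy features
labs = np.array([0, 0, 1])                              # toy labels
protos = class_prototypes(feats, labs)
```

Target features can then be pulled toward (or classified against) these prototypes, e.g. via a distance-based loss.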
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.