Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
- URL: http://arxiv.org/abs/2405.13779v1
- Date: Wed, 22 May 2024 16:07:05 GMT
- Title: Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
- Authors: Tarun Kalluri, Jihyeon Lee, Kihyuk Sohn, Sahil Singla, Manmohan Chandraker, Joseph Xu, Jeremiah Liu,
- Abstract summary: We leverage emerging text-to-image generative models in creating large-scale synthetic supervision for the task of damage assessment from aerial images.
We build an efficient and easily scalable pipeline to generate thousands of post-disaster images from low-resource domains.
We validate the strength of our proposed framework under a cross-geography domain transfer setting from xBD and SKAI images in both single-source and multi-source settings.
- Score: 66.49494950674402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple and efficient method to leverage emerging text-to-image generative models in creating large-scale synthetic supervision for the task of damage assessment from aerial images. While significant recent advances have resulted in improved techniques for damage assessment using aerial or satellite imagery, they still suffer from poor robustness to domains where manually labeled data is unavailable, directly impacting post-disaster humanitarian assistance in such under-resourced geographies. Our contribution towards improving domain robustness in this scenario is two-fold. Firstly, we leverage the text-guided mask-based image editing capabilities of generative models and build an efficient and easily scalable pipeline to generate thousands of post-disaster images from low-resource domains. Secondly, we propose a simple two-stage training approach to train robust models while using manual supervision from different source domains along with the generated synthetic target domain data. We validate the strength of our proposed framework under a cross-geography domain transfer setting from xBD and SKAI images in both single-source and multi-source settings, achieving significant improvements over a source-only baseline in each case.
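The two-stage training recipe described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: all names (`two_stage_training`, `train_epoch`) and the simple dataset concatenation in stage two are assumptions.

```python
# Sketch of the two-stage training described in the abstract.
# Stage 1: supervised training on labeled source-domain data only.
# Stage 2: continue training on source data mixed with synthetic
# target-domain images produced by a text-to-image editing pipeline.
# All function/parameter names here are hypothetical placeholders.

def two_stage_training(model, source_labeled, synthetic_target, train_epoch,
                       stage1_epochs=10, stage2_epochs=5):
    for _ in range(stage1_epochs):
        train_epoch(model, source_labeled)          # source supervision only
    mixed = source_labeled + synthetic_target       # naive mixing sketch
    for _ in range(stage2_epochs):
        train_epoch(model, mixed)                   # source + synthetic target
    return model
```

In practice the mixing ratio, sampling strategy, and loss weighting for synthetic data would matter; none of those details are specified here.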
Related papers
- Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images [0.8562182926816566]
This is a solution to the semantic segmentation problem in both real-world and synthetic images from a vehicle's forward-facing camera.
We concentrate on building a robust model which performs well across a variety of outdoor domains.
This paper studies the effectiveness of employing real-world and synthetic data to handle domain adaptation in the semantic segmentation problem.
arXiv Detail & Related papers (2024-07-07T17:28:45Z)
- Source-Free Online Domain Adaptive Semantic Segmentation of Satellite Images under Image Degradation [20.758637391023345]
We address source-free and online domain adaptation, i.e., test-time adaptation (TTA) for satellite images.
We propose a novel TTA approach involving two effective strategies.
First, we progressively estimate the global Batch Normalization statistics of the target distribution from the incoming data stream.
Second, we enhance prediction quality by refining the predicted masks using global class centers.
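Progressively estimating Batch Normalization statistics from a data stream is typically done with an exponential moving average. The sketch below illustrates that idea in NumPy; the momentum value and the EMA form are assumptions, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch: progressively estimating global BatchNorm statistics
# of a target distribution from an incoming stream of batches. The EMA
# update and momentum=0.1 are assumptions for illustration.

class RunningBNStats:
    def __init__(self, num_features, momentum=0.1):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.momentum = momentum

    def update(self, batch):
        # batch: (N, C) activations from the incoming data stream
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        self.mean = (1 - self.momentum) * self.mean + self.momentum * batch_mean
        self.var = (1 - self.momentum) * self.var + self.momentum * batch_var

    def normalize(self, batch, eps=1e-5):
        # Normalize with the current running estimates, as BN would at test time.
        return (batch - self.mean) / np.sqrt(self.var + eps)
```

As more target batches arrive, the running estimates drift from the source statistics toward the target distribution, which is the core of this style of test-time adaptation.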
arXiv Detail & Related papers (2024-01-04T07:49:32Z)
- Enhancing Visual Domain Adaptation with Source Preparation [5.287588907230967]
Domain Adaptation techniques fail to consider the characteristics of the source domain itself.
We propose Source Preparation (SP), a method to mitigate source domain biases.
We show that SP enhances UDA across a range of visual domains, with improvements up to 40.64% in mIoU over baseline.
arXiv Detail & Related papers (2023-06-16T18:56:44Z)
- Source-Free Domain Adaptation for Real-world Image Dehazing [10.26945164141663]
We present a novel Source-Free Unsupervised Domain Adaptation (SFUDA) image dehazing paradigm.
We devise the Domain Representation Normalization (DRN) module to make the representation of real hazy domain features match that of the synthetic domain.
With our plug-and-play DRN module, unlabeled real hazy images can be used to adapt existing well-trained source networks.
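A common closed-form way to make one domain's feature statistics match another's is per-channel whitening followed by re-coloring. The sketch below illustrates that idea; the paper's DRN module is a learned network, so this closed-form version is an assumption for illustration only.

```python
import numpy as np

# Illustrative sketch: shifting real-domain feature statistics toward the
# synthetic domain's statistics via per-channel standardization and
# re-coloring. This is NOT the paper's learned DRN module.

def match_statistics(real_feats, syn_mean, syn_std, eps=1e-5):
    """real_feats: (N, C) features from real hazy images.
    Returns features rescaled to the synthetic domain's mean/std."""
    mu = real_feats.mean(axis=0)
    sigma = real_feats.std(axis=0)
    whitened = (real_feats - mu) / (sigma + eps)   # zero mean, unit std
    return whitened * syn_std + syn_mean           # re-color to synthetic stats
```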
arXiv Detail & Related papers (2022-07-14T03:37:25Z)
- Single Image Internal Distribution Measurement Using Non-Local Variational Autoencoder [11.985083962982909]
This paper proposes a novel image-specific solution, namely the non-local variational autoencoder (NLVAE).
NLVAE is introduced as a self-supervised strategy that reconstructs high-resolution images using disentangled information from the non-local neighbourhood.
Experimental results on seven benchmark datasets demonstrate the effectiveness of the NLVAE model.
arXiv Detail & Related papers (2022-04-02T18:43:55Z)
- Domain Adaptation for Underwater Image Enhancement [51.71570701102219]
We propose a novel Two-phase Underwater Domain Adaptation network (TUDA) to minimize the inter-domain and intra-domain gap.
In the first phase, a new dual-alignment network is designed, including a translation part for enhancing the realism of input images, followed by an enhancement part.
In the second phase, we perform an easy-hard classification of real data according to the assessed quality of enhanced images, where a rank-based underwater quality assessment method is embedded.
arXiv Detail & Related papers (2021-08-22T06:38:19Z)
- Towards Unsupervised Sketch-based Image Retrieval [126.77787336692802]
We introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment.
Our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.
arXiv Detail & Related papers (2021-05-18T02:38:22Z)
- Enhancing Photorealism Enhancement [83.88433283714461]
We present an approach to enhancing the realism of synthetic images using a convolutional network.
We analyze scene layout distributions in commonly used datasets and find that they differ in important ways.
We report substantial gains in stability and realism in comparison to recent image-to-image translation methods.
arXiv Detail & Related papers (2021-05-10T19:00:49Z)
- Few-shot Image Generation via Cross-domain Correspondence [98.2263458153041]
Training generative models, such as GANs, on a target domain containing limited examples can easily result in overfitting.
In this work, we seek to utilize a large source domain for pretraining and transfer the diversity information from source to target.
To further reduce overfitting, we present an anchor-based strategy to encourage different levels of realism over different regions in the latent space.
arXiv Detail & Related papers (2021-04-13T17:59:35Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.