Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models
- URL: http://arxiv.org/abs/2508.01380v1
- Date: Sat, 02 Aug 2025 14:22:25 GMT
- Title: Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models
- Authors: Jie Wei, Erika Ardiles-Cruz, Aleksey Panasyuk, Erik Blasch
- Abstract summary: Current deep learning approaches struggle to generalize effectively due to the imbalance of data classes and scarcity of moderate damage examples. We exploit state-of-the-art techniques in vision-language models to fuse imagery with human knowledge understanding. Our experimental results suggest encouraging data generation quality, which demonstrates an improvement in classifying scenes with different levels of structural damage.
- Score: 6.633325784470945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is of crucial importance to assess damages promptly and accurately in humanitarian assistance and disaster response (HADR). Current deep learning approaches struggle to generalize effectively due to the imbalance of data classes, scarcity of moderate damage examples, and human inaccuracy in pixel labeling during HADR situations. To address these limitations, we exploit state-of-the-art techniques in vision-language models (VLMs) to fuse imagery with human knowledge, creating an opportunity to effectively generate a diversified set of image-based damage data. Our initial experimental results suggest encouraging data generation quality, which demonstrates an improvement in classifying scenes with different levels of structural damage to buildings, roads, and infrastructure.
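The abstract does not name the specific VLM or generation pipeline used. As a purely illustrative sketch of "fusing imagery with human knowledge" for data generation, one option is to condition an off-the-shelf text-guided image-to-image diffusion model on a real pre-disaster photo together with an expert-written damage description; the checkpoint, prompt wording, and parameters below are assumptions, not details from the paper.
```python
# Hypothetical sketch: generating a synthetic "moderate damage" sample by
# conditioning a text-guided image-to-image diffusion model on a pre-disaster
# photo plus an expert-written damage description. Model choice, prompt, and
# parameters are illustrative assumptions, not the paper's actual pipeline.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical input file path; any co-registered pre-disaster image would do.
pre_disaster = Image.open("pre_disaster_building.jpg").convert("RGB").resize((512, 512))

# "Human knowledge" enters as a structured text prompt describing the target damage level.
prompt = (
    "aerial photo of the same building after an earthquake, "
    "moderate structural damage: cracked walls, partially collapsed roof, debris"
)

# strength controls how far the output may drift from the source image;
# lower values keep the scene layout, higher values allow more damage to appear.
synthetic = pipe(prompt=prompt, image=pre_disaster, strength=0.6, guidance_scale=7.5).images[0]
synthetic.save("synthetic_moderate_damage.jpg")  # candidate sample for the scarce "moderate" class
```
Generated candidates would still need filtering or expert review before being added to a training set, since the paper reports only initial results on generation quality.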
Related papers
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- Addressing the Pitfalls of Image-Based Structural Health Monitoring: A Focus on False Positives, False Negatives, and Base Rate Bias [0.0]
This study explores the limitations of image-based structural health monitoring (SHM) techniques in detecting structural damage.
The reliability of image-based SHM is impacted by challenges such as false positives, false negatives, and environmental variability.
Strategies for mitigating these limitations are discussed, including hybrid systems that combine multiple data sources.
arXiv Detail & Related papers (2024-10-27T09:15:05Z)
- EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
- Image Prior and Posterior Conditional Probability Representation for Efficient Damage Assessment [51.631659414455825]
It is important to quantify damage assessment for human assistance and disaster response (HADR) applications.
In this paper, an image prior and posterior conditional probability (IP2CP) is developed as an effective computational imaging representation.
The matching pre- and post-disaster images are effectively encoded into one image that is then processed using deep learning approaches to determine the damage levels (a minimal sketch of this pre/post fusion idea appears after this list).
arXiv Detail & Related papers (2023-10-26T22:17:37Z)
- Exploring the Robustness of Human Parsers Towards Common Corruptions [99.89886010550836]
We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
arXiv Detail & Related papers (2023-09-02T13:32:14Z)
- One-class Damage Detector Using Deeper Fully-Convolutional Data Descriptions for Civil Application [0.0]
The one-class damage detection approach has the advantage that normal images can be used to optimize model parameters.
We propose a civil-purpose application for automating one-class damage detection, reproducing a fully convolutional data description (FCDD) as a baseline model.
arXiv Detail & Related papers (2023-03-03T06:27:15Z)
- Multi-view deep learning for reliable post-disaster damage classification [0.0]
This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery.
The proposed model is trained and validated on a reconnaissance visual dataset containing expert-labeled, geotagged images of the inspected buildings following Hurricane Harvey.
arXiv Detail & Related papers (2022-08-06T01:04:13Z)
- Interpretability in Convolutional Neural Networks for Building Damage Classification in Satellite Imagery [0.0]
We use a dataset that includes labeled pre- and post-disaster satellite imagery and train multiple convolutional neural networks (CNNs) to assess building damage on a per-building basis.
Our research seeks to computationally contribute to aiding in this ongoing and growing humanitarian crisis, heightened by anthropogenic climate change.
arXiv Detail & Related papers (2022-01-24T16:55:56Z)
- Assessing out-of-domain generalization for robust building damage detection [78.6363825307044]
Building damage detection can be automated by applying computer vision techniques to satellite imagery.
Models must be robust to a shift in distribution between disaster imagery available for training and the images of the new event.
We argue that future work should focus on the OOD regime instead.
arXiv Detail & Related papers (2020-11-20T10:30:43Z)
- Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response [5.610924570214424]
We propose new datasets for disaster type detection, informativeness classification, and damage severity assessment.
We benchmark several state-of-the-art deep learning models and achieve promising results.
We release our datasets and models publicly, aiming to provide proper baselines as well as to spur further research in the crisis informatics community.
arXiv Detail & Related papers (2020-11-17T20:15:49Z)
- RescueNet: Joint Building Segmentation and Damage Assessment from Satellite Imagery [83.49145695899388]
RescueNet is a unified model that can simultaneously segment buildings and assess the damage levels to individual buildings and can be trained end-to-end.
RescueNet is tested on the large-scale and diverse xBD dataset and achieves significantly better building segmentation and damage classification performance than previous methods.
arXiv Detail & Related papers (2020-04-15T19:52:09Z)
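Following up on the pre-/post-disaster encoding idea noted in the Image Prior and Posterior Conditional Probability (IP2CP) entry above: a minimal, illustrative sketch of the general fusion pattern is to co-register the two images, stack them along the channel axis, and classify the fused input into damage levels. The network layout and number of damage classes below are assumptions for illustration, not the paper's actual IP2CP encoding.
```python
# Hypothetical sketch of pre-/post-disaster image fusion for damage classification:
# stack the two co-registered images along the channel axis and classify the
# resulting 6-channel input. Architecture and class count are placeholders.
import torch
import torch.nn as nn

class PrePostDamageClassifier(nn.Module):
    def __init__(self, num_damage_levels: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_damage_levels)

    def forward(self, pre: torch.Tensor, post: torch.Tensor) -> torch.Tensor:
        # pre, post: (B, 3, H, W) co-registered pre-/post-disaster image pairs
        x = torch.cat([pre, post], dim=1)   # (B, 6, H, W) fused representation
        x = self.features(x).flatten(1)     # (B, 64)
        return self.head(x)                 # logits over damage levels

# Example usage with random tensors standing in for a matched image pair
model = PrePostDamageClassifier()
pre = torch.randn(2, 3, 256, 256)
post = torch.randn(2, 3, 256, 256)
logits = model(pre, post)  # shape: (2, 4)
```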