MFGDiffusion: Mask-Guided Smoke Synthesis for Enhanced Forest Fire Detection
- URL: http://arxiv.org/abs/2507.11252v1
- Date: Tue, 15 Jul 2025 12:25:35 GMT
- Title: MFGDiffusion: Mask-Guided Smoke Synthesis for Enhanced Forest Fire Detection
- Authors: Guanghao Wu, Chen Xu, Hai Song, Chong Wang, Qixing Zhang,
- Abstract summary: Smoke is the first visible indicator of a wildfire. Current inpainting models exhibit limitations in generating high-quality smoke representations. We propose a comprehensive framework for generating forest fire smoke images.
- Score: 6.307649189539342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smoke is the first visible indicator of a wildfire. With the advancement of deep learning, image-based smoke detection has become a crucial method for detecting and preventing forest fires. However, the scarcity of smoke image data from forest fires is one of the significant factors hindering forest fire smoke detection. Image generation models offer a promising solution for synthesizing realistic smoke images. However, current inpainting models exhibit limitations in generating high-quality smoke representations, particularly manifesting as inconsistencies between synthesized smoke and background contexts. To solve these problems, we propose a comprehensive framework for generating forest fire smoke images. First, we employ a pre-trained segmentation model and a multimodal model to obtain smoke masks and image captions. Then, to address the insufficient utilization of masks and masked images by inpainting models, we introduce a network architecture guided by mask and masked image features. We also propose a new loss function, the mask random difference loss, which enhances the consistency of the generated effects around the mask by randomly expanding and eroding the mask edges. Finally, to generate a smoke image dataset using random masks for subsequent detection tasks, we incorporate smoke characteristics and use a multimodal large language model as a filtering tool to select diverse and reasonable smoke images, thereby improving the quality of the synthetic dataset. Experiments show that our generated smoke images are realistic and diverse, and effectively enhance the performance of forest fire smoke detection models. Code is available at https://github.com/wghr123/MFGDiffusion.
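As a rough illustration of the mask random difference loss described in the abstract, the following is a minimal PyTorch sketch under stated assumptions: the paper's exact formulation is not reproduced here, so the random dilation/erosion via max-pooling, the edge-band weighting, and helper names such as `random_morph` are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def random_morph(mask, max_kernel=9):
    """Randomly dilate or erode a float binary mask of shape (B, 1, H, W)."""
    # Pick an odd kernel size in {3, 5, ..., max_kernel}.
    k = int(torch.randint(1, max_kernel // 2 + 1, (1,)).item()) * 2 + 1
    pad = k // 2
    if torch.rand(1).item() < 0.5:
        return F.max_pool2d(mask, k, stride=1, padding=pad)       # expand (dilate) the mask
    return -F.max_pool2d(-mask, k, stride=1, padding=pad)         # erode the mask

def mask_random_difference_loss(pred, target, mask):
    """Penalize pred/target differences in a randomly perturbed band around the mask edge."""
    perturbed = random_morph(mask)
    edge_band = (perturbed - mask).abs()                          # ring exposed by expansion/erosion
    diff = (pred - target).abs() * edge_band
    return diff.sum() / edge_band.sum().clamp(min=1.0)
```

In training, a term like this would presumably be added to the usual inpainting/diffusion objective with a small weight, e.g. `total = base_loss + lam * mask_random_difference_loss(pred, target, mask)`.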
Related papers
- A transformer boosted UNet for smoke segmentation in complex backgrounds in multispectral LandSat imagery [17.098729939840716]
Smoke presents challenges for detection due to variations in density, color, lighting, and backgrounds such as clouds, haze, or mist.
This paper proposes a new segmentation model, VTrUNet, which includes a virtual band construction module to capture spectral patterns.
arXiv Detail & Related papers (2024-06-18T23:38:24Z) - BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained diffusion model.
Experiments demonstrate BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z) - FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion [4.038140001938416]
We present a training-free, diffusion-based framework designed to generate realistic wildfire images with paired ground truth.
Our framework uses augmented masks, sampled from real wildfire data, and applies Perlin noise to guide the generation of realistic flames.
We evaluate the generated images using normalized Frechet Inception Distance, CLIP Score, and a custom CLIP Confidence metric.
arXiv Detail & Related papers (2024-03-06T04:59:38Z) - DiffGANPaint: Fast Inpainting Using Denoising Diffusion GANs [19.690288425689328]
In this paper, we propose a Denoising Diffusion Probabilistic Model (DDPM)-based approach capable of filling in missing pixels quickly.
Experiments on general-purpose image inpainting datasets verify that our approach performs on par with or better than most contemporary works.
arXiv Detail & Related papers (2023-08-03T17:50:41Z) - Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study of the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z) - Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process (see the sketch after this list).
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z) - Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuned model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z) - MaskSketch: Unpaired Structure-guided Masked Image Generation [56.88038469743742]
MaskSketch is an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling.
We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image.
Our results show that MaskSketch achieves high image realism and fidelity to the guiding structure.
arXiv Detail & Related papers (2023-02-10T20:27:02Z) - Multimodal Wildland Fire Smoke Detection [5.15911752972989]
Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the U.S.
We present our work on integrating multiple data sources in SmokeyNet, a deep learning model using temporal information to detect smoke from wildland fires.
With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.
arXiv Detail & Related papers (2022-12-29T01:16:06Z) - Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z) - Iterative Facial Image Inpainting using Cyclic Reverse Generator [0.913755431537592]
The Cyclic Reverse Generator (CRG) architecture provides an encoder-generator model.
We empirically observed that only a few iterations are sufficient to generate realistic images with the proposed model.
Our method allows applying sketch-based inpainting, using a variety of mask types, and producing multiple, diverse results.
arXiv Detail & Related papers (2021-01-18T12:19:58Z)
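The AutoMAE entry above mentions a Gumbel-Softmax mask generator. Below is a minimal, hypothetical PyTorch sketch of how Gumbel-Softmax can turn per-patch scores into a differentiable binary patch mask; it is not AutoMAE's actual architecture, and the name `PatchMaskGenerator` and the two-class [keep, mask] parameterization are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

class PatchMaskGenerator(torch.nn.Module):
    """Scores image patches and samples a differentiable binary mask via Gumbel-Softmax."""
    def __init__(self, embed_dim=768):
        super().__init__()
        self.scorer = torch.nn.Linear(embed_dim, 2)   # per-patch logits: [keep, mask]

    def forward(self, patch_tokens, tau=1.0):
        # patch_tokens: (B, N, D) patch embeddings
        logits = self.scorer(patch_tokens)                              # (B, N, 2)
        # hard=True yields a discrete 0/1 sample in the forward pass,
        # while gradients flow through the soft relaxation (straight-through).
        sample = F.gumbel_softmax(logits, tau=tau, hard=True, dim=-1)
        return sample[..., 1]                                           # (B, N), 1 = masked patch

# Example usage with dummy ViT-style patch tokens:
# mask = PatchMaskGenerator()(torch.randn(2, 196, 768))
```

Because the mask stays differentiable, a downstream masked-image-modeling loss can back-propagate into the mask generator, which is the general idea the AutoMAE summary refers to.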