FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided
Diffusion
- URL: http://arxiv.org/abs/2403.03463v1
- Date: Wed, 6 Mar 2024 04:59:38 GMT
- Title: FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided
Diffusion
- Authors: Hao Wang, Sayed Pedram Haeri Boroujeni, Xiwen Chen, Ashish Bastola,
Huayu Li, Abolfazl Razi
- Abstract summary: We present a dataset automaton that can generate ground-truth-paired datasets using diffusion models.
We introduce a mask-guided diffusion framework that can fuse wildfire into existing images while precisely controlling the flame position and size.
Our proposed framework can generate a massive dataset of high-quality, ground-truth-paired images, which addresses the need for annotated datasets in specific tasks.
- Score: 4.143919750726851
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rise of machine learning in recent years has brought benefits to various
research fields such as wildfire detection. Nevertheless, small object
detection and rare object detection remain a challenge. To address this
problem, we present a dataset automaton that can generate ground-truth-paired
datasets using diffusion models. Specifically, we introduce a mask-guided
diffusion framework that can fuse wildfire into existing images while
precisely controlling the flame position and size. Moreover, to fill
the gap left by the lack of wildfire image datasets for specific scenarios,
we vary the background of synthesized images by controlling both the text
prompt and the input image. Furthermore, to address the color tint problem and the
well-known domain shift issue, we apply the CLIP model to filter the generated
images and preserve quality. Thus, our proposed framework can generate
a massive dataset of high-quality, ground-truth-paired images,
which addresses the need for annotated datasets in specific tasks.
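The core idea of the abstract, controlling flame position and size through a mask that doubles as the paired ground-truth annotation, plus CLIP-style filtering of the generated set, can be sketched as follows. This is a minimal illustration, not the authors' implementation: `make_flame_mask` and the cosine-similarity filter are hypothetical stand-ins, and a real pipeline would pass the mask to a diffusion inpainting model and score images with an actual CLIP encoder.

```python
import numpy as np

def make_flame_mask(h, w, cx, cy, rx, ry):
    """Binary mask with an elliptical 'flame' region. (cx, cy) sets the
    position and (rx, ry) the size -- the precisely controlled quantities."""
    ys, xs = np.ogrid[:h, :w]
    inside = ((xs - cx) / rx) ** 2 + ((ys - cy) / ry) ** 2 <= 1.0
    return inside.astype(np.uint8)

def clip_style_filter(image_embs, text_emb, threshold=0.25):
    """Keep the indices of images whose cosine similarity to the text
    embedding exceeds a threshold -- a stand-in for CLIP-based filtering."""
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb)
    sims = img @ txt
    return [i for i, s in enumerate(sims) if s >= threshold]

# A 256x256 mask with the flame centered at (180, 200), radii (30, 40).
mask = make_flame_mask(256, 256, cx=180, cy=200, rx=30, ry=40)

# The mask is the paired ground truth: its bounding box gives the
# flame's position and extent for a detection dataset.
ys, xs = np.nonzero(mask)
bbox = (xs.min(), ys.min(), xs.max(), ys.max())
```

Because the mask is constructed before synthesis, every generated image arrives with an exact annotation for free, which is the property the abstract emphasizes for small- and rare-object detection datasets.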
Related papers
- Harmonizing Light and Darkness: A Symphony of Prior-guided Data Synthesis and Adaptive Focus for Nighttime Flare Removal [44.35766203309201]
Intense light sources often produce flares in captured images at night, which deteriorates the visual quality and negatively affects downstream applications.
In order to train an effective flare removal network, a reliable dataset is essential.
We synthesize a prior-guided dataset named Flare7K*, which contains multi-flare images where the brightness of flares adheres to the laws of illumination.
We propose a plug-and-play Adaptive Focus Module (AFM) that can adaptively mask the clean background areas and assist models in focusing on the regions severely affected by flares.
arXiv Detail & Related papers (2024-03-30T10:37:56Z)
- DODA: Diffusion for Object-detection Domain Adaptation in Agriculture [4.549305421261851]
We propose DODA, a data synthesizer that can generate high-quality object detection data for new domains in agriculture.
Specifically, we improve the controllability of layout-to-image through encoding layout as an image, thereby improving the quality of labels.
arXiv Detail & Related papers (2024-03-27T08:16:33Z)
- Scrapping The Web For Early Wildfire Detection [0.0]
Pyro is a web-scraping-based dataset composed of videos of wildfires from a network of cameras.
Our dataset was filtered based on a strategy to improve the quality and diversity of the data, reducing the final data to a set of 10,000 images.
arXiv Detail & Related papers (2024-02-08T02:01:36Z)
- Exposure Bracketing is All You Need for Unifying Image Restoration and Enhancement Tasks [50.822601495422916]
We propose to utilize exposure bracketing photography to unify image restoration and enhancement tasks.
Due to the difficulty in collecting real-world pairs, we suggest a solution that first pre-trains the model with synthetic paired data.
In particular, a temporally modulated recurrent network (TMRNet) and self-supervised adaptation method are proposed.
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Towards Real-World Focus Stacking with Deep Learning [97.34754533628322]
We introduce a new dataset consisting of 94 high-resolution bursts of raw images with focus bracketing.
This dataset is used to train the first deep learning algorithm for focus stacking capable of handling bursts of sufficient length for real-world applications.
arXiv Detail & Related papers (2023-11-29T17:49:33Z)
- RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation [6.128112213696457]
RADiff is a generative approach based on conditional diffusion models trained over an annotated radio dataset.
We show that it is possible to generate fully-synthetic image-annotation pairs to automatically augment any annotated dataset.
arXiv Detail & Related papers (2023-07-05T16:04:44Z)
- DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
arXiv Detail & Related papers (2023-06-26T06:04:09Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity, CLIP alignment score and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- Six-channel Image Representation for Cross-domain Object Detection [17.854940064699985]
Deep learning models are data-driven and the excellent performance is highly dependent on the abundant and diverse datasets.
Some image-to-image translation techniques are employed to generate fake data for specific scenes to train the models.
We propose to combine the original 3-channel images and their corresponding GAN-generated fake images to form 6-channel representations of the dataset.
arXiv Detail & Related papers (2021-01-03T04:50:03Z)
- Real-MFF: A Large Realistic Multi-focus Image Dataset with Ground Truth [58.226535803985804]
We introduce a large and realistic multi-focus dataset called Real-MFF.
The dataset contains 710 pairs of source images with corresponding ground truth images.
We evaluate 10 typical multi-focus algorithms on this dataset for the purpose of illustration.
arXiv Detail & Related papers (2020-03-28T12:33:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.