Stable Rivers: A Case Study in the Application of Text-to-Image
Generative Models for Earth Sciences
- URL: http://arxiv.org/abs/2312.07833v1
- Date: Wed, 13 Dec 2023 01:40:21 GMT
- Title: Stable Rivers: A Case Study in the Application of Text-to-Image
Generative Models for Earth Sciences
- Authors: C Kupferschmidt, A.D. Binns, K.L. Kupferschmidt, and G.W Taylor
- Abstract summary: Text-to-image (TTI) generative models can be used to generate images from a given text-string input.
We evaluated subject-area specific biases in the training data and model performance of Stable Diffusion.
We found that the training data over-represented scenic locations, such as famous rivers and waterfalls, and showed serious under-representation of morphological and environmental terms.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image (TTI) generative models can be used to generate photorealistic
images from a given text-string input. These models offer great potential to
mitigate challenges to the uptake of machine learning in the earth sciences.
However, the rapid increase in their use has raised questions about fairness
and biases, with most research to-date focusing on social and cultural areas
rather than domain-specific considerations. We conducted a case study for the
earth sciences, focusing on the field of fluvial geomorphology, where we
evaluated subject-area specific biases in the training data and downstream
model performance of Stable Diffusion (v1.5). In addition to perpetuating
Western biases, we found that the training data over-represented scenic
locations, such as famous rivers and waterfalls, and showed serious under- and
over-representation of many morphological and environmental terms. Despite
biased training data, we found that with careful prompting, the Stable
Diffusion model was able to generate photorealistic synthetic river images
reproducing many important environmental and morphological characteristics.
Furthermore, conditional control techniques, such as the use of condition maps
with ControlNet were effective for providing additional constraints on output
images. Despite great potential for the use of TTI models in the earth sciences
field, we advocate for caution in sensitive applications, and advocate for
domain-specific reviews of training data and image generation biases to
mitigate perpetuation of existing biases.
Related papers
- SatFlow: Generative model based framework for producing High Resolution Gap Free Remote Sensing Imagery [0.0]
We present SatFlow, a generative model-based framework that fuses low-resolution MODIS imagery and Landsat observations to produce frequent, high-resolution, gap-free surface reflectance imagery.
Our model, trained via Conditional Flow Matching, demonstrates better performance in generating imagery with preserved structural and spectral integrity.
This capability is crucial for downstream applications such as crop phenology tracking, environmental change detection etc.
arXiv Detail & Related papers (2025-02-03T06:40:13Z) - Underwater Image Enhancement using Generative Adversarial Networks: A Survey [1.2582887633807602]
Generative Adversarial Networks (GANs) have emerged as a powerful tool for enhancing underwater photos.
GANs have been applied to real-world applications, including marine biology and ecosystem monitoring, coral reef health assessment, underwater archaeology, and autonomous underwater vehicle (AUV) navigation.
This paper explores all major approaches to underwater image enhancement, from physical and physics-free models to CNN-based models and state-of-the-art GAN-based methods.
arXiv Detail & Related papers (2025-01-10T06:41:19Z) - Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.
Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by $1.5dB$ on PSNR and by $45%$ better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z) - A Comparative Study on Generative Models for High Resolution Solar
Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z) - Trade-offs in Fine-tuned Diffusion Models Between Accuracy and
Interpretability [5.865936619867771]
We unravel a consequential trade-off between image fidelity as gauged by conventional metrics and model interpretability in generative diffusion models.
We present a set of design principles for the development of truly interpretable generative models.
arXiv Detail & Related papers (2023-03-31T09:11:26Z) - Adapting Pretrained Vision-Language Foundational Models to Medical
Imaging Domains [3.8137985834223502]
Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets.
We explore the sub-components of the Stable Diffusion pipeline to fine-tune the model to generate medical images.
Our best-performing model improves upon the stable diffusion baseline and can be conditioned to insert a realistic-looking abnormality on a synthetic radiology image.
arXiv Detail & Related papers (2022-10-09T01:43:08Z) - Potato Crop Stress Identification in Aerial Images using Deep
Learning-based Object Detection [60.83360138070649]
The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks.
The main objective is to demonstrate automated spatial recognition of a healthy versus stressed crop at a plant level.
Experimental validation demonstrated the ability for distinguishing healthy and stressed plants in field images, achieving an average Dice coefficient of 0.74.
arXiv Detail & Related papers (2021-06-14T21:57:40Z) - Generating Physically-Consistent Satellite Imagery for Climate Visualizations [53.61991820941501]
We train a generative adversarial network to create synthetic satellite imagery of future flooding and reforestation events.
A pure deep learning-based model can generate flood visualizations but hallucinates floods at locations that were not susceptible to flooding.
We publish our code and dataset for segmentation guided image-to-image translation in Earth observation.
arXiv Detail & Related papers (2021-04-10T15:00:15Z) - Deep Low-Shot Learning for Biological Image Classification and
Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
labeling training data with precise stages is very time-consuming even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.