Stable Rivers: A Case Study in the Application of Text-to-Image
Generative Models for Earth Sciences
- URL: http://arxiv.org/abs/2312.07833v1
- Date: Wed, 13 Dec 2023 01:40:21 GMT
- Title: Stable Rivers: A Case Study in the Application of Text-to-Image
Generative Models for Earth Sciences
- Authors: C Kupferschmidt, A.D. Binns, K.L. Kupferschmidt, and G.W Taylor
- Abstract summary: Text-to-image (TTI) generative models can be used to generate images from a given text-string input.
We evaluated subject-area specific biases in the training data and model performance of Stable Diffusion.
We found that the training data over-represented scenic locations, such as famous rivers and waterfalls, and showed serious under-representation of morphological and environmental terms.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image (TTI) generative models can be used to generate photorealistic
images from a given text-string input. These models offer great potential to
mitigate challenges to the uptake of machine learning in the earth sciences.
However, the rapid increase in their use has raised questions about fairness
and biases, with most research to-date focusing on social and cultural areas
rather than domain-specific considerations. We conducted a case study for the
earth sciences, focusing on the field of fluvial geomorphology, where we
evaluated subject-area specific biases in the training data and downstream
model performance of Stable Diffusion (v1.5). In addition to perpetuating
Western biases, we found that the training data over-represented scenic
locations, such as famous rivers and waterfalls, and showed serious under- and
over-representation of many morphological and environmental terms. Despite
biased training data, we found that with careful prompting, the Stable
Diffusion model was able to generate photorealistic synthetic river images
reproducing many important environmental and morphological characteristics.
Furthermore, conditional control techniques, such as the use of condition maps
with ControlNet were effective for providing additional constraints on output
images. Despite great potential for the use of TTI models in the earth sciences
field, we advocate for caution in sensitive applications, and advocate for
domain-specific reviews of training data and image generation biases to
mitigate perpetuation of existing biases.
Related papers
- Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion [0.0]
This paper examines three major generative modelling frameworks: Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs) and Stable Diffusion models.
arXiv Detail & Related papers (2024-08-16T13:50:50Z) - YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences.
We analyze how these choices affect both the efficiency of the training process and the quality of the generated images.
We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z) - Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by $1.5dB$ on PSNR and by $45%$ better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z) - Rapid Training Data Creation by Synthesizing Medical Images for
Classification and Localization [10.506584969668792]
We present a method for the transformation of real data to train any Deep Neural Network to solve the above problems.
For the weakly supervised model, we show that the localization accuracy increases significantly using the generated data.
In the latter model, we show that the accuracy, when trained with generated images, closely parallels the accuracy when trained with exhaustively annotated real images.
arXiv Detail & Related papers (2023-08-09T03:49:12Z) - A Comparative Study on Generative Models for High Resolution Solar
Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z) - Trade-offs in Fine-tuned Diffusion Models Between Accuracy and
Interpretability [5.865936619867771]
We unravel a consequential trade-off between image fidelity as gauged by conventional metrics and model interpretability in generative diffusion models.
We present a set of design principles for the development of truly interpretable generative models.
arXiv Detail & Related papers (2023-03-31T09:11:26Z) - Adapting Pretrained Vision-Language Foundational Models to Medical
Imaging Domains [3.8137985834223502]
Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets.
We explore the sub-components of the Stable Diffusion pipeline to fine-tune the model to generate medical images.
Our best-performing model improves upon the stable diffusion baseline and can be conditioned to insert a realistic-looking abnormality on a synthetic radiology image.
arXiv Detail & Related papers (2022-10-09T01:43:08Z) - Potato Crop Stress Identification in Aerial Images using Deep
Learning-based Object Detection [60.83360138070649]
The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks.
The main objective is to demonstrate automated spatial recognition of a healthy versus stressed crop at a plant level.
Experimental validation demonstrated the ability for distinguishing healthy and stressed plants in field images, achieving an average Dice coefficient of 0.74.
arXiv Detail & Related papers (2021-06-14T21:57:40Z) - Generating Physically-Consistent Satellite Imagery for Climate Visualizations [53.61991820941501]
We train a generative adversarial network to create synthetic satellite imagery of future flooding and reforestation events.
A pure deep learning-based model can generate flood visualizations but hallucinates floods at locations that were not susceptible to flooding.
We publish our code and dataset for segmentation guided image-to-image translation in Earth observation.
arXiv Detail & Related papers (2021-04-10T15:00:15Z) - Deep Low-Shot Learning for Biological Image Classification and
Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
labeling training data with precise stages is very time-consuming even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.