Related papers: Stable Rivers: A Case Study in the Application of Text-to-Image Generative Models for Earth Sciences

URL: http://arxiv.org/abs/2312.07833v1
Date: Wed, 13 Dec 2023 01:40:21 GMT
Title: Stable Rivers: A Case Study in the Application of Text-to-Image Generative Models for Earth Sciences
Authors: C. Kupferschmidt, A.D. Binns, K.L. Kupferschmidt, and G.W. Taylor
Abstract summary: Text-to-image (TTI) generative models can be used to generate images from a given text-string input. We evaluated subject-area specific biases in the training data and model performance of Stable Diffusion. We found that the training data over-represented scenic locations, such as famous rivers and waterfalls, and showed serious under- and over-representation of many morphological and environmental terms.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-image (TTI) generative models can be used to generate photorealistic images from a given text-string input. These models offer great potential to mitigate challenges to the uptake of machine learning in the earth sciences. However, the rapid increase in their use has raised questions about fairness and biases, with most research to date focusing on social and cultural areas rather than domain-specific considerations. We conducted a case study for the earth sciences, focusing on the field of fluvial geomorphology, where we evaluated subject-area specific biases in the training data and downstream model performance of Stable Diffusion (v1.5). In addition to perpetuating Western biases, we found that the training data over-represented scenic locations, such as famous rivers and waterfalls, and showed serious under- and over-representation of many morphological and environmental terms. Despite biased training data, we found that with careful prompting, the Stable Diffusion model was able to generate photorealistic synthetic river images reproducing many important environmental and morphological characteristics. Furthermore, conditional control techniques, such as the use of condition maps with ControlNet, were effective for providing additional constraints on output images. Despite great potential for the use of TTI models in the earth sciences field, we advocate for caution in sensitive applications and for domain-specific reviews of training data and image generation biases to mitigate the perpetuation of existing biases.

Related papers

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
On the Generalizability of Foundation Models for Crop Type Mapping [8.346555291145767]
Foundation models pre-trained using self-supervised learning have shown powerful transfer learning capabilities. We investigate the ability of popular EO foundation models to transfer to new geographic regions in the agricultural domain.
arXiv Detail & Related papers (2024-09-14T14:43:57Z)
Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion [0.0]
This paper examines three major generative modelling frameworks: Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs) and Stable Diffusion models.
arXiv Detail & Related papers (2024-08-16T13:50:50Z)
YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences. We analyze how these choices affect both the efficiency of the training process and the quality of the generated images. We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z)
Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes. Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps. Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by 1.5 dB in PSNR and achieving a 45% better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z)
Rapid Training Data Creation by Synthesizing Medical Images for Classification and Localization [10.506584969668792]
We present a method for the transformation of real data to train any Deep Neural Network to solve the above problems. For the weakly supervised model, we show that the localization accuracy increases significantly using the generated data. In the latter model, we show that the accuracy, when trained with generated images, closely parallels the accuracy when trained with exhaustively annotated real images.
arXiv Detail & Related papers (2023-08-09T03:49:12Z)
A Comparative Study on Generative Models for High Resolution Solar Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states. Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z)
Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability [5.865936619867771]
We unravel a consequential trade-off between image fidelity as gauged by conventional metrics and model interpretability in generative diffusion models. We present a set of design principles for the development of truly interpretable generative models.
arXiv Detail & Related papers (2023-03-31T09:11:26Z)
Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains [3.8137985834223502]
Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets. We explore the sub-components of the Stable Diffusion pipeline to fine-tune the model to generate medical images. Our best-performing model improves upon the stable diffusion baseline and can be conditioned to insert a realistic-looking abnormality on a synthetic radiology image.
arXiv Detail & Related papers (2022-10-09T01:43:08Z)
Potato Crop Stress Identification in Aerial Images using Deep Learning-based Object Detection [60.83360138070649]
The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks. The main objective is to demonstrate automated spatial recognition of a healthy versus stressed crop at a plant level. Experimental validation demonstrated the ability to distinguish healthy and stressed plants in field images, achieving an average Dice coefficient of 0.74.
arXiv Detail & Related papers (2021-06-14T21:57:40Z)
Generating Physically-Consistent Satellite Imagery for Climate Visualizations [53.61991820941501]
We train a generative adversarial network to create synthetic satellite imagery of future flooding and reforestation events. A pure deep learning-based model can generate flood visualizations but hallucinates floods at locations that were not susceptible to flooding. We publish our code and dataset for segmentation guided image-to-image translation in Earth observation.
arXiv Detail & Related papers (2021-04-10T15:00:15Z)
Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared. Labeling training data with precise stages is very time-consuming even for biologists. We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.