Modified CycleGAN for the synthesization of samples for wheat head segmentation
- URL: http://arxiv.org/abs/2402.15135v1
- Date: Fri, 23 Feb 2024 06:42:58 GMT
- Title: Modified CycleGAN for the synthesization of samples for wheat head segmentation
- Authors: Jaden Myers, Keyhan Najafian, Farhad Maleki, and Katie Ovens
- Abstract summary: In the absence of an annotated dataset, synthetic data can be used for model development.
We develop a realistic annotated synthetic dataset for wheat head segmentation.
The resulting model achieved a Dice score of 83.4% on an internal dataset and Dice scores of 79.6% and 83.6% on two external Global Wheat Head Detection datasets.
- Score: 0.09999629695552192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have been used for a variety of image processing tasks.
However, most of these models are developed through supervised learning
approaches, which rely heavily on the availability of large-scale annotated
datasets. Developing such datasets is tedious and expensive. In the absence of
an annotated dataset, synthetic data can be used for model development;
however, due to the substantial differences between simulated and real data, a
phenomenon referred to as domain gap, the resulting models often underperform
when applied to real data. In this research, we aim to address this challenge
by first computationally simulating a large-scale annotated dataset and then
using a generative adversarial network (GAN) to fill the gap between simulated
and real images. This approach results in a synthetic dataset that can be
effectively utilized to train a deep-learning model. Using this approach, we
developed a realistic annotated synthetic dataset for wheat head segmentation.
This dataset was then used to develop a deep-learning model for semantic
segmentation. The resulting model achieved a Dice score of 83.4% on an
internal dataset and Dice scores of 79.6% and 83.6% on two external Global
Wheat Head Detection datasets. While we proposed this approach in the context
of wheat head segmentation, it can be generalized to other crop types or, more
broadly, to images with dense, repeated patterns such as those found in
cellular imagery.
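The pipeline the abstract describes has two reusable ingredients: a CycleGAN-style generator that restyles computationally simulated images to look realistic while leaving the simulation's pixel-accurate masks untouched, and the Dice score used for evaluation. The PyTorch sketch below illustrates both under stated assumptions; the generator is an untrained placeholder, and the names (`sim_to_real`, `make_synthetic_training_set`) are hypothetical rather than taken from the authors' code.

```python
import torch
import torch.nn as nn

# Placeholder for the trained CycleGAN generator G: simulated -> realistic.
# In the paper's pipeline such a network is trained adversarially on
# unpaired simulated and real wheat-field images; here it is untrained
# and only stands in for the idea.
sim_to_real = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, padding=3),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 3, kernel_size=7, padding=3),
    nn.Tanh(),
)

def make_synthetic_training_set(simulated_images: torch.Tensor,
                                simulated_masks: torch.Tensor):
    """Restyle simulated images into realistic-looking ones.

    The key property of the approach: the generator only changes image
    appearance, so the pixel-accurate masks produced by the simulation
    remain valid annotations for the translated images.
    """
    with torch.no_grad():
        realistic_images = sim_to_real(simulated_images)
    return realistic_images, simulated_masks

def dice_score(pred: torch.Tensor, target: torch.Tensor,
               eps: float = 1e-7) -> float:
    """Dice coefficient between binary masks, the metric the abstract reports."""
    pred, target = pred.float(), target.float()
    intersection = (pred * target).sum()
    return float((2 * intersection + eps) / (pred.sum() + target.sum() + eps))

# Toy usage with random tensors in place of real data.
sim_imgs = torch.rand(4, 3, 256, 256)           # simulated wheat-field images
sim_masks = (torch.rand(4, 1, 256, 256) > 0.5)  # per-pixel wheat-head labels
train_imgs, train_masks = make_synthetic_training_set(sim_imgs, sim_masks)
print(dice_score(sim_masks, sim_masks))         # 1.0: identical masks
```

In the actual pipeline the generator would be trained with the usual adversarial and cycle-consistency losses on unpaired simulated and real images; the point of the design is that annotations come for free from the simulator.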
Related papers
- Diffusion Models as Data Mining Tools [87.77999285241219]
This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining.
We show that after finetuning conditional diffusion models to synthesize images from a specific dataset, we can use these models to define a typicality measure.
This measure assesses how typical visual elements are for different data labels, such as geographic location, time stamps, semantic labels, or even the presence of a disease (one plausible way to compute such a measure is sketched after this list).
arXiv Detail & Related papers (2024-07-20T17:14:31Z)
- Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model generating large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
arXiv Detail & Related papers (2024-06-25T02:59:02Z)
- Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation [9.812476193015488]
We propose a method of data generation in simulation using 3D synthetic environments and CycleGAN domain transfer.
We compare this method of data generation to the popular NYUDepth V2 dataset by training a depth estimation model based on the DenseDepth structure using different training sets of real and simulated data.
We evaluate the performance of the models on newly collected images and LiDAR depth data from a Husky robot to verify the generalizability of the approach and show that GAN-transformed data can serve as an effective alternative to real-world data, particularly in depth estimation.
arXiv Detail & Related papers (2024-05-02T09:21:10Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation [6.128112213696457]
RADiff is a generative approach based on conditional diffusion models trained over an annotated radio dataset.
We show that it is possible to generate fully-synthetic image-annotation pairs to automatically augment any annotated dataset.
arXiv Detail & Related papers (2023-07-05T16:04:44Z)
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data is slightly underperforming compared to a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- The Big Data Myth: Using Diffusion Models for Dataset Generation to Train Deep Detection Models [0.15469452301122172]
This study presents a framework for the generation of synthetic datasets by fine-tuning stable diffusion models.
The results of this study reveal that the object detection models trained on synthetic data perform similarly to the baseline model.
arXiv Detail & Related papers (2023-06-16T10:48:52Z)
- Data from Model: Extracting Data from Non-robust and Robust Models [83.60161052867534]
This work explores the reverse process of generating data from a model, attempting to reveal the relationship between the data and the model.
We repeat the process of Data to Model (DtM) and Data from Model (DfM) in sequence and explore the loss of feature mapping information.
Our results show that the accuracy drop is limited even after multiple sequences of DtM and DfM, especially for robust models.
arXiv Detail & Related papers (2020-07-13T05:27:48Z)
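As flagged above, the typicality measure in Diffusion Models as Data Mining Tools can plausibly be computed by asking how much a label improves the diffusion model's denoising of an image. The sketch below is a hedged reading of that summary, not the paper's released code: `eps_model` is a hypothetical stand-in for the finetuned conditional network, the noise schedule is fake, and the paper's exact estimator may differ.

```python
import torch

def eps_model(x_t, t, cond):
    """Hypothetical placeholder for a finetuned conditional diffusion
    model that predicts the noise added to x_t. cond=None plays the
    role of the unconditional (null) embedding."""
    return torch.zeros_like(x_t)

def typicality(x0, cond, alphas_cumprod, timesteps, n_samples=4):
    """Monte Carlo estimate of typicality: how much better the model
    denoises x0 when given the label `cond` than without it."""
    total = 0.0
    for t in timesteps:
        a = alphas_cumprod[t]
        for _ in range(n_samples):
            noise = torch.randn_like(x0)
            # Standard forward-diffusion corruption of x0 at step t.
            x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise
            err_uncond = (eps_model(x_t, t, None) - noise).pow(2).mean()
            err_cond = (eps_model(x_t, t, cond) - noise).pow(2).mean()
            total += (err_uncond - err_cond).item()
    return total / (len(timesteps) * n_samples)

# Toy usage: a fake noise schedule and a random "image".
alphas_cumprod = torch.linspace(0.9999, 0.01, 1000)
x0 = torch.rand(3, 64, 64)
print(typicality(x0, cond="Paris", alphas_cumprod=alphas_cumprod,
                 timesteps=[100, 500, 900]))
```

Under this reading, a large positive score means the label explains the image well, so ranking image patches by this score surfaces the elements most typical of each label.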