Exploring the Effectiveness of Dataset Synthesis: An application of
Apple Detection in Orchards
- URL: http://arxiv.org/abs/2306.11763v1
- Date: Tue, 20 Jun 2023 09:46:01 GMT
- Title: Exploring the Effectiveness of Dataset Synthesis: An application of
Apple Detection in Orchards
- Authors: Alexander van Meekeren, Maya Aghaei, Klaas Dijkstra
- Abstract summary: We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data is slightly underperforming compared to a baseline model trained on real-world images.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep object detection models have achieved notable successes in recent years,
but one major obstacle remains: the requirement for a large amount of training
data. Obtaining such data is a tedious and time-consuming process, which has
prompted the exploration of new research avenues such as synthetic data
generation techniques. In this study, we explore the usability of Stable
Diffusion 2.1-base for generating synthetic datasets of apple trees for object
detection and compare it to a baseline model trained on real-world data. After
creating a dataset of realistic apple trees with prompt engineering and
utilizing a previously trained Stable Diffusion model, the custom dataset was
annotated and evaluated by training a YOLOv5m object detection model to predict
apples in a real-world apple detection dataset. YOLOv5m was chosen for its
rapid inference time and minimal hardware demands. Results demonstrate that the
model trained on generated data is slightly underperforming compared to a
baseline model trained on real-world images when evaluated on a set of
real-world images. However, these findings remain promising, as the
differences in average precision are only 0.09 and 0.06. Qualitative
results indicate that the model can accurately predict the location of apples,
except in cases of heavy shading. These findings illustrate the potential of
synthetic data generation techniques as a viable alternative to the collection
of extensive training data for object detection models.
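The pipeline described above hinges on prompt engineering to obtain varied, realistic apple-tree images. As a minimal sketch of how such prompt variants could be enumerated (the subjects, details, and lighting terms below are illustrative assumptions, not the paper's actual prompts):

```python
from itertools import product

# Hypothetical prompt components for varied apple-tree images; the
# actual prompts used in the paper are not given in the abstract.
subjects = ["an apple tree in an orchard", "a row of apple trees"]
details = ["with ripe red apples", "with green apples among dense leaves"]
conditions = ["in bright daylight", "under an overcast sky"]

def build_prompts():
    """Enumerate every combination of subject, detail, and lighting."""
    return [f"{s}, {d}, {c}, photorealistic"
            for s, d, c in product(subjects, details, conditions)]

prompts = build_prompts()
print(len(prompts))  # 2 * 2 * 2 = 8 prompt variants
```

Each prompt would then be passed to a text-to-image pipeline (e.g. the `diffusers` `StableDiffusionPipeline` loaded with `stabilityai/stable-diffusion-2-1-base`) to synthesize candidate training images, which still require annotation before training the YOLOv5m detector.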
Related papers
- Forte : Finding Outliers with Representation Typicality Estimation [0.14061979259370275]
Generative models can now produce synthetic data which is virtually indistinguishable from the real data used to train it.
Recent work on OOD detection has raised doubts that generative model likelihoods are optimal OOD detectors.
We introduce a novel approach that leverages representation learning, and informative summary statistics based on manifold estimation.
arXiv Detail & Related papers (2024-10-02T08:26:37Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
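The core idea of gradient projection can be sketched in a few lines: the update direction is projected onto the subspace orthogonal to a gradient computed on the retained data, so that the step (to first order) does not disturb retained knowledge. The sketch below illustrates this general technique, not the paper's exact PGU algorithm:

```python
import numpy as np

def project_out(g, r):
    """Remove from update direction g its component along the
    retained-data gradient r, leaving a step orthogonal to r."""
    r_norm_sq = np.dot(r, r)
    if r_norm_sq == 0.0:
        return g  # nothing to project against
    return g - (np.dot(g, r) / r_norm_sq) * r

# Toy example: forget-set gradient g, retained-set gradient r.
g = np.array([1.0, 2.0, 3.0])
r = np.array([0.0, 1.0, 0.0])
g_proj = project_out(g, r)
print(g_proj)             # component along r is removed
print(np.dot(g_proj, r))  # ~0: orthogonal to the retained gradient
```

In practice the retained gradients span a subspace rather than a single direction, so methods of this family project against a basis of that subspace; the single-vector case above shows the mechanism.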
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Foundation Models for Generalist Geospatial Artificial Intelligence [3.7002058945990415]
This paper introduces a first-of-its-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive data.
We have utilized this framework to create Prithvi, a transformer-based foundational model pre-trained on more than 1TB of multispectral satellite imagery.
arXiv Detail & Related papers (2023-10-28T10:19:55Z)
- The Big Data Myth: Using Diffusion Models for Dataset Generation to Train Deep Detection Models [0.15469452301122172]
This study presents a framework for the generation of synthetic datasets by fine-tuning stable diffusion models.
The results of this study reveal that the object detection models trained on synthetic data perform similarly to the baseline model.
arXiv Detail & Related papers (2023-06-16T10:48:52Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that the weights trained on synthetic data are robust against accumulated error perturbations under the regularization towards the flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- EARLIN: Early Out-of-Distribution Detection for Resource-efficient Collaborative Inference [4.826988182025783]
Collaborative inference enables resource-constrained edge devices to make inferences by uploading inputs to a server.
While this setup works cost-effectively for successful inferences, it severely underperforms when the model faces input samples on which it was not trained.
We propose a novel lightweight OOD detection approach that mines important features from the shallow layers of a pretrained CNN model.
arXiv Detail & Related papers (2021-06-25T18:43:23Z)
- Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.