A Comparative Study of 3D Model Acquisition Methods for Synthetic Data Generation of Agricultural Products
- URL: http://arxiv.org/abs/2601.03784v1
- Date: Wed, 07 Jan 2026 10:34:26 GMT
- Title: A Comparative Study of 3D Model Acquisition Methods for Synthetic Data Generation of Agricultural Products
- Authors: Steven Moonen, Rob Salaets, Kenneth Batstone, Abdellatif Bey-Temsamani, Nick Michiels
- Abstract summary: In the manufacturing industry, computer vision systems based on artificial intelligence (AI) are widely used to reduce costs and increase production. Training these AI models requires a large amount of training data that is costly to acquire and annotate. A popular approach to reduce the need for real data is the use of synthetic data that is generated by leveraging computer-aided design (CAD) models available in the industry.
- Score: 0.8373057326694192
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the manufacturing industry, computer vision systems based on artificial intelligence (AI) are widely used to reduce costs and increase production. Training these AI models requires a large amount of training data that is costly to acquire and annotate, especially in high-variance, low-volume manufacturing environments. A popular approach to reduce the need for real data is the use of synthetic data that is generated by leveraging computer-aided design (CAD) models available in the industry. However, in the agricultural industry these models are not readily available, increasing the difficulty in leveraging synthetic data. In this paper, we present different techniques for substituting CAD files to create synthetic datasets. We measure their relative performance when used to train an AI object detection model to separate stones and potatoes in a bin picking environment. We demonstrate that highly representative 3D models, acquired by scanning or by image-to-3D approaches, can be used to generate synthetic data for training object detection models. Fine-tuning on a small real dataset can significantly improve the performance of the models, and can even yield similar performance when less representative models are used.
Related papers
- The Impact of Synthetic Data on Object Detection Model Performance: A Comparative Analysis with Real-World Data [1.853053680967785]
This work investigates the impact of synthetic data on the performance of object detection models, compared to models trained on real-world data only. It comprises experiments focused on pallet detection in a warehouse setting, utilizing both real and various synthetic dataset generation strategies.
arXiv Detail & Related papers (2025-10-14T06:59:51Z)
- Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data [53.040873127309766]
We propose a token disentanglement process within the transformer architecture, enhancing feature separation and ensuring more effective learning. Our method outperforms existing models on both in-dataset and cross-dataset evaluations.
arXiv Detail & Related papers (2025-09-08T17:58:06Z)
- Scaling Laws of Synthetic Data for Language Models [125.41600201811417]
We introduce SynthLLM, a scalable framework that transforms pre-training corpora into diverse, high-quality synthetic datasets. Our approach achieves this by automatically extracting and recombining high-level concepts across multiple documents using a graph algorithm.
arXiv Detail & Related papers (2025-03-25T11:07:12Z)
- Little Giants: Synthesizing High-Quality Embedding Data at Scale [71.352883755806]
We introduce SPEED, a framework that aligns open-source small models to efficiently generate large-scale embedding data.
SPEED uses less than 1/10 of the GPT API calls, outperforming the state-of-the-art embedding model E5_mistral when both are trained solely on their synthetic data.
arXiv Detail & Related papers (2024-10-24T10:47:30Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios [0.0]
Object Detection (OD) has proven to be a significant computer vision method in extracting localized class information.
Many of the state-of-the-art (SOTA) OD models perform well on medium and large sized objects, but underperform on small objects.
This study presents a novel approach that injects additional data points to improve the performance of the OD models.
arXiv Detail & Related papers (2024-01-23T13:02:11Z)
- On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets.
We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough.
We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-09-30T16:41:04Z)
- Robust Category-Level 3D Pose Estimation from Synthetic Data [17.247607850702558]
We introduce SyntheticP3D, a new synthetic dataset for object pose estimation generated from CAD models.
We propose a novel approach (CC3D) for training neural mesh models that perform pose estimation via inverse rendering.
arXiv Detail & Related papers (2023-05-25T14:56:03Z)
- Synthetic Image Data for Deep Learning [0.294944680995069]
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification semantic segmentation models.
We show how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle.
arXiv Detail & Related papers (2022-12-12T20:28:13Z)
- Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the powerfulness and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z)
- Synthetic Data and Hierarchical Object Detection in Overhead Imagery [0.0]
We develop novel synthetic data generation and augmentation techniques for enhancing low/zero-sample learning in satellite imagery.
To test the effectiveness of synthetic imagery, we employ it in the training of detection models and our two stage model, and evaluate the resulting models on real satellite images.
arXiv Detail & Related papers (2021-01-29T22:52:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.