A Synthetic Data Pipeline for Supporting Manufacturing SMEs in Visual Assembly Control
- URL: http://arxiv.org/abs/2509.13089v1
- Date: Tue, 16 Sep 2025 13:48:55 GMT
- Title: A Synthetic Data Pipeline for Supporting Manufacturing SMEs in Visual Assembly Control
- Authors: Jonas Werheid, Shengjie He, Aymen Gannouni, Anas Abdelrazeq, Robert H. Schmitt
- Abstract summary: We present a novel approach for easily integrable and data-efficient visual assembly control. Our approach leverages simulated scene generation based on computer-aided design (CAD) data and object detection algorithms. The results demonstrate a time-saving pipeline for generating image data in manufacturing environments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quality control of assembly processes is essential in manufacturing to ensure not only the quality of individual components but also their proper integration into the final product. To assist in this matter, automated assembly control using computer vision methods has been widely implemented. However, the costs associated with image acquisition, annotation, and training of computer vision algorithms pose challenges for integration, especially for small- and medium-sized enterprises (SMEs), which often lack the resources for extensive training, data collection, and manual image annotation. Synthetic data offers the potential to reduce manual data collection and labeling. Nevertheless, its practical application in the context of assembly quality remains limited. In this work, we present a novel approach for easily integrable and data-efficient visual assembly control. Our approach leverages simulated scene generation based on computer-aided design (CAD) data and object detection algorithms. The results demonstrate a time-saving pipeline for generating image data in manufacturing environments, achieving a mean Average Precision (mAP@0.5:0.95) of up to 99.5% for correctly identifying instances of synthetic planetary gear system components within our simulated training data, and up to 93% when transferred to real-world camera-captured testing data. This research highlights the effectiveness of synthetic data generation within an adaptable pipeline and underscores its potential to support SMEs in implementing resource-efficient visual assembly control solutions.
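The mAP@0.5:0.95 figure quoted in the abstract is conventionally read as average precision averaged over ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of that convention only, not the authors' evaluation code; the names `Box`, `IOU_THRESHOLDS`, `iou`, and `mean_over_thresholds` are hypothetical, and the per-threshold AP values are assumed to have been computed elsewhere.

```python
from typing import Sequence

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]

# The ten IoU thresholds 0.50, 0.55, ..., 0.95 implied by "0.5:0.95".
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_over_thresholds(ap_per_threshold: Sequence[float]) -> float:
    """Average one AP value per IoU threshold into a single score.

    Assumes the caller already computed AP at each threshold in
    IOU_THRESHOLDS (and averaged over classes, if there are several).
    """
    if len(ap_per_threshold) != len(IOU_THRESHOLDS):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)
```

A detection counts as a true positive at a given threshold only if its IoU with a ground-truth box of the same class reaches that threshold, so the averaged score rewards tight localization more than mAP@0.5 alone does.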
Related papers
- Data Science and Technology Towards AGI Part I: Tiered Data Management [53.64581824953229]
We argue that the development of artificial intelligence is entering a new phase of data-model co-evolution.
We introduce an L0-L4 tiered data management framework, ranging from raw uncurated resources to organized and verifiable knowledge.
We validate the effectiveness of the proposed framework through empirical studies.
arXiv Detail & Related papers (2026-02-09T18:47:51Z)
- Hybrid Synthetic Data Generation with Domain Randomization Enables Zero-Shot Vision-Based Part Inspection Under Extreme Class Imbalance [3.7696918637188817]
Training robust machine learning models requires large volumes of high-quality labeled data.
Defective samples are intrinsically rare, leading to severe class imbalance that degrades model performance.
Synthetic data generation offers a promising solution by enabling the creation of large, balanced, and fully annotated datasets.
arXiv Detail & Related papers (2025-11-28T05:30:49Z)
- Design and Evaluation of a Scalable Data Pipeline for AI-Driven Air Quality Monitoring in Low-Resource Settings [0.4681310436826459]
This paper presents the design, implementation, and evaluation of the AirQo data pipeline.
It is built using open-source technologies such as Apache Airflow, Apache Kafka, and Google BigQuery.
We demonstrate the pipeline's ability to ingest, transform, and distribute millions of air quality measurements monthly from over 400 monitoring devices.
arXiv Detail & Related papers (2025-08-20T06:19:27Z)
- Provenance Tracking in Large-Scale Machine Learning Systems [0.0]
yProv4ML is a tool designed to collect data in a format compliant with the W3C PROV and ProvML standards.
yProv4ML is fully integrated with the yProv framework, allowing for higher-level pairing in tasks that are also run through workflow management systems.
arXiv Detail & Related papers (2025-07-01T14:10:02Z)
- Advanced Clustering Framework for Semiconductor Image Analytics Integrating Deep TDA with Self-Supervised and Transfer Learning Techniques [1.03121181235382]
This paper introduces an advanced clustering framework that integrates deep Topological Data Analysis (TDA) with self-supervised and transfer learning techniques.
The framework successfully identifies clusters aligned with defect patterns and process variations.
arXiv Detail & Related papers (2025-05-05T17:53:03Z)
- OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis [55.390060529534644]
We propose OS-Genesis, a novel data synthesis pipeline for Graphical User Interface (GUI) agents.
Instead of relying on pre-defined tasks, OS-Genesis enables agents first to perceive environments and perform step-wise interactions.
We demonstrate that training GUI agents with OS-Genesis significantly improves their performance on highly challenging online benchmarks.
arXiv Detail & Related papers (2024-12-27T16:21:58Z)
- Unsupervised Multimodal Fusion of In-process Sensor Data for Advanced Manufacturing Process Monitoring [0.0]
This paper presents a novel approach to multimodal sensor data fusion in manufacturing processes.
We leverage contrastive learning techniques to correlate different data modalities without the need for labeled data.
Our approach facilitates downstream tasks such as process control, anomaly detection, and quality assurance.
arXiv Detail & Related papers (2024-10-29T21:52:04Z)
- BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
- A Systematic Review of Available Datasets in Additive Manufacturing [56.684125592242445]
In-situ monitoring incorporating visual and other sensor technologies allows the collection of extensive datasets during the Additive Manufacturing process.
These datasets have potential for determining the quality of the manufactured output and the detection of defects through the use of Machine Learning.
This systematic review investigates the availability of open image-based datasets originating from AM processes that align with a number of pre-defined selection criteria.
arXiv Detail & Related papers (2024-01-27T16:13:32Z)
- Dataset Factory: A Toolchain For Generative Computer Vision Datasets [0.9013233848500058]
We propose a "dataset factory" that separates the storage and processing of samples from metadata.
This enables data-centric operations at scale for machine learning teams and individual researchers.
arXiv Detail & Related papers (2023-09-20T19:43:37Z)
- STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models [56.27786433792638]
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improve the performance of low-resource event extraction and relation extraction tasks.
arXiv Detail & Related papers (2023-05-24T12:15:19Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.