Related papers: Towards Sim-to-Real Industrial Parts Classification with Synthetic Dataset

URL: http://arxiv.org/abs/2404.08778v1
Date: Fri, 12 Apr 2024 19:04:59 GMT
Title: Towards Sim-to-Real Industrial Parts Classification with Synthetic Dataset
Authors: Xiaomeng Zhu, Talha Bilal, Pär Mårtensson, Lars Hanson, Mårten Björkman, Atsuto Maki
Abstract summary: We introduce a synthetic dataset that may serve as a preliminary testbed for the Sim-to-Real challenge. It contains 17 objects of six industrial use cases, including isolated and assembled parts. All the sample images come with and without random backgrounds and post-processing for evaluating the importance of domain randomization.
Score: 6.481744951262474
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper is about effectively utilizing synthetic data for training deep neural networks for industrial parts classification, in particular, by taking into account the domain gap against real-world images. To this end, we introduce a synthetic dataset that may serve as a preliminary testbed for the Sim-to-Real challenge; it contains 17 objects of six industrial use cases, including isolated and assembled parts. A few subsets of objects exhibit large similarities in shape and albedo for reflecting challenging cases of industrial parts. All the sample images come with and without random backgrounds and post-processing for evaluating the importance of domain randomization. We call it Synthetic Industrial Parts dataset (SIP-17). We study the usefulness of SIP-17 through benchmarking the performance of five state-of-the-art deep network models, supervised and self-supervised, trained only on the synthetic data while testing them on real data. By analyzing the results, we deduce some insights on the feasibility and challenges of using synthetic data for industrial parts classification and for further developing larger-scale synthetic datasets. Our dataset and code are publicly available.

Related papers

Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study [6.172233837488904]
This paper addresses key aspects of domain randomization in generating synthetic data for manufacturing object detection applications. We present a comprehensive data generation pipeline that reflects different factors: object characteristics, background, illumination, camera settings, and post-processing. In our experiments, we present more abundant results and insights into the feasibility as well as challenges of sim-to-real object detection.
arXiv Detail & Related papers (2025-06-09T08:26:19Z)
Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map [50.21082069320818]
We propose a novel diffusion-based pipeline for generating high-fidelity industrial datasets with minimal supervision. Our approach conditions the diffusion model on enriched bounding box representations to produce precise segmentation masks. Results demonstrate that diffusion-based synthesis can bridge the gap between artificial and real-world industrial data.
arXiv Detail & Related papers (2025-05-06T15:21:36Z)
An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval [51.10419281315848]
We conduct an empirical study to explore the potential of synthetic data for Text-Based Person Retrieval (TBPR) research. We propose an inter-class image generation pipeline, in which an automatic prompt construction strategy is introduced. We develop an intra-class image augmentation pipeline, in which the generative AI models are applied to further edit the images.
arXiv Detail & Related papers (2025-03-28T06:18:15Z)
Data-Constrained Synthesis of Training Data for De-Identification [0.0]
We domain-adapt large language models (LLMs) to the clinical domain. We generate synthetic clinical texts that are machine-annotated with tags for personally identifiable information. The synthetic corpora are then used to train synthetic NER models.
arXiv Detail & Related papers (2025-02-20T16:09:27Z)
Exploring the Potential of Synthetic Data to Replace Real Data [16.89582896061033]
We find that the potential of synthetic data to replace real data varies depending on the number of cross-domain real images and the test set on which the trained model is evaluated. We introduce two new metrics, the train2test distance and $\text{AP}_{\text{t2t}}$, to evaluate the ability of a cross-domain training set using synthetic data.
arXiv Detail & Related papers (2024-08-26T18:20:18Z)
Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition [0.2775636978045794]
We study the drift between the performance of models trained on real and synthetic datasets. We conduct studies on the differences between real and synthetic datasets on the attribute set. Interestingly enough, we have verified that while real samples suffice to explain the synthetic distribution, the opposite could not be further from being true.
arXiv Detail & Related papers (2024-04-23T17:10:49Z)
Reliability in Semantic Segmentation: Can We Use Synthetic Data? [69.28268603137546]
We show for the first time how synthetic data can be specifically generated to assess comprehensively the real-world reliability of semantic segmentation models. This synthetic data is employed to evaluate the robustness of pretrained segmenters. We demonstrate how our approach can be utilized to enhance the calibration and OOD detection capabilities of segmenters.
arXiv Detail & Related papers (2023-12-14T18:56:07Z)
Synthetic Data Generation for Bridging Sim2Real Gap in a Production Environment [0.0]
Domain knowledge is vital in bridging the simulation to reality gap in computer vision applications. This paper focuses on synthetic data generation procedures for parts and assemblies used in a production environment.
arXiv Detail & Related papers (2023-11-18T11:15:08Z)
Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models. Ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task. This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z)
Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2). The dataset is composed of both real and synthetic videos from seven gesture classes. We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z)
Synthetic Data for Object Classification in Industrial Applications [53.180678723280145]
In object classification, capturing a large number of images per object and in different conditions is not always possible. This work explores the creation of artificial images using a game engine to cope with limited data in the training dataset.
arXiv Detail & Related papers (2022-12-09T11:43:04Z)
Analysis of Training Object Detection Models with Synthetic Data [0.0]
This paper attempts to provide a holistic overview of how to use synthetic data for object detection. We analyse aspects of generating the data as well as techniques used to train the models. Experiments are validated on real data and benchmarked to models trained on real data.
arXiv Detail & Related papers (2022-11-29T10:21:16Z)
Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space. This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z)
TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets. We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
Segmenting Unseen Industrial Components in a Heavy Clutter Using RGB-D Fusion and Synthetic Data [0.4724825031148411]
Industrial components are texture-less, reflective, and often found in cluttered and unstructured environments. We present a synthetic data generation pipeline that randomizes textures via domain randomization to focus on the shape information. We also propose an RGB-D Fusion Mask R-CNN with a confidence map estimator, which exploits reliable depth information in multiple feature levels.
arXiv Detail & Related papers (2020-02-10T02:33:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.