Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map
- URL: http://arxiv.org/abs/2505.03623v1
- Date: Tue, 06 May 2025 15:21:36 GMT
- Title: Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map
- Authors: Alessandro Simoni, Francesco Pelosin
- Abstract summary: We propose a novel diffusion-based pipeline for generating high-fidelity industrial datasets with minimal supervision. Our approach conditions the diffusion model on enriched bounding box representations to produce precise segmentation masks. Results demonstrate that diffusion-based synthesis can bridge the gap between artificial and real-world industrial data.
- Score: 50.21082069320818
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Synthetic dataset generation in Computer Vision, particularly for industrial applications, is still underexplored. Industrial defect segmentation, for instance, requires highly accurate labels, yet acquiring such data is costly and time-consuming. To address this challenge, we propose a novel diffusion-based pipeline for generating high-fidelity industrial datasets with minimal supervision. Our approach conditions the diffusion model on enriched bounding box representations to produce precise segmentation masks, ensuring realistic and accurately localized defect synthesis. Compared to existing layout-conditioned generative methods, our approach improves defect consistency and spatial accuracy. We introduce two quantitative metrics to evaluate the effectiveness of our method and assess its impact on a downstream segmentation task trained on real and synthetic data. Our results demonstrate that diffusion-based synthesis can bridge the gap between artificial and real-world industrial data, fostering more reliable and cost-efficient segmentation models. The code is publicly available at https://github.com/covisionlab/diffusion_labeling.
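The abstract describes conditioning a diffusion model on bounding boxes and checking that the synthesized defects are accurately localized. As a rough illustration only, the sketch below rasterizes boxes into a per-class conditioning map and measures how much of a mask falls inside the boxes. The function names, the `(x_min, y_min, x_max, y_max, class_id)` box layout, and the ratio itself are assumptions made for this example; they are not the paper's "enriched" representation, nor either of its two metrics, which the abstract does not specify.

```python
import numpy as np


def boxes_to_condition_map(boxes, height, width, num_classes):
    """Rasterize boxes into a (num_classes, height, width) binary map.

    Hypothetical helper: each box is (x_min, y_min, x_max, y_max, class_id)
    in pixel coordinates, with x_max and y_max exclusive.
    """
    cond = np.zeros((num_classes, height, width), dtype=np.float32)
    for x_min, y_min, x_max, y_max, class_id in boxes:
        # Clip to the image bounds so out-of-range boxes are handled safely.
        x0, x1 = max(0, x_min), min(width, x_max)
        y0, y1 = max(0, y_min), min(height, y_max)
        if x0 < x1 and y0 < y1:
            cond[class_id, y0:y1, x0:x1] = 1.0
    return cond


def mask_inside_boxes_ratio(mask, cond):
    """Fraction of foreground mask pixels that lie inside any box.

    Illustrative localization check only; returns 0.0 for an empty mask.
    """
    box_region = cond.max(axis=0) > 0
    foreground = mask > 0
    total = foreground.sum()
    if total == 0:
        return 0.0
    return float((foreground & box_region).sum() / total)


# Example: one 4x4 box of class 0 in an 8x8 image.
cond = boxes_to_condition_map([(2, 2, 6, 6, 0)], height=8, width=8, num_classes=2)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[3:5, 3:5] = 1   # four defect pixels inside the box
mask[0, 0] = 1       # one stray pixel outside the box
ratio = mask_inside_boxes_ratio(mask, cond)
```

A map like `cond` could be concatenated with the noisy input of a denoising network as extra channels, which is one common way to inject layout conditioning; whether the released code does this is not stated in the abstract.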
Related papers
- ISP-AD: A Large-Scale Real-World Dataset for Advancing Industrial Anomaly Detection with Synthetic and Real Defects [0.0]
This paper presents the Industrial Screen Printing Anomaly Detection dataset (ISP-AD). ISP-AD is the largest publicly available industrial dataset to date, including both synthetic and real defects collected directly from the factory floor. Experiments on a mixed supervised training approach, incorporating both synthesized and real defects, were conducted. The findings indicate that supervision with synthetic defects and accumulated real defects is complementary, meeting industrial inspection requirements such as low false positive rates and high recall.
arXiv Detail & Related papers (2025-03-06T21:56:31Z)
- DreamDA: Generative Data Augmentation with Diffusion Models [68.22440150419003]
This paper proposes a new classification-oriented framework DreamDA.
DreamDA generates diverse samples that adhere to the original data distribution by considering training images in the original data as seeds.
In addition, since the labels of the generated data may not align with the labels of their corresponding seed images, we introduce a self-training paradigm for generating pseudo labels.
arXiv Detail & Related papers (2024-03-19T15:04:35Z)
- Investigation of the Impact of Synthetic Training Data in the Industrial Application of Terminal Strip Object Detection [4.327763441385371]
In this paper, we investigate the sim-to-real generalization performance of standard object detectors on the complex industrial application of terminal strip object detection.
We manually annotated 300 real images of terminal strips for the evaluation. The results show that it is crucial for the objects of interest to have the same scale in both domains.
arXiv Detail & Related papers (2024-03-06T18:33:27Z)
- Controllable Image Synthesis of Industrial Data Using Stable Diffusion [2.021800129069459]
We propose a new approach for reusing general-purpose pre-trained generative models on industrial data.
First, we let the model learn the new concept, entailing the novel data distribution.
Then, we force it to learn to condition the generative process, producing industrial images that satisfy well-defined topological characteristics.
arXiv Detail & Related papers (2024-01-06T08:09:24Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- VisDA 2022 Challenge: Domain Adaptation for Industrial Waste Sorting [61.52419223232737]
In industrial waste sorting, one of the biggest challenges is the extreme diversity of the input stream.
We present the VisDA 2022 Challenge on Domain Adaptation for Industrial Waste Sorting.
arXiv Detail & Related papers (2023-03-26T21:38:38Z)
- Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z)
- Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally have higher pixel-labeling quality and require no manual annotation effort.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
- Segmenting Unseen Industrial Components in a Heavy Clutter Using RGB-D Fusion and Synthetic Data [0.4724825031148411]
Industrial components are texture-less, reflective, and often found in cluttered and unstructured environments.
We present a synthetic data generation pipeline that randomizes textures via domain randomization to focus on the shape information.
We also propose an RGB-D Fusion Mask R-CNN with a confidence map estimator, which exploits reliable depth information in multiple feature levels.
arXiv Detail & Related papers (2020-02-10T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.