Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection
- URL: http://arxiv.org/abs/2507.13221v1
- Date: Thu, 17 Jul 2025 15:35:27 GMT
- Title: Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection
- Authors: Hongyang Zhao, Tianyu Liang, Sina Davari, Daeho Kim,
- Abstract summary: This study presents a novel image synthesis methodology tailored for construction worker detection.<n>The approach entails generating a collection of 12,000 synthetic images by formulating 3000 different prompts.<n> Evaluation on a real construction image dataset yielded promising results.
- Score: 0.3011426942929757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While recent advancements in deep neural networks (DNNs) have substantially enhanced visual AI's capabilities, the challenge of inadequate data diversity and volume remains, particularly in construction domain. This study presents a novel image synthesis methodology tailored for construction worker detection, leveraging the generative-AI platform Midjourney. The approach entails generating a collection of 12,000 synthetic images by formulating 3000 different prompts, with an emphasis on image realism and diversity. These images, after manual labeling, serve as a dataset for DNN training. Evaluation on a real construction image dataset yielded promising results, with the model attaining average precisions (APs) of 0.937 and 0.642 at intersection-over-union (IoU) thresholds of 0.5 and 0.5 to 0.95, respectively. Notably, the model demonstrated near-perfect performance on the synthetic dataset, achieving APs of 0.994 and 0.919 at the two mentioned thresholds. These findings reveal both the potential and weakness of generative AI in addressing DNN training data scarcity.
Related papers
- Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data [0.0]
Zenesis is a comprehensive no-code interactive platform designed to minimize barriers posed by data readiness for scientific images.<n>We develop lightweight multi-modal adaptation techniques that enable zero-shot operation on raw scientific data.<n>Our results demonstrate that Zenesis is a powerful tool for scientific applications, particularly in fields where high-quality annotated datasets are unavailable.
arXiv Detail & Related papers (2025-06-30T16:45:23Z) - Stylized Structural Patterns for Improved Neural Network Pre-training [1.8641315013048299]
Deep learning models in computer vision require large datasets of real images, which are difficult to curate and pose privacy and legal concerns.<n>Recent works suggest synthetic data as an alternative, yet models trained with it often underperform.<n>We propose an improved neural fractal formulation through which we introduce a new class of synthetic data.<n>Second, we propose reverse stylization, a technique that transfers visual features from a small, license-free set of real images onto synthetic datasets.
arXiv Detail & Related papers (2025-06-24T09:47:31Z) - Dataset Distillation with Probabilistic Latent Features [9.318549327568695]
A compact set of synthetic data can effectively replace the original dataset in downstream classification tasks.<n>We propose a novel approach that models the joint distribution of latent features.<n>Our method achieves state-of-the-art cross architecture performance across a range of backbone architectures.
arXiv Detail & Related papers (2025-05-10T13:53:49Z) - CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI [58.35348718345307]
Current efforts to distinguish between real and AI-generated images may lack generalization.<n>We propose a novel framework, Co-Spy, that first enhances existing semantic features.<n>We also create Co-Spy-Bench, a comprehensive dataset comprising 5 real image datasets and 22 state-of-the-art generative models.
arXiv Detail & Related papers (2025-03-24T01:59:29Z) - Improving Object Detection by Modifying Synthetic Data with Explainable AI [3.0519884745675485]
We propose a novel conceptual approach to improve the efficiency of designing synthetic images.<n>XAI techniques guide a human-in-the-loop process of modifying 3D mesh models used to generate these images.<n>We show that synthetic data can improve detection of vehicles in orientations unseen in training by 4.6%.
arXiv Detail & Related papers (2024-12-02T13:24:43Z) - Merging synthetic and real embryo data for advanced AI predictions [69.07284335967019]
We train two generative models using two datasets-one we created and made publicly available, and one existing public dataset-to generate synthetic embryo images at various cell stages.<n>These were combined with real images to train classification models for embryo cell stage prediction.<n>Our results demonstrate that incorporating synthetic images alongside real data improved classification performance, with the model achieving 97% accuracy compared to 94.5% when trained solely on real data.
arXiv Detail & Related papers (2024-12-02T08:24:49Z) - Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images.
Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images.
ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z) - DeepDC: Deep Distance Correlation as a Perceptual Image Quality
Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z) - Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the powerfulness and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z) - ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z) - Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z) - Synthetic Data and Hierarchical Object Detection in Overhead Imagery [0.0]
We develop novel synthetic data generation and augmentation techniques for enhancing low/zero-sample learning in satellite imagery.
To test the effectiveness of synthetic imagery, we employ it in the training of detection models and our two stage model, and evaluate the resulting models on real satellite images.
arXiv Detail & Related papers (2021-01-29T22:52:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.