Stylized Structural Patterns for Improved Neural Network Pre-training
- URL: http://arxiv.org/abs/2506.19465v1
- Date: Tue, 24 Jun 2025 09:47:31 GMT
- Title: Stylized Structural Patterns for Improved Neural Network Pre-training
- Authors: Farnood Salehi, Vandit Sharma, Amirhossein Askari Farsangi, Tunç Ozan Aydın,
- Abstract summary: Deep learning models in computer vision require large datasets of real images, which are difficult to curate and pose privacy and legal concerns.<n>Recent works suggest synthetic data as an alternative, yet models trained with it often underperform.<n>We propose an improved neural fractal formulation through which we introduce a new class of synthetic data.<n>Second, we propose reverse stylization, a technique that transfers visual features from a small, license-free set of real images onto synthetic datasets.
- Score: 1.8641315013048299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep learning models in computer vision require large datasets of real images, which are difficult to curate and pose privacy and legal concerns, limiting their commercial use. Recent works suggest synthetic data as an alternative, yet models trained with it often underperform. This paper proposes a two-step approach to bridge this gap. First, we propose an improved neural fractal formulation through which we introduce a new class of synthetic data. Second, we propose reverse stylization, a technique that transfers visual features from a small, license-free set of real images onto synthetic datasets, enhancing their effectiveness. We analyze the domain gap between our synthetic datasets and real images using Kernel Inception Distance (KID) and show that our method achieves a significantly lower distributional gap compared to existing synthetic datasets. Furthermore, our experiments across different tasks demonstrate the practical impact of this reduced gap. We show that pretraining the EDM2 diffusion model on our synthetic dataset leads to an 11% reduction in FID during image generation, compared to models trained on existing synthetic datasets, and a 20% decrease in autoencoder reconstruction error, indicating improved performance in data representation. Furthermore, a ViT-S model trained for classification on this synthetic data achieves over a 10% improvement in ImageNet-100 accuracy. Our work opens up exciting possibilities for training practical models when sufficiently large real training sets are not available.
Related papers
- LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance [96.6544564242316]
We introduce a novel dataset generation framework named LoFT, LoRA-Fused Training-data Generation with Few-shot Guidance.<n>Our method fine-tunes LoRA weights on individual real images and fuses them at inference time, producing synthetic images that combine the features of real images for improved diversity and fidelity of generated data.<n>Our experiments show that training on LoFT-generated data consistently outperforms other synthetic dataset methods, significantly increasing accuracy as the dataset size increases.
arXiv Detail & Related papers (2025-05-16T21:17:55Z) - Improving Object Detection by Modifying Synthetic Data with Explainable AI [3.0519884745675485]
We propose a novel conceptual approach to improve the efficiency of designing synthetic images.<n>XAI techniques guide a human-in-the-loop process of modifying 3D mesh models used to generate these images.<n>We show that synthetic data can improve detection of vehicles in orientations unseen in training by 4.6%.
arXiv Detail & Related papers (2024-12-02T13:24:43Z) - DataDream: Few-shot Guided Dataset Generation [90.09164461462365]
We propose a framework for synthesizing classification datasets that more faithfully represents the real data distribution.
DataDream fine-tunes LoRA weights for the image generation model on the few real images before generating the training data using the adapted model.
We then fine-tune LoRA weights for CLIP using the synthetic data to improve downstream image classification over previous approaches on a large variety of datasets.
arXiv Detail & Related papers (2024-07-15T17:10:31Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z) - Scaling Laws of Synthetic Images for Model Training ... for Now [54.43596959598466]
We study the scaling laws of synthetic images generated by state of the art text-to-image models.
We observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training.
arXiv Detail & Related papers (2023-12-07T18:59:59Z) - Improving the Effectiveness of Deep Generative Data [5.856292656853396]
Training a model on purely synthetic images for downstream image processing tasks results in an undesired performance drop compared to training on real data.
We propose a new taxonomy to describe factors contributing to this commonly observed phenomenon and investigate it on the popular CIFAR-10 dataset.
Our method outperforms baselines on downstream classification tasks both in case of training on synthetic only (Synthetic-to-Real) and training on a mix of real and synthetic data.
arXiv Detail & Related papers (2023-11-07T12:57:58Z) - Image Captions are Natural Prompts for Text-to-Image Models [53.529592120988]
It is challenging for text-to-image generative models to synthesize informative training data with hand-crafted prompts.<n>We propose a simple yet effective method, validated through ImageNet classification.<n>We show that this simple caption significantly boosts the informativeness of synthetic data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z) - Synthetic Image Data for Deep Learning [0.294944680995069]
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification semantic segmentation models.
We show how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle.
arXiv Detail & Related papers (2022-12-12T20:28:13Z) - Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the powerfulness and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z) - Synthetic Data for Model Selection [2.4499092754102874]
We show that synthetic data can be beneficial for model selection.
We introduce a novel method to calibrate the synthetic error estimation to fit that of the real domain.
arXiv Detail & Related papers (2021-05-03T09:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.