Data-Efficient Brushstroke Generation with Diffusion Models for Oil Painting
- URL: http://arxiv.org/abs/2603.01103v1
- Date: Sun, 01 Mar 2026 13:42:35 GMT
- Title: Data-Efficient Brushstroke Generation with Diffusion Models for Oil Painting
- Authors: Dantong Qin, Alessandro Bozzon, Xian Yang, Xun Zhang, Yike Guo, Pan Wang
- Abstract summary: We study the problem of learning human-like brushstroke generation from a small set of hand-drawn samples. We propose StrokeDiff, a diffusion-based framework with Smooth Regularization (SmR). We show how the learned primitives can be made controllable through a Bézier-based conditioning module.
- Score: 60.15416769662556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many creative multimedia systems are built upon visual primitives such as strokes or textures, which are difficult to collect at scale and fundamentally different from natural image data. This data scarcity makes it challenging for modern generative models to learn expressive and controllable primitives, limiting their use in process-aware content creation. In this work, we study the problem of learning human-like brushstroke generation from a small set of hand-drawn samples (n=470) and propose StrokeDiff, a diffusion-based framework with Smooth Regularization (SmR). SmR injects stochastic visual priors during training, providing a simple mechanism to stabilize diffusion models under sparse supervision without altering the inference process. We further show how the learned primitives can be made controllable through a Bézier-based conditioning module and integrated into a complete stroke-based painting pipeline, including prediction, generation, ordering, and compositing. This demonstrates how data-efficient primitive modeling can support expressive and structured multimedia content creation. Experiments indicate that the proposed approach produces diverse and structurally coherent brushstrokes and enables paintings with richer texture and layering, validated by both automatic metrics and human evaluation.
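The abstract describes SmR and the Bézier-based conditioning only at a high level. For orientation, here is a minimal PyTorch sketch of how such a training step could look; the blur-based prior injection, the `p_smooth` rate, and the `model(x_t, t, ctrl_points)` denoiser interface are illustrative assumptions, not the paper's actual design.

```python
# Minimal sketch of a StrokeDiff-style training step, assuming a standard DDPM
# epsilon objective. The abstract does not specify the form of Smooth
# Regularization (SmR), so the blur-based "stochastic visual prior" below and
# the model(x_t, t, ctrl_points) signature are illustrative assumptions.
import torch
import torch.nn.functional as F

def gaussian_blur(x, k=5, sigma=1.5):
    """Depthwise Gaussian blur, used as a stand-in smooth visual prior."""
    ax = torch.arange(k, dtype=torch.float32) - k // 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    g = g / g.sum()
    kernel = (g[:, None] * g[None, :]).expand(x.shape[1], 1, k, k).contiguous()
    return F.conv2d(x, kernel.to(x.device), padding=k // 2, groups=x.shape[1])

def train_step(model, x0, ctrl_points, alphas_cumprod, p_smooth=0.3):
    """One diffusion training step with an assumed SmR-style prior injection.

    x0:          clean stroke images, (B, 1, H, W)
    ctrl_points: Bezier control points for conditioning, (B, n_ctrl, 2)
    """
    alphas_cumprod = alphas_cumprod.to(x0.device)
    B = x0.shape[0]
    t = torch.randint(0, alphas_cumprod.shape[0], (B,), device=x0.device)
    a = alphas_cumprod[t].view(B, 1, 1, 1)
    # Assumed SmR: occasionally train against a smoothed target, injecting a
    # stochastic smoothness prior; sampling at inference time is unchanged.
    if torch.rand(()) < p_smooth:
        x0 = gaussian_blur(x0)
    noise = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise   # forward process q(x_t | x_0)
    pred = model(x_t, t, ctrl_points)              # Bezier-conditioned denoiser
    return F.mse_loss(pred, noise)                 # standard epsilon-prediction loss
```

Note that in this reading the regularizer acts only on training targets, so the sampler is untouched, which matches the abstract's claim that the inference process is not altered.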
Related papers
- Training Data Attribution for Image Generation using Ontology-Aligned Knowledge Graphs [3.686386213696443]
We introduce a framework for interpreting generative outputs through the automatic construction of knowledge graphs. Our method extracts structured triples from images, aligned with a domain-specific ontology. By comparing the KGs of generated and training images, we can trace potential influences, enabling copyright analysis, dataset transparency, and interpretable AI.
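The summary leaves the extraction and alignment steps abstract; a toy sketch of the final comparison stage, assuming triples are already available as (subject, predicate, object) tuples and using plain Jaccard overlap as the influence score (the paper's actual metric is not given here):

```python
# Toy sketch: rank training images by knowledge-graph overlap with a generated
# image. Triples are assumed precomputed; Jaccard overlap is an assumed metric.
def kg_influence(gen_triples: set, train_kgs: dict, top_k: int = 5):
    def jaccard(a: set, b: set) -> float:
        return len(a & b) / len(a | b) if (a | b) else 0.0
    scores = {img_id: jaccard(gen_triples, triples)
              for img_id, triples in train_kgs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Example: trace a generated image back to its closest training KGs.
gen = {("person", "wears", "hat"), ("background", "depicts", "sea")}
train = {"img_001": {("person", "wears", "hat")}, "img_002": {("dog", "on", "grass")}}
print(kg_influence(gen, train))  # img_001 ranks first
```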
arXiv Detail & Related papers (2025-12-02T12:45:20Z)
- Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model [118.52589065972795]
We introduce Muddit, a unified discrete diffusion transformer that enables fast and parallel generation across both text and image modalities. Unlike prior unified diffusion models trained from scratch, Muddit integrates strong visual priors from a pretrained text-to-image backbone with a lightweight text decoder.
arXiv Detail & Related papers (2025-05-29T16:15:48Z)
- Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation [54.588082888166504]
We present Mogao, a unified framework that enables interleaved multi-modal generation through a causal approach. Mogao integrates a set of key technical improvements in architecture design, including a deep-fusion design, dual vision encoders, interleaved rotary position embeddings, and multi-modal classifier-free guidance. Experiments show that Mogao not only achieves state-of-the-art performance in multi-modal understanding and text-to-image generation, but also excels in producing high-quality, coherent interleaved outputs.
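The blurb does not detail Mogao's multi-modal variant of classifier-free guidance; for reference, the vanilla single-condition form it extends can be sketched as follows, assuming an epsilon-prediction `model(x_t, t, cond)` that accepts a null condition:

```python
# Vanilla classifier-free guidance (not Mogao's multi-modal variant): blend
# conditional and unconditional predictions; w > 1 strengthens the condition.
def cfg_eps(model, x_t, t, cond, w=3.0):
    eps_cond = model(x_t, t, cond)
    eps_uncond = model(x_t, t, None)   # null / dropped condition
    return eps_uncond + w * (eps_cond - eps_uncond)
```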
arXiv Detail & Related papers (2025-05-08T17:58:57Z)
- Efficient Flow Matching using Latent Variables [9.363347684114474]
We show that $\texttt{Latent-CFM}$ exhibits improved generation quality with significantly less training and computation than state-of-the-art flow matching models. We also consider generative modeling of spatial fields stemming from physical processes.
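The latent-variable construction of $\texttt{Latent-CFM}$ is not reproduced in this summary; the base conditional flow-matching objective it builds on can be sketched as follows (linear interpolation path, hypothetical `v_model(x_t, t)` velocity network):

```python
# Base conditional flow-matching loss with a straight-line path; the
# latent-variable extension of Latent-CFM is omitted here.
import torch
import torch.nn.functional as F

def cfm_loss(v_model, x1):
    x0 = torch.randn_like(x1)                        # noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)    # uniform times in [0, 1]
    tb = t.view(-1, *([1] * (x1.dim() - 1)))         # broadcast to data shape
    x_t = (1 - tb) * x0 + tb * x1                    # linear interpolant
    target = x1 - x0                                 # constant target velocity
    return F.mse_loss(v_model(x_t, t), target)
```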
arXiv Detail & Related papers (2025-05-07T14:59:23Z)
- A Simple Approach to Unifying Diffusion-based Conditional Generation [63.389616350290595]
We introduce a simple, unified framework to handle diverse conditional generation tasks. Our approach enables versatile capabilities via different inference-time sampling schemes. Our model supports additional capabilities like non-spatially aligned and coarse conditioning.
arXiv Detail & Related papers (2024-10-15T09:41:43Z)
- StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning [2.037819652873519]
We introduce StableMaterials, a novel approach for generating photorealistic physically-based rendering (PBR) materials. Our method employs adversarial training to distill knowledge from existing large-scale image generation models. We propose a new tileability technique that removes visual artifacts typically associated with fewer diffusion steps.
arXiv Detail & Related papers (2024-06-13T16:29:46Z)
- Can Generative Models Improve Self-Supervised Representation Learning? [0.7999703756441756]
We introduce a framework that enriches the self-supervised learning (SSL) paradigm by utilizing generative models to produce semantically consistent image augmentations. Our results show that our framework significantly enhances the quality of learned visual representations by up to 10% Top-1 accuracy in downstream tasks.
arXiv Detail & Related papers (2024-03-09T17:17:07Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
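The exact update rule is not given in this summary; the general family of loss-guided steering it belongs to can be sketched as follows, assuming a DDPM epsilon model and a differentiable scalar `guide_loss` (e.g., masked L2 against known pixels for inpainting):

```python
# Generic loss-guided steering sketch (technique family, not the paper's exact
# update): bias an unconditional model's noise prediction with the gradient of
# a task loss evaluated on the current denoised estimate.
import torch

def steered_eps(eps_model, x_t, t, alpha_bar_t, guide_loss, scale=1.0):
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)
    # DDPM identity: estimate the clean image from the noisy sample.
    x0_hat = (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()
    loss = guide_loss(x0_hat)                      # scalar task loss
    grad = torch.autograd.grad(loss, x_t)[0]
    return eps + scale * (1 - alpha_bar_t).sqrt() * grad
```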
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- MatFuse: Controllable Material Generation with Diffusion Models [10.993516790237503]
MatFuse is a unified approach that harnesses the generative power of diffusion models for the creation and editing of 3D materials.
Our method integrates multiple sources of conditioning, including color palettes, sketches, text, and pictures, enhancing creative possibilities.
We demonstrate the effectiveness of MatFuse under multiple conditioning settings and explore the potential of material editing.
arXiv Detail & Related papers (2023-08-22T12:54:48Z)
- DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models [33.6615688030998]
DiffSketcher is an innovative algorithm that creates vectorized free-hand sketches using natural language input.
Our experiments show that DiffSketcher achieves higher quality than prior work.
arXiv Detail & Related papers (2023-06-26T13:30:38Z)
- Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC [102.64648158034568]
Diffusion models have quickly become the prevailing approach to generative modeling in many domains.
We propose an energy-based parameterization of diffusion models which enables the use of new compositional operators.
We find these samplers lead to notable improvements in compositional generation across a wide set of problems.
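The energy parameterization and MCMC correction steps are beyond this summary, but the simplest compositional operator in this line of work, a conjunction ("product of experts") over several diffusion models, amounts to summing their score estimates; a hedged sketch:

```python
# Conjunction-style composition sketch: sum (optionally weighted) noise
# predictions from several diffusion models sharing one data space. The
# paper's energy-based parameterization and MCMC samplers are omitted.
def composed_eps(models, x_t, t, weights=None):
    weights = weights if weights is not None else [1.0] * len(models)
    return sum(w * m(x_t, t) for w, m in zip(weights, models))
```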
arXiv Detail & Related papers (2023-02-22T18:48:46Z)
- Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets [1.07539359851877]
In this study, we demonstrate that diffusion models can effectively generate fully-annotated microscopy image data sets.
The proposed pipeline helps to reduce the reliance on manual annotations when training deep learning-based segmentation approaches.
arXiv Detail & Related papers (2023-01-02T14:17:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.