Creative Image Generation with Diffusion Models
- URL: http://arxiv.org/abs/2601.22125v2
- Date: Mon, 02 Feb 2026 20:24:08 GMT
- Title: Creative Image Generation with Diffusion Models
- Authors: Kunpeng Song, Ahmed Elgammal
- Abstract summary: We propose a novel framework for creative generation using diffusion models, where creativity is associated with the inverse probability of an image's existence in the CLIP embedding space. Our method calculates the probability distribution of generated images and drives it towards low-probability regions to produce rare, imaginative, and visually captivating outputs.
- Score: 10.05957748073635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creative image generation has emerged as a compelling area of research, driven by the need to produce novel and high-quality images that expand the boundaries of imagination. In this work, we propose a novel framework for creative generation using diffusion models, where creativity is associated with the inverse probability of an image's existence in the CLIP embedding space. Unlike prior approaches that rely on a manual blending of concepts or exclusion of subcategories, our method calculates the probability distribution of generated images and drives it towards low-probability regions to produce rare, imaginative, and visually captivating outputs. We also introduce pullback mechanisms, achieving high creativity without sacrificing visual fidelity. Extensive experiments on text-to-image diffusion models demonstrate the effectiveness and efficiency of our creative generation framework, showcasing its ability to produce unique, novel, and thought-provoking images. This work provides a new perspective on creativity in generative models, offering a principled method to foster innovation in visual content synthesis.
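The idea in the abstract, scoring an embedding by how improbable it is and pushing it toward low-probability regions while a pullback keeps it close to where it started, can be illustrated with a toy sketch. This is not the authors' implementation: the Gaussian density model, the hard projection used as a "pullback", the function names, and the random vectors standing in for CLIP embeddings are all assumptions made for illustration.

```python
import numpy as np

def fit_gaussian(embeddings, ridge=1e-3):
    """Fit a mean and a regularized precision matrix to row-wise embeddings."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    cov = centered.T @ centered / len(embeddings)
    cov += ridge * np.eye(cov.shape[1])  # keep the covariance invertible
    return mean, np.linalg.inv(cov)

def rarity(z, mean, precision):
    """Negative Gaussian log-density up to a constant (higher means rarer)."""
    d = z - mean
    return 0.5 * d @ precision @ d

def creative_step(z, anchor, mean, precision, lr=0.05, radius=3.0):
    """One gradient-ascent step on rarity, followed by a hypothetical pullback.

    The pullback here is a hard projection onto a ball of `radius` around
    `anchor`; the paper's actual pullback mechanisms may differ.
    """
    z = z + lr * (precision @ (z - mean))  # gradient of rarity w.r.t. z
    offset = z - anchor
    dist = np.linalg.norm(offset)
    if dist > radius:
        z = anchor + offset * (radius / dist)
    return z

rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 8))  # stand-in for CLIP embeddings of typical images
mean, precision = fit_gaussian(reference)

anchor = reference[0].copy()           # embedding of the original generation
z = anchor.copy()
start_score = rarity(z, mean, precision)
for _ in range(200):
    z = creative_step(z, anchor, mean, precision)
end_score = rarity(z, mean, precision)
final_distance = np.linalg.norm(z - anchor)

print(f"rarity before: {start_score:.3f}, after: {end_score:.3f}")
print(f"distance from anchor: {final_distance:.3f}")
```

In this sketch the rarity score rises while the embedding stays within the chosen radius of its anchor, which mirrors the trade-off the abstract describes between creativity and fidelity. In a real pipeline the embedding would come from a CLIP image encoder and the update would be propagated back into the diffusion sampling process, neither of which is modeled here.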
Related papers
- Creativity in AI as Emergence from Domain-Limited Generative Models [0.0]
Evaluative frameworks largely treat creativity as a property to be assessed rather than as a phenomenon to be explicitly modeled. This paper proposes a generative perspective on creativity in AI, framing it as an emergent property of domain-limited generative models embedded within bounded informational environments.
arXiv Detail & Related papers (2026-01-13T09:52:14Z) - CREward: A Type-Specific Creativity Reward Model [23.62496736021293]
CREward is a type-specific creativity reward model that spans three creativity axes: geometry, material, and texture. We analyze the correlation between human judgments and predictions by large vision-language models (LVLMs). We explore three applications of CREward: creativity assessment, explainable creativity, and creative sample acquisition for both human design inspiration and guiding creative generation through low-rank adaptation.
arXiv Detail & Related papers (2025-11-25T07:00:42Z) - VLM-Guided Adaptive Negative Prompting for Creative Generation [21.534474554320823]
Creative generation is the synthesis of new, surprising, and valuable samples that reflect user intent yet cannot be envisioned in advance. We propose VLM-Guided Adaptive Negative-Prompting, a training-free, inference-time method that promotes creative image generation. We show consistent gains in creative novelty with negligible computational overhead.
arXiv Detail & Related papers (2025-10-12T17:34:59Z) - Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation [54.588082888166504]
We present Mogao, a unified framework that enables interleaved multi-modal generation through a causal approach. Mogao integrates a set of key technical improvements in architecture design, including a deep-fusion design, dual vision encoders, interleaved rotary position embeddings, and multi-modal classifier-free guidance. Experiments show that Mogao not only achieves state-of-the-art performance in multi-modal understanding and text-to-image generation, but also excels in producing high-quality, coherent interleaved outputs.
arXiv Detail & Related papers (2025-05-08T17:58:57Z) - Probing and Inducing Combinational Creativity in Vision-Language Models [52.76981145923602]
Recent advances in Vision-Language Models (VLMs) have sparked debate about whether their outputs reflect combinational creativity. We propose the Identification-Explanation-Implication (IEI) framework, which decomposes creative processes into three levels. To validate this framework, we curate CreativeMashup, a high-quality dataset of 666 artist-generated visual mashups annotated according to the IEI framework.
arXiv Detail & Related papers (2025-04-17T17:38:18Z) - DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination [140.1641573781066]
We introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts, we aim to train a T2I model capable of creating new, hybrid concepts.
We propose a new method called DreamCreature, which identifies and extracts the underlying sub-concepts.
The T2I thus adapts to generate novel concepts with faithful structures and photorealistic appearance.
arXiv Detail & Related papers (2023-11-27T01:24:31Z) - ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly more unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z) - Diffusion idea exploration for art generation [0.10152838128195467]
Diffusion models have recently outperformed other generative models in image generation tasks using cross modal data as guiding information.
The initial experiments for this task of novel image generation demonstrated promising qualitative results.
arXiv Detail & Related papers (2023-07-11T02:35:26Z) - Towards Creativity Characterization of Generative Models via Group-based Subset Scanning [64.6217849133164]
We propose group-based subset scanning to identify, quantify, and characterize creative processes.
We find that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets.
arXiv Detail & Related papers (2022-03-01T15:07:14Z) - Towards creativity characterization of generative models via group-based subset scanning [51.84144826134919]
We propose group-based subset scanning to quantify, detect, and characterize creative processes.
Creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets.
arXiv Detail & Related papers (2021-04-01T14:07:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.