VLM-Guided Adaptive Negative Prompting for Creative Generation
- URL: http://arxiv.org/abs/2510.10715v1
- Date: Sun, 12 Oct 2025 17:34:59 GMT
- Title: VLM-Guided Adaptive Negative Prompting for Creative Generation
- Authors: Shelly Golan, Yotam Nitzan, Zongze Wu, Or Patashnik,
- Abstract summary: Creative generation is the synthesis of new, surprising, and valuable samples that reflect user intent yet cannot be envisioned in advance. We propose VLM-Guided Adaptive Negative-Prompting, a training-free, inference-time method that promotes creative image generation. We show consistent gains in creative novelty with negligible computational overhead.
- Score: 21.534474554320823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creative generation is the synthesis of new, surprising, and valuable samples that reflect user intent yet cannot be envisioned in advance. This task aims to extend human imagination, enabling the discovery of visual concepts that exist in the unexplored spaces between familiar domains. While text-to-image diffusion models excel at rendering photorealistic scenes that faithfully match user prompts, they still struggle to generate genuinely novel content. Existing approaches to enhance generative creativity either rely on interpolation of image features, which restricts exploration to predefined categories, or require time-intensive procedures such as embedding optimization or model fine-tuning. We propose VLM-Guided Adaptive Negative-Prompting, a training-free, inference-time method that promotes creative image generation while preserving the validity of the generated object. Our approach utilizes a vision-language model (VLM) that analyzes intermediate outputs of the generation process and adaptively steers it away from conventional visual concepts, encouraging the emergence of novel and surprising outputs. We evaluate creativity through both novelty and validity, using statistical metrics in the CLIP embedding space. Through extensive experiments, we show consistent gains in creative novelty with negligible computational overhead. Moreover, unlike existing methods that primarily generate single objects, our approach extends to complex scenarios, such as generating coherent sets of creative objects and preserving creativity within elaborate compositional prompts. Our method integrates seamlessly into existing diffusion pipelines, offering a practical route to producing creative outputs that venture beyond the constraints of textual descriptions.
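The adaptive loop the abstract describes (denoise, periodically query a VLM about the emerging image, and steer away from the conventional concept it names) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `denoise_step` and `vlm_identify_concept` are self-contained placeholders standing in for a real diffusion pipeline and vision-language model.

```python
def vlm_identify_concept(preview):
    """Placeholder VLM: names the most conventional concept visible in the
    intermediate output. A real implementation would query a vision-language
    model on the partially denoised image."""
    return f"ordinary {preview['dominant_object']}"

def denoise_step(latent, prompt, negative_prompts):
    """Placeholder denoising step. A real one would run the diffusion model
    with classifier-free guidance away from the accumulated negative prompts."""
    latent["steps_done"] += 1
    latent["dominant_object"] = "teapot"  # stand-in for decoded image content
    return latent

def generate_creative(prompt, num_steps=4, vlm_every=2):
    """Training-free inference loop: negative prompts grow adaptively as the
    VLM recognizes conventional concepts in intermediate outputs."""
    latent = {"steps_done": 0, "dominant_object": "teapot"}
    negative_prompts = []
    for step in range(num_steps):
        latent = denoise_step(latent, prompt, negative_prompts)
        # Periodically ask the VLM which familiar concept is emerging and
        # steer the remaining steps away from it.
        if step % vlm_every == 0:
            concept = vlm_identify_concept(latent)
            if concept not in negative_prompts:
                negative_prompts.append(concept)
    return latent, negative_prompts
```

Because the VLM calls happen only at a few intermediate steps and only append text to the negative prompt, the overhead over a standard sampling loop stays small, which matches the "negligible computational overhead" claim above.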
Related papers
- Inspiration Seeds: Learning Non-Literal Visual Combinations for Generative Exploration [13.00602873238112]
We propose Inspiration Seeds, a generative framework that shifts image generation from final execution to exploratory ideation. We use CLIP Sparse Autoencoders to extract editing directions in CLIP latent space and isolate concept pairs.
arXiv Detail & Related papers (2026-02-09T13:00:16Z)
- Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning [56.24016465596292]
A visual metaphor constitutes a high-order form of human creativity, employing cross-domain semantic fusion to transform abstract concepts into impactful visual rhetoric. We introduce the task of Visual Metaphor Transfer (VMT), which challenges models to autonomously decouple the "creative essence" from a reference image and re-materialize that abstract logic onto a user-specified subject. Our method significantly outperforms SOTA baselines in metaphor consistency, analogy appropriateness, and visual creativity, paving the way for automated high-impact creative applications in advertising and media.
arXiv Detail & Related papers (2026-02-01T17:01:36Z)
- Creative Image Generation with Diffusion Models [10.05957748073635]
We propose a novel framework for creative generation using diffusion models, where creativity is associated with the inverse probability of an image's existence in the CLIP embedding space. Our method calculates the probability distribution of generated images and drives it towards low-probability regions to produce rare, imaginative, and visually captivating outputs.
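As a toy illustration of the general idea behind this low-probability criterion, one could score a sample by its distance from the mean embedding of familiar category examples: the farther from well-populated regions, the more "novel". The vectors below are plain Python lists, not real CLIP embeddings, and this sketch is not the cited paper's actual metric.

```python
import math

def novelty_score(embedding, category_embeddings):
    """Toy novelty score: Euclidean distance of a sample embedding from the
    mean of familiar category embeddings. Larger distance = farther from the
    dense, conventional region of the embedding space."""
    dim = len(embedding)
    mean = [sum(e[i] for e in category_embeddings) / len(category_embeddings)
            for i in range(dim)]
    return math.sqrt(sum((embedding[i] - mean[i]) ** 2 for i in range(dim)))
```

A sample that coincides with the category mean scores 0; one far away scores high, which is the direction such methods push generation towards.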
arXiv Detail & Related papers (2026-01-29T18:48:48Z)
- ThematicPlane: Bridging Tacit User Intent and Latent Spaces for Image Generation [49.805992099208595]
We introduce ThematicPlane, a system that enables users to navigate and manipulate high-level semantic concepts. This interface bridges the gap between tacit creative intent and system control.
arXiv Detail & Related papers (2025-08-08T06:57:14Z)
- Cooking Up Creativity: Enhancing LLM Creativity through Structured Recombination [46.79423188943526]
We introduce a novel approach that enhances the creativity of Large Language Models (LLMs). We apply LLMs to translate between natural language and structured representations, and perform the core creative leap. We demonstrate our approach in the culinary domain with DishCOVER, a model that generates creative recipes.
arXiv Detail & Related papers (2025-04-29T11:13:06Z)
- Probing and Inducing Combinational Creativity in Vision-Language Models [52.76981145923602]
Recent advances in Vision-Language Models (VLMs) have sparked debate about whether their outputs reflect combinational creativity. We propose the Identification-Explanation-Implication (IEI) framework, which decomposes creative processes into three levels. To validate this framework, we curate CreativeMashup, a high-quality dataset of 666 artist-generated visual mashups annotated according to the IEI framework.
arXiv Detail & Related papers (2025-04-17T17:38:18Z)
- Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation
"Creative" remains an inherently abstract concept for both humans and diffusion models. Current methods rely heavily on reference prompts or images to achieve a creative effect. We introduce CreTok, which brings meta-creativity to diffusion models by redefining "creative" as a new token, <CreTok>. Code will be made available at https://github.com/fu-feng/CreTok.
arXiv Detail & Related papers (2024-10-31T17:19:03Z)
- ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly more unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z)
- Towards Creativity Characterization of Generative Models via Group-based Subset Scanning [64.6217849133164]
We propose group-based subset scanning to identify, quantify, and characterize creative processes.
We find that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets.
arXiv Detail & Related papers (2022-03-01T15:07:14Z)
- Towards creativity characterization of generative models via group-based subset scanning [51.84144826134919]
We propose group-based subset scanning to quantify, detect, and characterize creative processes.
Creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets.
arXiv Detail & Related papers (2021-04-01T14:07:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.