Related papers: Scaling Concept With Text-Guided Diffusion Models

URL: http://arxiv.org/abs/2410.24151v1
Date: Thu, 31 Oct 2024 17:09:55 GMT
Title: Scaling Concept With Text-Guided Diffusion Models
Authors: Chao Huang, Susan Liang, Yunlong Tang, Yapeng Tian, Anurag Kumar, Chenliang Xu
Abstract summary: Instead of replacing a concept, can we enhance or suppress the concept itself? We introduce ScalingConcept, a simple yet effective method to scale decomposed concepts up or down in real input without introducing new elements. More importantly, ScalingConcept enables a variety of novel zero-shot applications across image and audio domains.
Score: 53.80799139331966
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-guided diffusion models have revolutionized generative tasks by producing high-fidelity content from text descriptions. They have also enabled an editing paradigm where concepts can be replaced through text conditioning (e.g., a dog to a tiger). In this work, we explore a novel approach: instead of replacing a concept, can we enhance or suppress the concept itself? Through an empirical study, we identify a trend where concepts can be decomposed in text-guided diffusion models. Leveraging this insight, we introduce ScalingConcept, a simple yet effective method to scale decomposed concepts up or down in real input without introducing new elements. To systematically evaluate our approach, we present the WeakConcept-10 dataset, where concepts are imperfect and need to be enhanced. More importantly, ScalingConcept enables a variety of novel zero-shot applications across image and audio domains, including tasks such as canonical pose generation and generative sound highlighting or removal.

Related papers

ACE: Attentional Concept Erasure in Diffusion Models [0.0]
Attentional Concept Erasure integrates a closed-form attention manipulation with lightweight fine-tuning. We show that ACE achieves state-of-the-art concept removal efficacy and robustness. Compared to prior methods, ACE better balances generality (erasing concept and related terms) and specificity (preserving unrelated content).
arXiv Detail & Related papers (2025-04-16T08:16:28Z)
Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models [25.84386438333865]
We show that concepts and classes form a complex web of relationships, which is susceptible to degradation and needs to be preserved and augmented across experiences. We propose a novel method - MuCIL - that uses multimodal concepts to perform classification without increasing the number of trainable parameters across experiences.
arXiv Detail & Related papers (2025-02-27T18:59:29Z)
OmniPrism: Learning Disentangled Visual Concept for Image Generation [57.21097864811521]
Creative visual concept generation often draws inspiration from specific concepts in a reference image to produce relevant outcomes. We propose OmniPrism, a visual concept disentangling approach for creative image generation. Our method learns disentangled concept representations guided by natural language and trains a diffusion model to incorporate these concepts.
arXiv Detail & Related papers (2024-12-16T18:59:52Z)
Knowledge Transfer Across Modalities with Natural Language Supervision [8.493435472659646]
We present a way to learn novel concepts by only using their textual description. Similarly to human perception, we leverage cross-modal interaction to introduce new concepts. We show that Knowledge Transfer can successfully introduce novel concepts in multimodal models, in a very efficient manner.
arXiv Detail & Related papers (2024-11-23T17:26:50Z)
How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? [91.49559116493414]
We propose a novel Concept-Incremental text-to-image Diffusion Model (CIDM). It can resolve catastrophic forgetting and concept neglect to learn new customization tasks in a concept-incremental manner. Experiments validate that our CIDM surpasses existing custom diffusion models.
arXiv Detail & Related papers (2024-10-23T06:47:29Z)
How to Blend Concepts in Diffusion Models [48.68800153838679]
Recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.
arXiv Detail & Related papers (2024-07-19T13:05:57Z)
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance [90.57708419344007]
We present ClassDiffusion, a technique that leverages a semantic preservation loss to explicitly regulate the concept space when learning a new concept. Although simple, this approach effectively prevents semantic drift during the fine-tuning process of the target concepts.
arXiv Detail & Related papers (2024-05-27T17:50:10Z)
Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning [0.0]
We propose a novel concept-erasure method that updates the text encoder using few-shot unlearning. Our method can erase a concept within 10 s, making concept erasure more accessible than ever before.
arXiv Detail & Related papers (2024-05-12T14:01:05Z)
Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else [75.6806649860538]
We consider a more ambitious goal: natural multi-concept generation using a pre-trained diffusion model. We observe concept dominance and non-localized contribution that severely degrade multi-concept generation performance. We design a minimal low-cost solution that overcomes the above issues by tweaking the text embeddings for more realistic multi-concept text-to-image generation.
arXiv Detail & Related papers (2023-10-11T12:05:44Z)
Create Your World: Lifelong Text-to-Image Diffusion [75.14353789007902]
We propose the Lifelong text-to-image Diffusion Model (L2DM) to overcome "catastrophic forgetting" of previously encountered concepts. To address this forgetting, our L2DM framework devises a task-aware memory enhancement module and an elastic-concept distillation module. Our model can generate more faithful images across a range of continual text prompts in terms of both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-09-08T16:45:56Z)
The Hidden Language of Diffusion Models [70.03691458189604]
We present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model. We find surprising visual connections between concepts that transcend their textual semantics. We additionally discover concepts that rely on mixtures of exemplars, biases, renowned artistic styles, or a simultaneous fusion of multiple meanings.
arXiv Detail & Related papers (2023-06-01T17:57:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.