LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
- URL: http://arxiv.org/abs/2412.09622v1
- Date: Thu, 12 Dec 2024 18:59:55 GMT
- Title: LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
- Authors: Enis Simsar, Thomas Hofmann, Federico Tombari, Pinar Yanardag
- Abstract summary: LoRACLR is a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model.
LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference.
Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
- Score: 62.70911549650579
- Abstract: Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
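The contrastive idea described in the abstract can be illustrated with a toy sketch. This is not the paper's actual objective, since the abstract gives no formula. It assumes a single linear layer, a hinge-style margin loss, and equal batch sizes per concept. The names `contrastive_merge_loss`, `base_weight`, `lora_deltas`, `inputs`, and `margin` are all hypothetical.

```python
import numpy as np

def contrastive_merge_loss(delta, lora_deltas, inputs, base_weight, margin=1.0):
    """Toy contrastive loss for a candidate merged weight update `delta`.

    Assumption (not from the paper): every array in `inputs` has shape
    (batch, d_in) with the same batch size, and weights are (d_out, d_in).
    """
    # Outputs each concept's own LoRA produces on that concept's inputs.
    targets = [x @ (base_weight + d).T for x, d in zip(inputs, lora_deltas)]
    # Outputs of the merged model on the same inputs.
    merged = [x @ (base_weight + delta).T for x in inputs]

    n = len(inputs)
    loss = 0.0
    for i in range(n):
        # Positive pair: merged output should match concept i's own output.
        loss += np.mean(np.sum((merged[i] - targets[i]) ** 2, axis=1))
        for j in range(n):
            if i == j:
                continue
            # Negative pair: merged output for concept i should stay at
            # least `margin` away from concept j's output (hinge penalty).
            dist = np.sqrt(np.sum((merged[i] - targets[j]) ** 2, axis=1))
            loss += np.mean(np.maximum(0.0, margin - dist) ** 2)
    return loss / n

# Hypothetical toy data: two rank-2 LoRA updates for one linear layer.
rng = np.random.default_rng(0)
d_in, d_out, rank, batch = 8, 6, 2, 4
base_weight = rng.normal(size=(d_out, d_in))
lora_deltas = [
    rng.normal(size=(d_out, rank)) @ rng.normal(size=(rank, d_in))
    for _ in range(2)
]
inputs = [rng.normal(size=(batch, d_in)) for _ in range(2)]

# Naive merge by summing the updates; a learned merge would instead
# minimize this loss over `delta`.
naive_delta = sum(lora_deltas)
print(contrastive_merge_loss(naive_delta, lora_deltas, inputs, base_weight))
```

In this sketch the positive term pulls the merged model toward each concept's original behavior, while the negative term penalizes outputs for different concepts that collapse onto each other.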
Related papers
- A Simple Approach to Unifying Diffusion-based Conditional Generation [63.389616350290595]
We introduce a simple, unified framework to handle diverse conditional generation tasks.
Our approach enables versatile capabilities via different inference-time sampling schemes.
Our model supports additional capabilities like non-spatially aligned and coarse conditioning.
arXiv Detail & Related papers (2024-10-15T09:41:43Z)
- MC$^2$: Multi-concept Guidance for Customized Multi-concept Generation [59.00909718832648]
We propose MC$^2$, a novel approach for multi-concept customization.
By adaptively refining attention weights between visual and textual tokens, our method ensures that image regions accurately correspond to their associated concepts.
Experiments demonstrate that MC$^2$ outperforms training-based methods in terms of prompt-reference alignment.
arXiv Detail & Related papers (2024-04-08T07:59:04Z)
- LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models [33.379758040084894]
Multi-concept customization emerges as a challenging task within this domain.
Existing approaches often rely on training a fusion matrix of multiple Low-Rank Adaptations (LoRAs) to merge various concepts into a single image.
LoRA-Composer is a training-free framework designed for seamlessly integrating multiple LoRAs.
arXiv Detail & Related papers (2024-03-18T09:58:52Z)
- OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models [47.63060402915307]
OMG is a framework designed to seamlessly integrate multiple concepts within a single image.
OMG exhibits superior performance in multi-concept personalization.
LoRA models on civitai.com can be exploited directly.
arXiv Detail & Related papers (2024-03-16T17:30:15Z)
- Orthogonal Adaptation for Modular Customization of Diffusion Models [39.62438974450659]
We address a new problem called Modular Customization, with the goal of efficiently merging customized models.
We introduce Orthogonal Adaptation, a method designed to encourage the customized models, which do not have access to each other during fine-tuning, to have orthogonal residual weights.
Our proposed method is both simple and versatile, applicable to nearly all optimizable weights in the model architecture.
arXiv Detail & Related papers (2023-12-05T02:17:48Z)
- Deep Unfolding Convolutional Dictionary Model for Multi-Contrast MRI Super-resolution and Reconstruction [23.779641808300596]
We propose a multi-contrast convolutional dictionary (MC-CDic) model under the guidance of the optimization algorithm.
We employ the proximal gradient algorithm to optimize the model and unroll the iterative steps into a deep CDic model.
Experimental results demonstrate the superior performance of the proposed MC-CDic model against existing SOTA methods.
arXiv Detail & Related papers (2023-09-03T13:18:59Z)
- Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models [72.67967883658957]
Public large-scale text-to-image diffusion models can be easily customized for new concepts using low-rank adaptations (LoRAs).
The utilization of multiple concept LoRAs to jointly support multiple customized concepts presents a challenge.
We propose a new framework called Mix-of-Show that addresses the challenges of decentralized multi-concept customization.
arXiv Detail & Related papers (2023-05-29T17:58:16Z)
- Multi-Concept Customization of Text-to-Image Diffusion [51.8642043743222]
We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models.
We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts.
Our model generates variations of multiple new concepts and seamlessly composes them with existing concepts in novel settings.
arXiv Detail & Related papers (2022-12-08T18:57:02Z)