ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
- URL: http://arxiv.org/abs/2311.13600v1
- Date: Wed, 22 Nov 2023 18:59:36 GMT
- Title: ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
- Authors: Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana
Lazebnik, Yuanzhen Li, Varun Jampani
- Abstract summary: Low-rank adaptations (LoRA) have been proposed as a parameter-efficient way of achieving concept-driven personalization.
We propose ZipLoRA, a method to cheaply and effectively merge independently trained style and subject LoRAs.
Experiments show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity.
- Score: 56.85106417530364
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Methods for finetuning generative models for concept-driven personalization
generally achieve strong results for subject-driven or style-driven generation.
Recently, low-rank adaptations (LoRA) have been proposed as a
parameter-efficient way of achieving concept-driven personalization. While
recent work explores the combination of separate LoRAs to achieve joint
generation of learned styles and subjects, existing techniques do not reliably
address the problem; they often compromise either subject fidelity or style
fidelity. We propose ZipLoRA, a method to cheaply and effectively merge
independently trained style and subject LoRAs in order to achieve generation of
any user-provided subject in any user-provided style. Experiments on a wide
range of subject and style combinations show that ZipLoRA can generate
compelling results with meaningful improvements over baselines in subject and
style fidelity while preserving the ability to recontextualize. Project page:
https://ziplora.github.io
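To make the merging idea concrete, the following is a minimal PyTorch sketch, not the released implementation: it assumes each LoRA has been collapsed into a dense weight delta (B @ A) for one layer, learns one merger coefficient per column of each delta, and substitutes a simple delta-matching loss for the paper's actual objectives, which compare model outputs on subject and style prompts. All tensor names and hyperparameters are illustrative.

```python
# Hedged sketch of merging two LoRA weight deltas with learned per-column
# coefficients. Names, sizes, and losses are illustrative only.
import torch
import torch.nn.functional as F

def merge_deltas(delta_subject, delta_style, m_subject, m_style):
    # Scale each column of the two deltas by a learned coefficient and add them.
    return delta_subject * m_subject + delta_style * m_style

out_dim, in_dim, rank = 64, 64, 8
delta_subject = torch.randn(out_dim, rank) @ torch.randn(rank, in_dim)
delta_style = torch.randn(out_dim, rank) @ torch.randn(rank, in_dim)

# Common baseline: direct arithmetic merging (simple sum of the two deltas).
naive_merge = delta_subject + delta_style

# Learned merge: per-column coefficient vectors, initialized to 1.
m_subject = torch.ones(in_dim, requires_grad=True)
m_style = torch.ones(in_dim, requires_grad=True)
optimizer = torch.optim.Adam([m_subject, m_style], lr=1e-2)

for step in range(200):
    merged = merge_deltas(delta_subject, delta_style, m_subject, m_style)
    # Keep the merged update close to each original adapter (a stand-in for
    # preserving subject and style generations on their own prompts)...
    fidelity = F.mse_loss(merged, delta_subject) + F.mse_loss(merged, delta_style)
    # ...while discouraging the two scaled updates from overlapping column-wise.
    overlap = F.cosine_similarity(delta_subject * m_subject,
                                  delta_style * m_style, dim=0).abs().mean()
    loss = fidelity + 0.01 * overlap
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The merged delta is then added to the frozen base weight of the same layer.
```

The direct sum at the top is the kind of arithmetic merge that later entries in the list below use as a baseline; learning the coefficients instead reduces interference between columns that the two adapters would otherwise both rely on.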
Related papers
- LoRACLR: Contrastive Adaptation for Customization of Diffusion Models [62.70911549650579]
LoRACLR is a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model.
LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference.
Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
arXiv Detail & Related papers (2024-12-12T18:59:55Z)
- LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation [28.098287135605364]
We introduce LoRA.rar, a method that improves image quality and achieves a remarkable speedup of over $4000\times$ in the merging process.
LoRA.rar pre-trains a hypernetwork on a diverse set of content-style LoRA pairs, learning an efficient merging strategy that generalizes to new, unseen content-style pairs (a generic sketch of the hypernetwork idea follows this list).
Our method significantly outperforms the current state of the art in both content and style fidelity, as validated by MLLM assessments and human evaluations.
arXiv Detail & Related papers (2024-12-06T16:04:56Z)
- UnZipLoRA: Separating Content and Style from a Single Image [16.61595725708187]
UnZipLoRA is a method for decomposing an image into its constituent subject and style.
UnZipLoRA disentangles these elements from a single image by training a subject LoRA and a style LoRA simultaneously.
arXiv Detail & Related papers (2024-12-05T18:59:50Z)
- CopRA: A Progressive LoRA Training Strategy [9.847045610578073]
Low-Rank Adaptation (LoRA) is a parameter-efficient technique for fine-tuning foundation models.
In this work, we propose a novel progressive training strategy for LoRA with random layer dropping.
We refer to this method as Cooperative LoRA (CopRA).
arXiv Detail & Related papers (2024-10-30T11:07:09Z)
- DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion [43.55179971287028]
We propose DiffLoRA, an efficient method that leverages the diffusion model as a hypernetwork to predict personalized Low-Rank Adaptation weights.
By incorporating these LoRA weights into the off-the-shelf text-to-image model, DiffLoRA enables zero-shot personalization during inference.
We introduce a novel identity-oriented LoRA weights construction pipeline to facilitate the training process of DiffLoRA.
arXiv Detail & Related papers (2024-08-13T09:00:35Z)
- Lifelong Personalized Low-Rank Adaptation of Large Language Models for Recommendation [50.837277466987345]
We focus on the field of large language models (LLMs) for recommendation.
We propose RecLoRA, which incorporates a Personalized LoRA module that maintains independent LoRAs for different users.
We also design a Few2Many Learning Strategy, using a conventional recommendation model as a lens to magnify small training spaces to full spaces.
arXiv Detail & Related papers (2024-08-07T04:20:28Z)
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- Implicit Style-Content Separation using B-LoRA [61.664293840163865]
We introduce B-LoRA, a method that implicitly separates the style and content components of a single image.
By analyzing the architecture of SDXL combined with LoRA, we find that jointly learning the LoRA weights of two specific blocks achieves style-content separation.
arXiv Detail & Related papers (2024-03-21T17:20:21Z)
- Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation [2.2356314962198836]
The objective of personalization and stylization in text-to-image generation is to instruct a pre-trained diffusion model to analyze new concepts introduced by users and incorporate them into expected styles.
We propose block-wise Low-Rank Adaptation (LoRA) to perform fine-grained fine-tuning for different blocks of Stable Diffusion (SD).
arXiv Detail & Related papers (2024-03-12T10:38:03Z)
- ResLoRA: Identity Residual Mapping in Low-Rank Adaption [96.59370314485074]
We propose ResLoRA, an improved framework for low-rank adaptation (LoRA).
Our method can achieve better results in fewer training steps without any extra trainable parameters or inference cost compared to LoRA.
The experiments on NLG, NLU, and text-to-image tasks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-02-28T04:33:20Z)
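Several of the entries above (LoRA.rar, DiffLoRA) replace per-pair merge optimization with a hypernetwork that predicts adapter weights or merge coefficients in a single forward pass. The sketch below is a generic, hypothetical illustration of that idea, not either paper's architecture: the feature summary (column norms), layer sizes, and class name are all invented for the example.

```python
# Hypothetical sketch: a tiny hypernetwork that maps a summary of a
# (content, style) LoRA pair to per-column merger coefficients, so merging
# needs no per-pair optimization at test time.
import torch
import torch.nn as nn

class MergerHypernet(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 256):
        super().__init__()
        # Input: column norms of the two LoRA deltas for one layer;
        # output: two coefficient vectors, one per delta.
        self.net = nn.Sequential(
            nn.Linear(2 * in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * in_dim),
        )

    def forward(self, delta_content: torch.Tensor, delta_style: torch.Tensor):
        feats = torch.cat([delta_content.norm(dim=0), delta_style.norm(dim=0)])
        m_content, m_style = self.net(feats).chunk(2)
        return m_content, m_style

# Usage: predict coefficients for an unseen pair and merge in one forward pass.
out_dim, in_dim, rank = 64, 64, 8
delta_content = torch.randn(out_dim, rank) @ torch.randn(rank, in_dim)
delta_style = torch.randn(out_dim, rank) @ torch.randn(rank, in_dim)
hypernet = MergerHypernet(in_dim)
m_content, m_style = hypernet(delta_content, delta_style)
merged = delta_content * m_content + delta_style * m_style
```

In practice such a hypernetwork would be pre-trained over many content-style LoRA pairs so that the predicted coefficients generalize to unseen pairs, trading a one-off training cost for near-instant merging at inference.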