A LoRA is Worth a Thousand Pictures
- URL: http://arxiv.org/abs/2412.12048v1
- Date: Mon, 16 Dec 2024 18:18:17 GMT
- Title: A LoRA is Worth a Thousand Pictures
- Authors: Chenxi Liu, Towaki Takikawa, Alec Jacobson,
- Abstract summary: Low Rank Adaptation (LoRA) can replicate an artist's style or subject using minimal data and computation.
We show that LoRA weights alone can serve as an effective descriptor of style, without the need for additional image generation or knowledge of the original training set.
We conclude with a discussion on potential future applications, such as zero-shot LoRA fine-tuning and model attribution.
- Score: 28.928964530616593
- Abstract: Recent advances in diffusion models and parameter-efficient fine-tuning (PEFT) have made text-to-image generation and customization widely accessible, with Low Rank Adaptation (LoRA) able to replicate an artist's style or subject using minimal data and computation. In this paper, we examine the relationship between LoRA weights and artistic styles, demonstrating that LoRA weights alone can serve as an effective descriptor of style, without the need for additional image generation or knowledge of the original training set. Our findings show that LoRA weights yield better performance in clustering of artistic styles compared to traditional pre-trained features, such as CLIP and DINO, with strong structural similarities between LoRA-based and conventional image-based embeddings observed both qualitatively and quantitatively. We identify various retrieval scenarios for the growing collection of customized models and show that our approach enables more accurate retrieval in real-world settings where knowledge of the training images is unavailable and additional generation is required. We conclude with a discussion on potential future applications, such as zero-shot LoRA fine-tuning and model attribution.
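The core idea can be illustrated with a minimal sketch: flatten a LoRA's low-rank update matrices into a single vector and compare styles by cosine similarity. All names and the toy data below are hypothetical; this does not reproduce the paper's actual pipeline or features.

```python
import numpy as np

def lora_style_embedding(lora_weights):
    """Flatten all low-rank update matrices into one vector and
    L2-normalize it, so styles compare by cosine similarity."""
    flat = np.concatenate([w.ravel() for w in lora_weights])
    return flat / np.linalg.norm(flat)

def style_similarity(lora_a, lora_b):
    """Cosine similarity between two LoRA-based style descriptors."""
    return float(lora_style_embedding(lora_a) @ lora_style_embedding(lora_b))

# Toy example: style_2 is a scaled copy of style_1 (same direction in
# weight space), while style_3 is an unrelated random adapter.
rng = np.random.default_rng(0)
style_1 = [rng.normal(size=(4, 8)), rng.normal(size=(8, 4))]
style_2 = [2.0 * w for w in style_1]
style_3 = [rng.normal(size=(4, 8)), rng.normal(size=(8, 4))]

assert style_similarity(style_1, style_2) > 0.99  # identical style direction
assert style_similarity(style_1, style_3) < 0.5   # unrelated style
```

Descriptors like this can then be fed to any standard clustering or retrieval method, which is the comparison the paper draws against CLIP and DINO image features.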
Related papers
- LoRA of Change: Learning to Generate LoRA for the Editing Instruction from A Single Before-After Image Pair [116.48684498656871]
We propose the LoRA of Change (LoC) framework for image editing with visual instructions, i.e., before-after image pairs.
We learn an instruction-specific LoRA to encode the "change" in a before-after image pair, enhancing the interpretability and reusability of our model.
Our model produces high-quality images that align with user intent and support a broad spectrum of real-world visual instructions.
arXiv Detail & Related papers (2024-11-28T13:55:06Z)
- DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion [43.55179971287028]
We propose DiffLoRA, an efficient method that leverages the diffusion model as a hypernetwork to predict personalized Low-Rank Adaptation weights.
By incorporating these LoRA weights into the off-the-shelf text-to-image model, DiffLoRA enables zero-shot personalization during inference.
We introduce a novel identity-oriented LoRA weights construction pipeline to facilitate the training process of DiffLoRA.
arXiv Detail & Related papers (2024-08-13T09:00:35Z)
- Dataset Size Recovery from LoRA Weights [41.031813850749174]
DSiRe is a method for recovering the number of images used to fine-tune a model.
We release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models.
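The task framing can be sketched as supervised prediction over weight-derived features. This toy illustration uses 1-nearest-neighbor over two simple statistics; the feature choice, synthetic data, and the assumed shrinking of weight magnitude with dataset size are illustrative only and are not the authors' method.

```python
import numpy as np

def weight_features(lora_matrix):
    """Toy per-layer features: Frobenius norm and top singular value.
    (Illustrative only -- not the features used by DSiRe.)"""
    return np.array([np.linalg.norm(lora_matrix),
                     np.linalg.svd(lora_matrix, compute_uv=False)[0]])

def predict_dataset_size(query, reference_feats, reference_sizes):
    """1-nearest-neighbor regression over weight-derived features."""
    dists = [np.linalg.norm(query - f) for f in reference_feats]
    return reference_sizes[int(np.argmin(dists))]

# Synthetic reference set: weight scale shrinks as dataset size grows
# (a made-up trend, purely for demonstration).
rng = np.random.default_rng(1)
sizes = [10, 100, 1000]
refs = [weight_features(rng.normal(scale=1.0 / np.log10(s + 1), size=(8, 8)))
        for s in sizes]

query = weight_features(rng.normal(scale=1.0 / np.log10(101), size=(8, 8)))
pred = predict_dataset_size(query, refs, sizes)
assert pred in sizes
```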
arXiv Detail & Related papers (2024-06-27T17:59:53Z)
- MuseumMaker: Continual Style Customization without Catastrophic Forgetting [50.12727620780213]
We propose MuseumMaker, a method that enables the synthesis of images following a set of customized styles in a never-ending manner.
When facing a new customization style, we develop a style distillation loss module to extract and learn the styles of the training data for new image generation.
It can minimize the learning biases caused by the content of new training images, and address the catastrophic overfitting issue induced by few-shot images.
arXiv Detail & Related papers (2024-04-25T13:51:38Z) - CLoRA: A Contrastive Approach to Compose Multiple LoRA Models [44.037664077117945]
Low-Rank Adaptations (LoRAs) have emerged as a powerful and popular technique in the field of image generation.
CLoRA addresses the problem of seamlessly blending multiple concept LoRAs to capture a variety of concepts in one image.
Our method enables the creation of composite images that truly reflect the characteristics of each LoRA.
arXiv Detail & Related papers (2024-03-28T18:58:43Z) - Implicit Style-Content Separation using B-LoRA [61.664293840163865]
We introduce B-LoRA, a method that implicitly separates the style and content components of a single image.
By analyzing the architecture of SDXL combined with LoRA, we find that jointly learning the LoRA weights of two specific blocks achieves style-content separation.
arXiv Detail & Related papers (2024-03)
- Multi-LoRA Composition for Image Generation [107.83002438126832]
We study multi-LoRA composition through a decoding-centric perspective.
We present two training-free methods: LoRA Switch, which alternates between different LoRAs at each denoising step, and LoRA Composite, which simultaneously incorporates all LoRAs to guide more cohesive image synthesis.
arXiv Detail & Related papers (2024-02-26T18:59:18Z)
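The LoRA Switch idea described above can be sketched as cycling through adapters across denoising steps. The function and adapter names below are hypothetical, and `step_fn` stands in for a real diffusion denoising step with the given adapter active.

```python
from itertools import cycle

def denoise_with_lora_switch(latent, loras, num_steps, step_fn):
    """Sketch of LoRA Switch: activate exactly one LoRA per denoising
    step, cycling through the set in round-robin order."""
    active = cycle(loras)
    for t in range(num_steps):
        latent = step_fn(latent, next(active), t)
    return latent

# Toy check: record which adapter is active at each step.
history = []
def fake_step(latent, lora, t):
    history.append(lora)
    return latent

denoise_with_lora_switch(latent=0.0, loras=["cat_lora", "style_lora"],
                         num_steps=4, step_fn=fake_step)
assert history == ["cat_lora", "style_lora", "cat_lora", "style_lora"]
```

LoRA Composite, by contrast, would apply all adapters within each step and aggregate their guidance rather than alternating among them.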
This list is automatically generated from the titles and abstracts of the papers in this site.