Parameter-Efficient MoE LoRA for Few-Shot Multi-Style Editing
- URL: http://arxiv.org/abs/2511.11236v2
- Date: Fri, 21 Nov 2025 04:59:34 GMT
- Title: Parameter-Efficient MoE LoRA for Few-Shot Multi-Style Editing
- Authors: Cong Cao, Yujie Xu, Xiaodong Xu,
- Abstract summary: We propose a parameter-efficient multi-style Mixture-of-Experts Low-Rank Adaptation (MoE LoRA) with style-specific and style-shared routing mechanisms. Our proposed method outperforms existing state-of-the-art approaches with significantly fewer LoRA parameters.
- Score: 6.95397292284568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, image editing has garnered growing attention. However, general image editing models often fail to produce satisfactory results when confronted with new styles. The challenge lies in how to effectively fine-tune general image editing models to new styles using only a limited amount of paired data. To address this issue, this paper proposes a novel few-shot style editing framework. For this task, we construct a benchmark dataset that encompasses five distinct styles. Correspondingly, we propose a parameter-efficient multi-style Mixture-of-Experts Low-Rank Adaptation (MoE LoRA) with style-specific and style-shared routing mechanisms for jointly fine-tuning multiple styles. The style-specific routing ensures that different styles do not interfere with one another, while the style-shared routing adaptively allocates shared MoE LoRAs to learn common patterns. Our MoE LoRA can automatically determine the optimal ranks for each layer through a novel metric-guided approach that estimates the importance score of each single-rank component. Additionally, we explore the optimal location to insert LoRA within the Diffusion Transformer (DiT) model and integrate adversarial learning and flow matching to guide the diffusion training process. Experimental results demonstrate that our proposed method outperforms existing state-of-the-art approaches with significantly fewer LoRA parameters.
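The routing scheme described in the abstract can be sketched in a few lines of NumPy: each style gets a private LoRA expert selected by a hard lookup on the style id, while a softmax gate mixes a pool of shared LoRA experts. This is an illustrative sketch only; the class name, shapes, gate design, and initialization below are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lora_delta(x, A, B, alpha=1.0):
    # Low-rank residual update: alpha * x @ A @ B.
    # A: (d_in, r), B: (r, d_out); B starts at zero, so the delta is zero at init.
    return alpha * (x @ A) @ B

def make_expert(d_in, d_out, rank):
    A = rng.normal(scale=0.02, size=(d_in, rank))
    B = np.zeros((rank, d_out))
    return A, B

class MoELoRALayer:
    # Hypothetical sketch of one MoE LoRA layer with style-specific and
    # style-shared routing; all names and hyperparameters are assumptions.
    def __init__(self, d_in, d_out, rank, n_styles, n_shared):
        self.W = rng.normal(scale=0.02, size=(d_in, d_out))  # frozen base weight
        # Style-specific experts: hard-routed by style id, so different
        # styles cannot interfere with one another.
        self.style_experts = [make_expert(d_in, d_out, rank) for _ in range(n_styles)]
        # Style-shared experts: softly routed from the input to capture
        # patterns common across styles.
        self.shared_experts = [make_expert(d_in, d_out, rank) for _ in range(n_shared)]
        self.gate = rng.normal(scale=0.02, size=(d_in, n_shared))

    def __call__(self, x, style_id):
        # Style-specific routing: select the private expert for this style.
        A_s, B_s = self.style_experts[style_id]
        y = x @ self.W + lora_delta(x, A_s, B_s)
        # Style-shared routing: a softmax gate adaptively mixes shared experts.
        logits = x @ self.gate
        w = np.exp(logits - logits.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        for k, (A, B) in enumerate(self.shared_experts):
            y += w[..., k:k + 1] * lora_delta(x, A, B)
        return y
```

Because every expert's `B` matrix is zero-initialized, the layer reproduces the frozen base mapping exactly at the start of fine-tuning, which is the standard LoRA initialization choice; the metric-guided per-layer rank selection from the abstract is not modeled here.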
Related papers
- LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models [10.732709225098342]
Low-rank Adaptation (LoRA) models have revolutionized the personalization of pre-trained diffusion models. Despite the availability of over 100K LoRA adapters on platforms like Civit.ai, users often face challenges in navigating, selecting, and effectively utilizing the most suitable adapters.
arXiv Detail & Related papers (2025-10-16T17:59:45Z) - Neural Scene Designer: Self-Styled Semantic Image Manipulation [67.43125248646653]
We introduce the Neural Scene Designer (NSD), a novel framework that enables photo-realistic manipulation of user-specified scene regions. NSD ensures both semantic alignment with user intent and stylistic consistency with the surrounding environment. To capture fine-grained style representations, we propose the Progressive Self-style Representational Learning (PSRL) module.
arXiv Detail & Related papers (2025-09-01T11:59:03Z) - Subject or Style: Adaptive and Training-Free Mixture of LoRAs [3.8443430569753025]
EST-LoRA is a training-free adaptive LoRA fusion method. It considers three critical factors: the Energy of the matrix, Style discrepancy scores, and Time steps. It outperforms state-of-the-art methods in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-08-04T08:05:18Z) - Dance Like a Chicken: Low-Rank Stylization for Human Motion Diffusion [28.94750481325469]
We introduce LoRA-MDM, a framework for motion stylization that generalizes to complex actions while maintaining editability. Our key insight is that adapting the generative prior to include the style, while preserving its overall distribution, is more effective than modifying each individual motion during generation. LoRA-MDM learns to adapt the prior to include the reference style using only a few samples.
arXiv Detail & Related papers (2025-03-25T11:23:34Z) - LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation [28.098287135605364]
We introduce LoRA.rar, a method that improves image quality and achieves a remarkable speedup of over 4000× in the merging process. Our method significantly outperforms the current state of the art in both content and style fidelity, as validated by MLLM assessments and human evaluations.
arXiv Detail & Related papers (2024-12-06T16:04:56Z) - MuseumMaker: Continual Style Customization without Catastrophic Forgetting [50.12727620780213]
We propose MuseumMaker, a method that enables the synthesis of images by following a set of customized styles in a never-ending manner.
When facing a new customization style, we develop a style distillation loss module to extract and learn the styles of the training data for new image generation.
It can minimize the learning biases caused by the content of new training images, and address the catastrophic overfitting issue induced by few-shot images.
arXiv Detail & Related papers (2024-04-25T13:51:38Z) - Implicit Style-Content Separation using B-LoRA [61.664293840163865]
We introduce B-LoRA, a method that implicitly separates the style and content components of a single image.
By analyzing the architecture of SDXL combined with LoRA, we find that jointly learning the LoRA weights of two specific blocks achieves style-content separation.
arXiv Detail & Related papers (2024-03-21T17:20:21Z) - ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs [56.85106417530364]
Low-rank adaptations (LoRA) have been proposed as a parameter-efficient way of achieving concept-driven personalization.
We propose ZipLoRA, a method to cheaply and effectively merge independently trained style and subject LoRAs.
Experiments show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity.
arXiv Detail & Related papers (2023-11-22T18:59:36Z) - StyleAdapter: A Unified Stylized Image Generation Model [97.24936247688824]
StyleAdapter is a unified stylized image generation model capable of producing a variety of stylized images.
It can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet.
arXiv Detail & Related papers (2023-09-04T19:16:46Z) - Learning Graph Neural Networks for Image Style Transfer [131.73237185888215]
State-of-the-art parametric and non-parametric style transfer approaches are prone to either distorted local style patterns due to global statistics alignment, or unpleasing artifacts resulting from patch mismatching.
In this paper, we study a novel semi-parametric neural style transfer framework that alleviates the deficiency of both parametric and non-parametric stylization.
arXiv Detail & Related papers (2022-07-24T07:41:31Z) - StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval [119.03470556503942]
The cross-modal matching problem is typically solved by learning a joint embedding space in which the semantic content shared between photo and sketch modalities is preserved.
An effective model needs to explicitly account for this style diversity and, crucially, generalize to unseen user styles.
Our model can not only disentangle the cross-modal shared semantic content, but also adapt the disentanglement to any unseen user style, making the model truly agnostic.
arXiv Detail & Related papers (2021-03-29T15:44:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.