AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation
- URL: http://arxiv.org/abs/2508.02107v1
- Date: Mon, 04 Aug 2025 06:36:00 GMT
- Title: AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation
- Authors: Zhiwen Li, Zhongjie Duan, Die Chen, Cen Chen, Daoyuan Chen, Yaliang Li, Yingda Chen
- Abstract summary: Low-rank adaptation (LoRA) has demonstrated efficacy in enabling model customization with minimal parameter overhead. We introduce a novel framework that enables semantic-driven LoRA retrieval and dynamic aggregation. Our approach achieves significant improvement in image generation performance.
- Score: 32.46570968627392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite recent advances in photorealistic image generation through large-scale models like FLUX and Stable Diffusion v3, the practical deployment of these architectures remains constrained by their inherent intractability to parameter fine-tuning. While low-rank adaptation (LoRA) has demonstrated efficacy in enabling model customization with minimal parameter overhead, the effective utilization of distributed open-source LoRA modules faces three critical challenges: sparse metadata annotation, the requirement for zero-shot adaptation capabilities, and suboptimal strategies for multi-LoRA fusion. To address these limitations, we introduce a novel framework that enables semantic-driven LoRA retrieval and dynamic aggregation through two key components: (1) a weight-encoding-based LoRA retriever that establishes a shared semantic space between LoRA parameter matrices and text prompts, eliminating dependence on original training data, and (2) a fine-grained gated fusion mechanism that computes context-specific fusion weights across network layers and diffusion timesteps to optimally integrate multiple LoRA modules during generation. Our approach achieves significant improvement in image generation performance, thereby facilitating scalable and data-efficient enhancement of foundational models. This work establishes a critical bridge between the fragmented landscape of community-developed LoRAs and practical deployment requirements, enabling collaborative model evolution through standardized adapter integration.
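The two components described in the abstract lend themselves to a compact sketch. The following is a minimal PyTorch illustration, not the authors' implementation: the names (LoRAWeightEncoder, retrieve_loras, GatedFusion), the MLP encoder shape, and the flattening of LoRA matrices into a fixed-size vector are all assumptions made for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# (1) Weight-encoding-based retrieval: each LoRA's low-rank matrices are
# encoded into the same embedding space as text prompts, so candidates can
# be scored by cosine similarity without access to their training data.
class LoRAWeightEncoder(nn.Module):
    def __init__(self, in_dim: int, embed_dim: int = 768):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.GELU(), nn.Linear(1024, embed_dim)
        )

    def forward(self, flat_lora: torch.Tensor) -> torch.Tensor:
        # flat_lora: (N, in_dim) flattened (A, B) parameters, one row per LoRA
        return F.normalize(self.proj(flat_lora), dim=-1)

def retrieve_loras(prompt_emb: torch.Tensor, lora_embs: torch.Tensor, k: int = 3):
    """Indices of the top-k LoRAs by cosine similarity to the prompt."""
    scores = F.normalize(prompt_emb, dim=-1) @ lora_embs.T  # (1, N)
    return scores.topk(k, dim=-1).indices.squeeze(0)

# (2) Fine-grained gated fusion: a small gate predicts per-LoRA weights
# conditioned on the prompt, the network layer, and the diffusion timestep,
# so each retrieved LoRA contributes differently at different depths/steps.
class GatedFusion(nn.Module):
    def __init__(self, num_loras: int, num_layers: int, embed_dim: int = 768):
        super().__init__()
        self.layer_emb = nn.Embedding(num_layers, embed_dim)
        self.gate = nn.Linear(2 * embed_dim + 1, num_loras)

    def forward(self, prompt_emb, layer_idx, t):
        ctx = torch.cat(
            [prompt_emb, self.layer_emb(layer_idx), t.view(1, 1).float()], dim=-1
        )
        return F.softmax(self.gate(ctx), dim=-1)  # fusion weights, sum to 1

# Hypothetical end-to-end use: 10 indexed LoRAs, one prompt, layer 5, t = 400.
enc = LoRAWeightEncoder(in_dim=4096)
lora_embs = enc(torch.randn(10, 4096))
prompt_emb = F.normalize(torch.randn(1, 768), dim=-1)
top3 = retrieve_loras(prompt_emb, lora_embs, k=3)
w = GatedFusion(num_loras=3, num_layers=24)(prompt_emb, torch.tensor([5]), torch.tensor(400))
```

At generation time the predicted weights would scale each retrieved LoRA's low-rank update, i.e. delta_W = sum_i w_i * (alpha_i / r_i) * B_i A_i, recomputed per layer and per timestep rather than fixed once for the whole sampling trajectory.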
Related papers
- Cross-LoRA: A Data-Free LoRA Transfer Framework across Heterogeneous LLMs [10.218401136555064]
Cross-LoRA is a framework for transferring LoRA modules between diverse base models.
Experiments show that Cross-LoRA achieves relative gains of up to 5.26% over base models.
arXiv Detail & Related papers (2025-08-07T10:21:08Z)
- Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR.
This method is designed through unfolding an SR optimization function constrained by structural similarity.
Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
- LoRA-Gen: Specializing Large Language Model via Online LoRA Generation [68.01864057372067]
We propose the LoRA-Gen framework to generate LoRA parameters for edge-side models based on task descriptions.
We merge the LoRA parameters into the edge-side model to achieve flexible specialization.
Our method facilitates knowledge transfer between models while significantly improving the inference efficiency of the specialized model.
arXiv Detail & Related papers (2025-06-13T10:11:01Z)
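The LoRA-Gen entry above notes that generated LoRA parameters are merged into the edge-side model. The merge step itself is the standard LoRA reparameterization; a minimal sketch follows, with the (alpha / r) scaling convention assumed rather than taken from that paper:

```python
import torch

def merge_lora(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
               alpha: float = 16.0, r: int = 8) -> torch.Tensor:
    """Fold a LoRA update into a base weight: W' = W + (alpha / r) * B @ A.

    W: (out_dim, in_dim) base weight, A: (r, in_dim), B: (out_dim, r).
    Once merged, inference runs at the base model's cost with no extra
    adapter matmuls, which is what makes edge-side specialization cheap.
    """
    return W + (alpha / r) * (B @ A)

# B is conventionally zero-initialized, so a freshly initialized LoRA
# merges to the unchanged base weight.
W = torch.randn(768, 768)
A, B = torch.randn(8, 768) * 0.01, torch.zeros(768, 8)
assert torch.equal(merge_lora(W, A, B), W)
```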
- Efficient Multi-Instance Generation with Janus-Pro-Driven Prompt Parsing [53.295515505026096]
Janus-Pro-driven Prompt Parsing is a prompt-parsing module that bridges text understanding and layout generation.
MIGLoRA is a parameter-efficient plug-in integrating Low-Rank Adaptation into UNet (SD1.5) and DiT (SD3) backbones.
The proposed method achieves state-of-the-art performance on COCO and LVIS benchmarks while maintaining parameter efficiency.
arXiv Detail & Related papers (2025-03-27T00:59:14Z)
- AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs [5.018961516699825]
AsymLoRA is a parameter-efficient tuning framework that unifies knowledge modularization and cross-modal coordination.
AsymLoRA consistently surpasses both vanilla LoRA, which captures only commonalities, and LoRA-MoE, which focuses solely on conflicts.
arXiv Detail & Related papers (2025-02-27T12:21:02Z)
- Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs [76.40876036912537]
Large Language Models (LLMs) demonstrate strong few-shot adaptability without requiring fine-tuning.
Current Visual Foundation Models (VFMs) require explicit fine-tuning with sufficient tuning data.
We propose a framework, LoRA Recycle, that distills a meta-LoRA from diverse pre-tuned LoRAs with a meta-learning objective.
arXiv Detail & Related papers (2024-12-03T07:25:30Z)
- LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement [5.162783756846019]
Foundation models (FMs) achieve strong performance across diverse tasks with task-specific fine-tuning.
Parameter-efficient methods like Low-Rank Adaptation (LoRA) reduce this cost by introducing low-rank matrices for tuning fewer parameters.
LoRA-FAIR maintains computational and communication efficiency, yielding superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-11-22T14:19:01Z)
- HAFLQ: Heterogeneous Adaptive Federated LoRA Fine-tuned LLM with Quantization [55.972018549438964]
Federated fine-tuning of pre-trained Large Language Models (LLMs) enables task-specific adaptation across diverse datasets while preserving privacy.
We propose HAFLQ (Heterogeneous Adaptive Federated Low-Rank Adaptation Fine-tuned LLM with Quantization), a novel framework for efficient and scalable fine-tuning of LLMs in heterogeneous environments.
Experimental results on the text classification task demonstrate that HAFLQ reduces memory usage by 31%, lowers communication cost by 49%, improves accuracy by 50%, and achieves faster convergence compared to the baseline method.
arXiv Detail & Related papers (2024-11-10T19:59:54Z)
- LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration [62.3751291442432]
We propose LoRA-IR, a flexible framework that dynamically leverages compact low-rank experts to facilitate efficient all-in-one image restoration.
LoRA-IR consists of two training stages: degradation-guided pre-training and parameter-efficient fine-tuning.
Experiments demonstrate that LoRA-IR achieves SOTA performance across 14 IR tasks and 29 benchmarks, while maintaining computational efficiency.
arXiv Detail & Related papers (2024-10-20T13:00:24Z)
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)