CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion
- URL: http://arxiv.org/abs/2602.18936v3
- Date: Fri, 27 Feb 2026 04:20:24 GMT
- Title: CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion
- Authors: Yu Li, Yujun Cai, Chi Zhang,
- Abstract summary: Low-Rank Adaptation (LoRA) offers an efficient personalization approach, with potential for precise control through combining LoRA weights on different concepts. Existing combination techniques face persistent challenges: entanglement between content and style representations, insufficient guidance for controlling elements' influence, and unstable weight fusion that often requires additional training. We address these limitations through CRAFT-LoRA, with three complementary components: (1) rank-constrained backbone fine-tuning that injects low-rank projection residuals to encourage learning decoupled content and style subspaces; (2) a prompt-guided approach featuring an expert encoder with specialized branches that enables semantic extension and precise control
- Score: 27.087994191559904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personalized image generation requires effectively balancing content fidelity with stylistic consistency when synthesizing images based on text and reference examples. Low-Rank Adaptation (LoRA) offers an efficient personalization approach, with potential for precise control through combining LoRA weights on different concepts. However, existing combination techniques face persistent challenges: entanglement between content and style representations, insufficient guidance for controlling elements' influence, and unstable weight fusion that often requires additional training. We address these limitations through CRAFT-LoRA, with three complementary components: (1) rank-constrained backbone fine-tuning that injects low-rank projection residuals to encourage learning decoupled content and style subspaces; (2) a prompt-guided approach featuring an expert encoder with specialized branches that enables semantic extension and precise control through selective adapter aggregation; and (3) a training-free, timestep-dependent classifier-free guidance scheme that enhances generation stability by strategically adjusting noise predictions across diffusion steps. Our method significantly improves content-style disentanglement, enables flexible semantic control over LoRA module combinations, and achieves high-fidelity generation without additional retraining overhead.
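The abstract's third component, timestep-dependent classifier-free guidance, can be illustrated with a minimal sketch. The abstract does not specify the schedule, so the linear annealing below (stronger guidance at high-noise timesteps, weaker near the end) and the `w_max`/`w_min` values are assumptions, not the paper's actual scheme:

```python
import numpy as np

def guided_noise(eps_uncond, eps_cond, t, T, w_max=7.5, w_min=2.0):
    """Timestep-dependent classifier-free guidance (illustrative sketch).

    Standard CFG uses a fixed guidance weight w; here w is annealed
    across diffusion steps. The linear schedule and the w_max/w_min
    defaults are hypothetical choices for illustration only.
    """
    # Anneal the guidance weight from w_max (t = T, pure noise)
    # down to w_min (t = 0, nearly denoised).
    w = w_min + (w_max - w_min) * (t / T)
    # Classic CFG combination of unconditional and conditional predictions.
    return eps_uncond + w * (eps_cond - eps_uncond)
```

At `t = T` the conditional prediction is amplified most strongly; real schemes may instead modulate the weight per adapter or non-linearly.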
Related papers
- Dynamic Training-Free Fusion of Subject and Style LoRAs [38.73465144699025]
We propose a training-free fusion framework that operates throughout the generation process. Our approach consistently outperforms state-of-the-art LoRA fusion methods both qualitatively and quantitatively.
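The training-free fusion these papers describe starts from a simple baseline: adding scaled low-rank residuals from a content LoRA and a style LoRA onto a frozen base weight. The sketch below shows only that naive baseline; the per-adapter scales `alpha_c`/`alpha_s` are hypothetical knobs, and the actual methods modulate fusion per layer or per timestep rather than with fixed scalars:

```python
import numpy as np

def fuse_loras(W, content, style, alpha_c=0.7, alpha_s=0.7):
    """Naive training-free fusion of a content LoRA and a style LoRA.

    Each LoRA is a (B, A) pair of low-rank factors whose product is a
    residual added to the frozen base weight W. The fixed scalar scales
    here are an illustrative assumption, not either paper's method.
    """
    B_c, A_c = content
    B_s, A_s = style
    # W' = W + alpha_c * B_c A_c + alpha_s * B_s A_s
    return W + alpha_c * (B_c @ A_c) + alpha_s * (B_s @ A_s)
```

This baseline is exactly where the entanglement problem arises: with fixed scalars, content and style residuals compete in the same subspace, which motivates the dynamic schemes above.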
arXiv Detail & Related papers (2026-02-17T12:42:30Z)
- OmniVL-Guard: Towards Unified Vision-Language Forgery Detection and Grounding via Balanced RL [63.388513841293616]
Existing forgery detection methods fail to handle the interleaved text, images, and videos prevalent in real-world misinformation. To bridge this gap, this paper aims to develop a unified framework for omnibus vision-language forgery detection and grounding. We propose OmniVL-Guard, a balanced reinforcement learning framework for omnibus vision-language forgery detection and grounding.
arXiv Detail & Related papers (2026-02-11T09:41:36Z)
- UnHype: CLIP-Guided Hypernetworks for Dynamic LoRA Unlearning [3.8373805990749266]
UnHype is a framework that incorporates hypernetworks into single- and multi-concept Low-Rank Adaptation (LoRA) training. During inference, the hypernetwork dynamically generates adaptive LoRA weights based on the CLIP embedding. We evaluate UnHype across several challenging tasks, including object erasure, celebrity erasure, and explicit content removal.
arXiv Detail & Related papers (2026-02-03T11:37:08Z)
- An Integrated Fusion Framework for Ensemble Learning Leveraging Gradient Boosting and Fuzzy Rule-Based Models [59.13182819190547]
Fuzzy rule-based models excel in interpretability and have seen widespread application across diverse fields. They face challenges such as complex design specifications and scalability issues with large datasets. This paper proposes an Integrated Fusion Framework that merges the strengths of both paradigms to enhance model performance and interpretability.
arXiv Detail & Related papers (2025-11-11T10:28:23Z)
- Steerable Adversarial Scenario Generation through Test-Time Preference Alignment [58.37104890690234]
Adversarial scenario generation is a cost-effective approach for safety assessment of autonomous driving systems. We introduce a new framework named Steerable Adversarial scenario GEnerator (SAGE). SAGE enables fine-grained test-time control over the trade-off between adversariality and realism without any retraining.
arXiv Detail & Related papers (2025-09-24T13:27:35Z)
- AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation [32.46570968627392]
Low-rank adaptation (LoRA) has demonstrated efficacy in enabling model customization with minimal parameter overhead. We introduce a novel framework that enables semantic-driven LoRA retrieval and dynamic aggregation. Our approach achieves significant improvement in image generation performance.
arXiv Detail & Related papers (2025-08-04T06:36:00Z)
- Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts [79.18608192761512]
Self-Explainable Models (SEMs) rely on Prototypical Concept Learning (PCL) to make their visual recognition processes more interpretable. We propose a Few-Shot Prototypical Concept Classification framework that mitigates two key challenges under low-data regimes: parametric imbalance and representation misalignment. Our approach consistently outperforms existing SEMs by a notable margin, with 4.2%-8.7% relative gains in 5-way 5-shot classification.
arXiv Detail & Related papers (2025-06-05T06:39:43Z)
- MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance [44.4839416120775]
MultLFG is a framework for training-free multi-LoRA composition. It uses frequency-domain guidance to achieve adaptive fusion of multiple LoRAs. It substantially enhances compositional fidelity and image quality across various styles and concept sets.
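The frequency-domain idea can be conveyed with a toy sketch: split two signals in the Fourier domain and keep low frequencies from one, high frequencies from the other. MultLFG's actual guidance operates on LoRA noise predictions during sampling; the radial low-pass mask and the `cutoff` value below are illustrative assumptions only:

```python
import numpy as np

def frequency_blend(x_a, x_b, cutoff=0.25):
    """Toy frequency-domain blend: low frequencies from x_a, high from x_b.

    A radial mask in the 2D Fourier domain selects which source each
    frequency band comes from. The cutoff radius is an arbitrary choice
    for illustration, not a value from the paper.
    """
    Fa, Fb = np.fft.fft2(x_a), np.fft.fft2(x_b)
    h, w = x_a.shape
    # Radial frequency coordinates in cycles/sample, range [-0.5, 0.5).
    yy, xx = np.meshgrid(np.fft.fftfreq(h), np.fft.fftfreq(w), indexing="ij")
    low = np.sqrt(yy**2 + xx**2) < cutoff
    # Take coarse structure from Fa, fine detail from Fb.
    return np.real(np.fft.ifft2(np.where(low, Fa, Fb)))
```

Blending a signal with itself returns the signal unchanged, which is a quick sanity check that the mask partitions the spectrum cleanly.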
arXiv Detail & Related papers (2025-05-26T21:05:28Z) - Continuous Knowledge-Preserving Decomposition with Adaptive Layer Selection for Few-Shot Class-Incremental Learning [73.59672160329296]
CKPD-FSCIL is a unified framework that unlocks the underutilized capacity of pretrained weights.<n>Our method consistently outperforms state-of-the-art approaches in both adaptability and knowledge retention.
arXiv Detail & Related papers (2025-01-09T07:18:48Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control [43.96257216397601]
We propose a new plug-and-play solution for training-free personalization of diffusion models.
RB-Modulation is built on a novel optimal controller where a style descriptor encodes the desired attributes.
A cross-attention-based feature aggregation scheme allows RB-Modulation to decouple content and style from the reference image.
arXiv Detail & Related papers (2024-05-27T17:51:08Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework: the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.