TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation
- URL: http://arxiv.org/abs/2405.11236v2
- Date: Thu, 13 Jun 2024 04:42:23 GMT
- Title: TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation
- Authors: Chengcheng Feng, Mu He, Qiuyu Tian, Haojie Yin, Xiaofang Zhao, Hongwei Tang, Xingqiang Wei,
- Abstract summary: We propose an innovative method that integrates Singular Value Decomposition into the Low-Rank Adaptation (LoRA) parameter update strategy.
By incorporating SVD within the LoRA framework, our method not only effectively reduces the risk of overfitting but also enhances the stability of model outputs.
- Score: 5.195293792493412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep learning technology continues to advance, image generation models, especially models like Stable Diffusion, are finding increasingly widespread application in visual arts creation. However, these models often face challenges such as overfitting, lack of stability in generated results, and difficulties in accurately capturing the features desired by creators during the fine-tuning process. In response to these challenges, we propose an innovative method that integrates Singular Value Decomposition (SVD) into the Low-Rank Adaptation (LoRA) parameter update strategy, aimed at enhancing the fine-tuning efficiency and output quality of image generation models. By incorporating SVD within the LoRA framework, our method not only effectively reduces the risk of overfitting but also enhances the stability of model outputs, and captures subtle, creator-desired feature adjustments more accurately. We evaluated our method on multiple datasets, and the results show that, compared to traditional fine-tuning methods, our approach significantly improves the model's generalization ability and creative flexibility while maintaining the quality of generation. Moreover, this method maintains LoRA's excellent performance under resource-constrained conditions, allowing for significant improvements in image generation quality without sacrificing the original efficiency and resource advantages.
Related papers
- Visual Autoregressive Modeling for Image Super-Resolution [14.935662351654601]
We propose a novel visual autoregressive modeling for ISR framework with the form of next-scale prediction.
We collect large-scale data and design a training process to obtain robust generative priors.
arXiv Detail & Related papers (2025-01-31T09:53:47Z) - Enhancing Diffusion Models for High-Quality Image Generation [0.0]
This report presents the comprehensive implementation, evaluation, and optimization of Denoising Diffusion Probabilistic Models (DDPMs) and Denoising Diffusion Implicit Models (DDIMs)
During inference, these models take random noise as input and iteratively generate high-quality images as output.
The motivation behind this work is the growing demand for efficient and scalable generative AI models that can produce realistic images across diverse datasets.
arXiv Detail & Related papers (2024-12-19T00:23:15Z) - Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data.
This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns.
We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z) - Reward Incremental Learning in Text-to-Image Generation [26.64026346266299]
We present Reward Incremental Distillation (RID), a method that mitigates forgetting with minimal computational overhead.
The experimental results demonstrate the efficacy of RID in achieving consistent, high-quality gradient generation in RIL scenarios.
arXiv Detail & Related papers (2024-11-26T10:54:33Z) - Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance [0.0]
diffusion models are still challenged by model-induced artifacts and limited stability in image fidelity.
We propose the integration of alias-free resampling layers into the UNet architecture of diffusion models.
Our experimental results on benchmark datasets, including CIFAR-10, MNIST, and MNIST-M, reveal consistent gains in image quality.
arXiv Detail & Related papers (2024-11-14T04:23:28Z) - Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors.
We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods.
Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z) - YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences.
We analyze how these choices affect both the efficiency of the training process and the quality of the generated images.
We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z) - TCIG: Two-Stage Controlled Image Generation with Quality Enhancement
through Diffusion [0.0]
A two-stage method that combines controllability and high quality in the generation of images is proposed.
By separating controllability from high quality, This method achieves outstanding results.
arXiv Detail & Related papers (2024-03-02T13:59:02Z) - Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z) - A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called em generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z) - Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.