Conditional LoRA Parameter Generation
- URL: http://arxiv.org/abs/2408.01415v1
- Date: Fri, 2 Aug 2024 17:43:34 GMT
- Title: Conditional LoRA Parameter Generation
- Authors: Xiaolong Jin, Kai Wang, Dongwen Tang, Wangbo Zhao, Yukun Zhou, Junshu Tang, Yang You
- Abstract summary: We propose COND P-DIFF, a novel approach that demonstrates the feasibility of controllable high-performance parameter generation.
Experimental results in both computer vision and natural language processing domains consistently demonstrate that COND P-DIFF can generate high-performance parameters conditioned on the given task.
Our work paves the way for further exploration of condition-driven parameter generation, offering a promising direction for task-specific adaptation of neural networks.
- Score: 18.34892473337235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models have achieved remarkable success in image, video, and text domains. Inspired by this, researchers have explored utilizing generative models to generate neural network parameters. However, these efforts have been limited by the parameter size and the practicality of generating high-performance parameters. In this paper, we propose COND P-DIFF, a novel approach that demonstrates the feasibility of controllable high-performance parameter generation, particularly for LoRA (Low-Rank Adaptation) weights, during the fine-tuning process. Specifically, we employ an autoencoder to extract efficient latent representations for parameters. We then train a conditional latent diffusion model to synthesize high-performing model parameters from random noise based on specific task conditions. Experimental results in both computer vision and natural language processing domains consistently demonstrate that COND P-DIFF can generate high-performance parameters conditioned on the given task. Moreover, we observe that the parameter distribution generated by COND P-DIFF exhibits differences compared to the distribution obtained through normal optimization methods, indicating a certain level of generalization capability. Our work paves the way for further exploration of condition-driven parameter generation, offering a promising direction for task-specific adaptation of neural networks.
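The pipeline described in the abstract has two stages: compress fine-tuned LoRA weights into a latent space with an autoencoder, then train a conditional diffusion model over those latents. The sketch below is a minimal PyTorch illustration of that structure, not the authors' released code; all dimensions, module names, and the simplified DDPM-style noise schedule are assumptions.

```python
# Minimal sketch of a COND P-DIFF-style pipeline (dimensions, module names,
# and the simplified noise schedule are illustrative assumptions).
import torch
import torch.nn as nn

LORA_DIM, LATENT_DIM, COND_DIM, T = 4096, 256, 128, 1000

class ParamAutoencoder(nn.Module):
    """Compresses flattened LoRA weights into a low-dimensional latent."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(LORA_DIM, 1024), nn.SiLU(),
                                 nn.Linear(1024, LATENT_DIM))
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 1024), nn.SiLU(),
                                 nn.Linear(1024, LORA_DIM))
    def forward(self, w):
        z = self.enc(w)
        return self.dec(z), z

class CondDenoiser(nn.Module):
    """Predicts the noise added to a latent, given timestep and task condition."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM + COND_DIM + 1, 512),
                                 nn.SiLU(), nn.Linear(512, LATENT_DIM))
    def forward(self, z_t, t, cond):
        t_feat = (t.float() / T).unsqueeze(-1)          # crude timestep embedding
        return self.net(torch.cat([z_t, t_feat, cond], dim=-1))

# Stage 1: train the autoencoder on checkpoints of fine-tuned LoRA weights.
ae, denoiser = ParamAutoencoder(), CondDenoiser()
lora_weights = torch.randn(32, LORA_DIM)                # stand-in for saved checkpoints
task_cond = torch.randn(32, COND_DIM)                   # stand-in task embeddings
recon, z = ae(lora_weights)
ae_loss = nn.functional.mse_loss(recon, lora_weights)

# Stage 2: diffusion in latent space, conditioned on the task embedding.
betas = torch.linspace(1e-4, 2e-2, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)
t = torch.randint(0, T, (z.size(0),))
noise = torch.randn_like(z)
z_t = alpha_bar[t].sqrt().unsqueeze(-1) * z.detach() \
      + (1 - alpha_bar[t]).sqrt().unsqueeze(-1) * noise
diff_loss = nn.functional.mse_loss(denoiser(z_t, t, task_cond), noise)
print(ae_loss.item(), diff_loss.item())
```

At generation time, one would run the reverse diffusion chain from random noise under a new task condition and decode the resulting latent back into LoRA weights.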
Related papers
- Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems [2.7212274374272543]
We develop a joint conditional diffusion model-based inverse problem solver (JCDI).
JCDI incorporates a joint conditioning architecture with simultaneous inputs of multi-event observations to improve parameter generalizability.
Simulation studies on the WECC CLM show that the proposed JCDI effectively reduces uncertainties of degenerate parameters.
arXiv Detail & Related papers (2024-11-15T18:53:08Z)
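For intuition about the joint conditioning idea, here is a tiny, purely illustrative sketch in which observations from several events are encoded and concatenated into one conditioning vector for a diffusion denoiser; every name and shape is assumed, not taken from the JCDI paper.

```python
# Toy sketch of "joint conditioning": observations from several events are
# encoded separately and concatenated into one conditioning vector for a
# denoiser, so a single model covers all events. Shapes are illustrative.
import torch
import torch.nn as nn

n_events, obs_dim, param_dim = 3, 64, 10
event_encoder = nn.Linear(obs_dim, 32)
denoiser = nn.Linear(param_dim + n_events * 32 + 1, param_dim)

obs = torch.randn(8, n_events, obs_dim)                  # multi-event measurements
cond = event_encoder(obs).flatten(1)                     # joint condition, shape (8, 96)
noisy_params, t = torch.randn(8, param_dim), torch.full((8, 1), 0.5)
eps_hat = denoiser(torch.cat([noisy_params, cond, t], dim=-1))
print(eps_hat.shape)  # torch.Size([8, 10])
```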
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low-Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
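For reference, the standard LoRA update that LoRTA generalizes replaces a dense weight update with a low-rank product added to a frozen layer; LoRTA's tensor factorization shares factors further across layers. The sketch below shows only the plain low-rank form, with rank, scaling, and initialization chosen as illustrative defaults.

```python
# Minimal sketch of the standard LoRA update (rank, scaling, and
# initialization here are illustrative, not LoRTA's tensor variant).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in, d_out, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)           # frozen pre-trained weight
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))  # zero init: no change at start
        self.scale = alpha / rank
    def forward(self, x):
        # y = x W^T + scale * x A^T B^T, i.e. the update scale * B A is rank-r
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(768, 768)
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```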
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models.
We propose a novel model fine-tuning method to make full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z)
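A rough sketch of the idea suggested by the title and summary: treat low-magnitude entries of a pretrained weight as the trainable budget and leave the rest frozen. The threshold and masking scheme below are assumptions for illustration, not the paper's exact recipe.

```python
# Illustrative sketch: parameters below a magnitude threshold are treated as
# "ineffective" and re-used as the trainable budget; the rest stay frozen.
import torch
import torch.nn as nn

layer = nn.Linear(512, 512, bias=False)
with torch.no_grad():
    threshold = layer.weight.abs().quantile(0.1)          # bottom 10% by magnitude
    mask = (layer.weight.abs() < threshold).float()       # 1 where "ineffective"

frozen = layer.weight.detach().clone()
update = nn.Parameter(torch.zeros_like(frozen))           # trained only where mask == 1

def effective_weight():
    # Frozen values outside the mask; trainable values inside it.
    return frozen * (1 - mask) + update * mask

y = torch.randn(4, 512) @ effective_weight().t()
print(y.shape, int(mask.sum().item()), "trainable entries")
```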
- Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both singular values and their basis vectors of pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
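One way to picture spectrum-aware adaptation: decompose a frozen pretrained weight with SVD and fine-tune its singular values. The sketch below does only that (the basis-vector updates mentioned in the summary are omitted for brevity), and all names and scalings are illustrative.

```python
# Illustrative spectrum-aware adapter: SVD of a frozen weight, with the
# singular values rescaled by trainable parameters.
import torch
import torch.nn as nn

class SpectralAdapter(nn.Module):
    def __init__(self, pretrained_weight: torch.Tensor):
        super().__init__()
        U, S, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
        self.register_buffer("U", U)        # frozen left singular vectors
        self.register_buffer("Vh", Vh)      # frozen right singular vectors
        self.register_buffer("S", S)        # frozen spectrum
        self.log_scale = nn.Parameter(torch.zeros_like(S))  # trainable rescaling
    def weight(self):
        # Rescale each singular value; exp keeps the scale positive.
        return self.U @ torch.diag(self.S * self.log_scale.exp()) @ self.Vh

adapter = SpectralAdapter(torch.randn(512, 512))
y = torch.randn(4, 512) @ adapter.weight().t()
print(y.shape)  # torch.Size([4, 512])
```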
- ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
arXiv Detail & Related papers (2024-05-30T17:26:02Z)
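A hyperplane reflection is a Householder transform H = I - 2uu^T with a unit vector u, so fine-tuning a layer this way trains only one vector per weight matrix. The sketch below is an illustrative reading of that idea, not ETHER's actual implementation.

```python
# Illustrative adapter: the frozen weight is composed with a Householder
# reflection H = I - 2 u u^T, so only the vector u is trained.
import torch
import torch.nn as nn

class ReflectionAdapter(nn.Module):
    def __init__(self, pretrained_weight: torch.Tensor):
        super().__init__()
        self.register_buffer("W", pretrained_weight)    # frozen weight
        self.u = nn.Parameter(torch.randn(pretrained_weight.shape[0]))
    def forward(self, x):
        u = self.u / self.u.norm()                      # unit normal of the hyperplane
        h = x @ self.W.t()                              # frozen forward pass
        # (I - 2 u u^T) applied to h, without materializing the full reflection
        return h - 2.0 * (h @ u).unsqueeze(-1) * u

layer = ReflectionAdapter(torch.randn(768, 768))
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```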
- Sine Activated Low-Rank Matrices for Parameter Efficient Learning [25.12262017296922]
We propose a novel theoretical framework that integrates a sinusoidal function within the low-rank decomposition process.
Our method proves to be an enhancement for existing low-rank models, as evidenced by its successful application in Vision Transformers (ViTs), Large Language Models (LLMs), and Neural Radiance Fields (NeRF).
arXiv Detail & Related papers (2024-03-28T08:58:20Z)
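The sketch below illustrates the core trick as described in the abstract: applying an element-wise sinusoid to a low-rank product raises its effective rank without adding parameters. The frequency, scaling, and initialization are assumptions.

```python
# Illustrative sine-activated low-rank update: the element-wise sinusoid
# lifts the effective rank of B @ A while keeping LoRA's parameter count.
import torch
import torch.nn as nn

class SineLowRank(nn.Module):
    def __init__(self, d_in, d_out, rank=8, freq=200.0):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) / d_in ** 0.5)
        self.B = nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5)
        self.freq = freq
    def delta_weight(self):
        low_rank = self.B @ self.A                       # rank at most r
        return torch.sin(self.freq * low_rank) / self.freq

delta = SineLowRank(768, 768).delta_weight()
print(torch.linalg.matrix_rank(delta).item())            # typically far above the base rank
```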
- Low-Rank Representations Meets Deep Unfolding: A Generalized and Interpretable Network for Hyperspectral Anomaly Detection [41.50904949744355]
Current hyperspectral anomaly detection (HAD) benchmark datasets suffer from low resolution, simple background, and small size of the detection data.
These factors also limit the performance of the well-known low-rank representation (LRR) models in terms of robustness.
We build a new set of HAD benchmark datasets, AIR-HAD for short, to improve the robustness of HAD algorithms in complex scenarios.
arXiv Detail & Related papers (2024-02-23T14:15:58Z)
- Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)
- PriorCVAE: scalable MCMC parameter inference with Bayesian deep generative modelling [12.820453440015553]
Recent work has shown that Gaussian process (GP) priors can be encoded using deep generative models such as variational autoencoders (VAEs).
We show how VAEs can serve as drop-in replacements for the original priors during MCMC inference.
We propose PriorCVAE to encode solutions of ODEs.
arXiv Detail & Related papers (2023-04-09T20:23:26Z)
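The drop-in-replacement idea can be pictured as follows: a decoder trained to map a standard-normal latent to draws from an expensive prior is plugged into the log-posterior, so MCMC only has to explore the low-dimensional latent. The decoder below is untrained and every dimension is an illustrative assumption.

```python
# Illustrative use of a (stand-in) VAE decoder as a drop-in prior inside a
# log-posterior that an HMC/NUTS sampler would target.
import torch
import torch.nn as nn

latent_dim, grid_size = 16, 100
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(),
                        nn.Linear(64, grid_size))        # stands in for a trained decoder

def log_posterior(z, y_obs, noise_sd=0.1):
    f = decoder(z)                                       # decoded "GP draw" on the grid
    log_prior = -0.5 * (z ** 2).sum()                    # N(0, I) prior on the latent
    log_lik = -0.5 * ((y_obs - f) ** 2).sum() / noise_sd ** 2
    return log_prior + log_lik

y_obs = torch.randn(grid_size)
z = torch.zeros(latent_dim, requires_grad=True)
print(log_posterior(z, y_obs).item())
```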
- On the Parameter Combinations That Matter and on Those That do Not [0.0]
We present a data-driven approach to characterizing nonidentifiability of a model's parameters.
By employing Diffusion Maps and their extensions, we discover the minimal combinations of parameters required to characterize the dynamic output behavior.
arXiv Detail & Related papers (2021-10-13T13:46:23Z)
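As a toy illustration of detecting non-identifiable parameter combinations with a diffusion map (the synthetic model, kernel bandwidth, and all other details are assumptions, not the paper's setup): when a model's output depends only on the product p1*p2, the leading nontrivial diffusion-map coordinate of the outputs varies with that product rather than with p1 or p2 individually.

```python
# Toy diffusion map over model outputs that depend only on p1*p2.
import numpy as np

rng = np.random.default_rng(0)
params = rng.uniform(0.5, 2.0, size=(300, 2))             # samples of (p1, p2)
outputs = (params[:, 0] * params[:, 1])[:, None] * np.linspace(0, 1, 20)

# Gaussian kernel on pairwise output distances, row-normalized to a Markov matrix.
d2 = ((outputs[:, None, :] - outputs[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / np.median(d2))
P = K / K.sum(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eig(P)
order = np.argsort(-eigvals.real)
phi1 = eigvecs[:, order[1]].real                           # first nontrivial coordinate

# phi1 orders the samples by the identifiable combination p1*p2.
corr = np.corrcoef(phi1, params[:, 0] * params[:, 1])[0, 1]
print(f"correlation with p1*p2: {abs(corr):.2f}")
```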
- Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Generative Adversarial Networks (GANs) are trained by solving non-concave min-max optimization problems.
Prior theory has highlighted the importance of gradient descent (GD) dynamics for reaching globally optimal solutions.
We show that in an overparameterized GAN with a one-layer neural network generator and a linear discriminator, gradient descent ascent (GDA) converges to a global saddle point of the underlying non-concave min-max problem.
arXiv Detail & Related papers (2021-04-12T16:23:37Z)