Related papers: FouRA: Fourier Low Rank Adaptation

URL: http://arxiv.org/abs/2406.08798v1
Date: Thu, 13 Jun 2024 04:27:37 GMT
Title: FouRA: Fourier Low Rank Adaptation
Authors: Shubhankar Borse, Shreya Kadambi, Nilesh Prasad Pandey, Kartikeya Bhardwaj, Viswanath Ganapathy, Sweta Priyadarshi, Risheek Garrepalli, Rafael Esteves, Munawar Hayat, Fatih Porikli
Abstract summary: We present FouRA, a novel low-rank method that learns projections in the Fourier domain. We show that FouRA successfully solves the problems related to data copying and distribution collapse. We also demonstrate its merits for language tasks on the GLUE benchmark.
Score: 47.485305992204935
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Low-Rank Adaptation (LoRA) has proven beneficial for efficiently fine-tuning large models, LoRA fine-tuned text-to-image diffusion models lack diversity in the generated images, as the model tends to copy data from the observed training samples. This effect becomes more pronounced at higher values of adapter strength and for adapters with higher ranks which are fine-tuned on smaller datasets. To address these challenges, we present FouRA, a novel low-rank method that learns projections in the Fourier domain along with learning a flexible input-dependent adapter rank selection strategy. Through extensive experiments and analysis, we show that FouRA successfully solves the problems related to data copying and distribution collapse while significantly improving the generated image quality. We demonstrate that FouRA enhances the generalization of fine-tuned models thanks to its adaptive rank selection. We further show that the learned projections in the frequency domain are decorrelated and prove effective when merging multiple adapters. While FouRA is motivated for vision tasks, we also demonstrate its merits for language tasks on the GLUE benchmark.

Related papers

WaRA: Wavelet Low Rank Adaptation [4.5875111164923545]
WaRA is a novel PEFT method that decomposes the weight update matrix into a multi-resolution representation. We demonstrate that WaRA achieves superior performance on diverse vision tasks, including image generation, classification, and semantic segmentation.
arXiv Detail & Related papers (2025-06-25T07:31:40Z)
LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance [96.6544564242316]
We introduce a novel dataset generation framework named LoFT, LoRA-Fused Training-data Generation with Few-shot Guidance. Our method fine-tunes LoRA weights on individual real images and fuses them at inference time, producing synthetic images that combine the features of real images for improved diversity and fidelity of generated data. Our experiments show that training on LoFT-generated data consistently outperforms other synthetic dataset methods, significantly increasing accuracy as the dataset size increases.
arXiv Detail & Related papers (2025-05-16T21:17:55Z)
Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models. This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
arXiv Detail & Related papers (2025-05-01T06:15:38Z)
Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory [27.651921957220004]
We introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM). We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL. We also introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples.
arXiv Detail & Related papers (2024-05-22T20:59:18Z)
Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing. Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior rather than the input image. To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution. We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain. Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
arXiv Detail & Related papers (2024-02-28T09:27:41Z)
Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
X-Transfer: A Transfer Learning-Based Framework for GAN-Generated Fake Image Detection [33.31312811230408]
Misuse of GANs for generating deceptive images, such as face replacement, raises significant security concerns. This paper introduces a novel GAN-generated image detection algorithm called X-Transfer. It enhances transfer learning by utilizing two neural networks that employ interleaved parallel gradient transmission.
arXiv Detail & Related papers (2023-10-07T01:23:49Z)
Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method based on Fast Fourier Convolution and ConvNeXt [14.917290578644424]
Haze usually leads to deteriorated images with low contrast, color shift and structural distortion. We propose a novel two-branch network that leverages 2D discrete wavelet transform (DWT), fast Fourier convolution (FFC) residual block and a pretrained ConvNeXt model. Our model is able to effectively explore global contextual information and produce images with better perceptual quality.
arXiv Detail & Related papers (2023-05-08T02:59:02Z)
Multimodal Data Augmentation for Image Captioning using Diffusion Models [12.221685807426264]
We propose a data augmentation method, leveraging a text-to-image model called Stable Diffusion, to expand the training set. Experiments on the MS COCO dataset demonstrate the advantages of our approach over several benchmark methods. Further improvement regarding the training efficiency and effectiveness can be obtained after intentionally filtering the generated data.
arXiv Detail & Related papers (2023-05-03T01:57:33Z)
Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data. In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
Feature Quantization Improves GAN Training [126.02828112121874]
Feature Quantization (FQ) for the discriminator embeds both true and fake data samples into a shared discrete space. Our method can be easily plugged into existing GAN models, with little computational overhead in training.
arXiv Detail & Related papers (2020-04-05T04:06:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.