Related papers: AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework

AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework

URL: http://arxiv.org/abs/2310.01818v1
Date: Tue, 3 Oct 2023 06:16:03 GMT
Title: AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework
Authors: Xilie Xu, Jingfeng Zhang, Mohan Kankanhalli
Abstract summary: Robust Fine-Tuning (RFT) is a low-cost strategy to obtain adversarial robustness in downstream applications. This paper uncovers an issue with the existing RFT, where optimizing both adversarial and natural objectives through the feature extractor (FE) yields significantly divergent gradient directions. We propose a low-rank (LoRa) branch that disentangles RFT into two distinct components: optimizing natural objectives via the LoRa branch and adversarial objectives via the FE.
Score: 13.471022394534465
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robust Fine-Tuning (RFT) is a low-cost strategy to obtain adversarial robustness in downstream applications, without requiring a lot of computational resources and collecting significant amounts of data. This paper uncovers an issue with the existing RFT, where optimizing both adversarial and natural objectives through the feature extractor (FE) yields significantly divergent gradient directions. This divergence introduces instability in the optimization process, thereby hindering the attainment of adversarial robustness and rendering RFT highly sensitive to hyperparameters. To mitigate this issue, we propose a low-rank (LoRa) branch that disentangles RFT into two distinct components: optimizing natural objectives via the LoRa branch and adversarial objectives via the FE. Besides, we introduce heuristic strategies for automating the scheduling of the learning rate and the scalars of loss terms. Extensive empirical evaluations demonstrate that our proposed automated RFT disentangled via the LoRa branch (AutoLoRa) achieves new state-of-the-art results across a range of downstream tasks. AutoLoRa holds significant practical utility, as it automatically converts a pre-trained FE into an adversarially robust model for downstream tasks without the need for searching hyperparameters.

Related papers

SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping [30.85025293160079]
High-frequency components, or later steps, in the generation process contribute disproportionately to inference latency.<n>We identify two primary sources of inefficiency: step redundancy and unconditional branch redundancy.<n>We propose an automatic step-skipping strategy that selectively omits unnecessary generation steps to improve efficiency.
arXiv Detail & Related papers (2025-06-10T15:35:29Z)
Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation [18.670626228472877]
DIFFT redefines Feature Transformation as a reward-guided generative task.<n>It produces structured, discrete features, preserving intra-feature dependencies while allowing parallel inter-feature generation.<n>It consistently outperforms state-of-the-art baselines in predictive accuracy and robustness, with significantly lower training and inference times.
arXiv Detail & Related papers (2025-05-21T06:18:42Z)
The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks [56.37880529653111]
The demand for large computation model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications.<n>In this paper, we investigate the LAIM-inference scheme, where a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment.
arXiv Detail & Related papers (2025-05-14T08:18:55Z)
A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning [0.6906005491572401]
We propose a method for allocating expert numbers based on parameter sensitivity LoRA-SMoE.<n> Experimental results demonstrate that our LoRA-SMoE approach can enhance model performance while reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-05-06T13:22:46Z)
PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning [54.99373314906667]
Self-supervised representation learning for point cloud has demonstrated effectiveness in improving pre-trained model performance across diverse tasks. As pre-trained models grow in complexity, fully fine-tuning them for downstream applications demands substantial computational and storage resources. We propose PointLoRA, a simple yet effective method that combines low-rank adaptation (LoRA) with multi-scale token selection to efficiently fine-tune point cloud models.
arXiv Detail & Related papers (2025-04-22T16:41:21Z)
AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution [50.584551250242235]
AdaptSR is a low-rank adaptation framework that efficiently repurposes bi-cubic-trained SR models for real-world tasks. Our experiments demonstrate that AdaptSR outperforms GAN and diffusion-based SR methods by up to 4 dB in PSNR and 2% in perceptual scores on real SR benchmarks.
arXiv Detail & Related papers (2025-03-10T18:03:18Z)
Fractional Correspondence Framework in Detection Transformer [13.388933240897492]
The Detection Transformer (DETR) has significantly simplified the matching process in object detection tasks. This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training. We propose a flexible matching strategy that captures the cost of aligning predictions with ground truths to find the most accurate correspondences.
arXiv Detail & Related papers (2025-03-06T05:29:20Z)
Hyper-parameter Optimization for Federated Learning with Step-wise Adaptive Mechanism [0.48342038441006796]
Federated Learning (FL) is a decentralized learning approach that protects sensitive information by utilizing local model parameters rather than sharing clients' raw datasets. This paper investigates the deployment and integration of two lightweight Hyper- Optimization (HPO) tools, Raytune and Optuna, within the context of FL settings. To this end, both local and global feedback mechanisms are integrated to limit the search space and expedite the HPO process.
arXiv Detail & Related papers (2024-11-19T05:49:00Z)
Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models. LoRA often under performs when compared to full- parameter fine-tuning. We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z)
Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures [21.18741772731095]
Zeroth-order (ZO) algorithms offer a promising alternative by approximating gradients using finite differences of function values. Existing ZO methods struggle to capture the low-rank gradient structure common in LLM fine-tuning, leading to suboptimal performance. This paper proposes a low-rank ZO algorithm (LOZO) that effectively captures this structure in LLMs.
arXiv Detail & Related papers (2024-10-10T08:10:53Z)
Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape [52.98187034726091]
Low-Rank Adaptation (LoRA) is an efficient way to fine-tune models by optimizing only a low-rank matrix. A solution that appears flat in the LoRA space may exist sharp directions in the full parameter space, potentially harming generalization performance. We propose Flat-LoRA, an efficient approach that seeks a low-rank adaptation located in a flat region of the full parameter space.
arXiv Detail & Related papers (2024-09-22T11:24:10Z)
Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning [65.31677646659895]
This paper focuses on the concept of task-specific directions (TSDs)-critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during the fine-tuning process, thereby enhancing model performance on targeted tasks.
arXiv Detail & Related papers (2024-09-02T08:10:51Z)
LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models [7.926974917872204]
LoRA-SP is a novel approach utilizing randomized half-selective parameter freezing. LoRA-SP significantly reduces computational and memory requirements without compromising model performance.
arXiv Detail & Related papers (2024-02-28T06:50:10Z)
Low-Rank Representations Meets Deep Unfolding: A Generalized and Interpretable Network for Hyperspectral Anomaly Detection [41.50904949744355]
Current hyperspectral anomaly detection (HAD) benchmark datasets suffer from low resolution, simple background, and small size of the detection data. These factors also limit the performance of the well-known low-rank representation (LRR) models in terms of robustness. We build a new set of HAD benchmark datasets for improving the robustness of the HAD algorithm in complex scenarios, AIR-HAD for short.
arXiv Detail & Related papers (2024-02-23T14:15:58Z)
Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR) CFSR inherits the advantages of both convolution-based and transformer-based approaches. Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer [60.31021888394358]
Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR) We propose a SOurce-free Domain Adaptation framework for image SR (SODA-SR) to address this issue, i.e., adapt a source-trained model to a target domain with only unlabeled target data.
arXiv Detail & Related papers (2023-03-31T03:14:44Z)
Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources. The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications. Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients. We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
arXiv Detail & Related papers (2021-04-27T16:56:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.