AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework
- URL: http://arxiv.org/abs/2310.01818v1
- Date: Tue, 3 Oct 2023 06:16:03 GMT
- Title: AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework
- Authors: Xilie Xu, Jingfeng Zhang, Mohan Kankanhalli
- Abstract summary: Robust Fine-Tuning (RFT) is a low-cost strategy to obtain adversarial robustness in downstream applications.
This paper uncovers an issue with the existing RFT, where optimizing both adversarial and natural objectives through the feature extractor (FE) yields significantly divergent gradient directions.
We propose a low-rank (LoRa) branch that disentangles RFT into two distinct components: optimizing natural objectives via the LoRa branch and adversarial objectives via the FE.
- Score: 13.471022394534465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust Fine-Tuning (RFT) is a low-cost strategy to obtain adversarial
robustness in downstream applications, without requiring a lot of computational
resources and collecting significant amounts of data. This paper uncovers an
issue with the existing RFT, where optimizing both adversarial and natural
objectives through the feature extractor (FE) yields significantly divergent
gradient directions. This divergence introduces instability in the optimization
process, thereby hindering the attainment of adversarial robustness and
rendering RFT highly sensitive to hyperparameters. To mitigate this issue, we
propose a low-rank (LoRa) branch that disentangles RFT into two distinct
components: optimizing natural objectives via the LoRa branch and adversarial
objectives via the FE. Besides, we introduce heuristic strategies for
automating the scheduling of the learning rate and the scalars of loss terms.
Extensive empirical evaluations demonstrate that our proposed automated RFT
disentangled via the LoRa branch (AutoLoRa) achieves new state-of-the-art
results across a range of downstream tasks. AutoLoRa holds significant
practical utility, as it automatically converts a pre-trained FE into an
adversarially robust model for downstream tasks without the need for searching
hyperparameters.
Related papers
- TopoCurate:Modeling Interaction Topology for Tool-Use Agent Training [53.93696896939915]
Training tool-use agents typically rely on Supervised Fine-Tuning (SFT) on successful trajectories and Reinforcement Learning (RL) on pass-rate-selected tasks.<n>We propose TopoCurate, an interaction-aware framework that projects multi-trial rollouts from the same task into a unified semantic quotient topology.<n>TopoCurate achieves consistent gains of 4.2% (SFT) and 6.9% (RL) over state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T10:38:54Z) - ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference [60.958331943869126]
ODAR-Expert is an adaptive routing framework that optimize the accuracy-efficiency trade-off via principled resource allocation.<n>We show strong and consistent gains, including 98.2% accuracy on MATH and 54.8% on Humanity's Last Exam.
arXiv Detail & Related papers (2026-02-27T05:22:01Z) - Relatron: Automating Relational Machine Learning over Relational Databases [50.94254514286021]
We present a study that unifies RDL and DFS in a shared design space and conducts architecture-centric searches across diverse RDB tasks.<n>Our analysis yields three key findings: (1) RDL does not consistently outperform DFS, with performance being highly task-dependent; (2) no single architecture dominates across tasks, underscoring the need for task-aware model selection; and accuracy is an unreliable guide for choice architecture.
arXiv Detail & Related papers (2026-02-26T02:45:22Z) - Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning [65.10602992874787]
AutoTraj is a framework that automatically learns TIR by repairing and rewarding tool-use trajectories.<n> Experiments on real-world benchmarks demonstrate the effectiveness of AutoTraj.
arXiv Detail & Related papers (2026-01-30T14:42:04Z) - FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence [20.138989330054955]
We propose FRoD, a novel fine-tuning method that combines hierarchical joint decomposition with rotational degrees of freedom.<n>On 20 benchmarks spanning vision, reasoning, and language understanding, FRoD matches full model fine-tuning in accuracy, while using only 1.72% of trainable parameters under identical training budgets.
arXiv Detail & Related papers (2025-12-29T14:13:45Z) - RIFT: A Scalable Methodology for LLM Accelerator Fault Assessment using Reinforcement Learning [0.0]
RIFT (Reinforcement Learning-guided Intelligent Fault Targeting) is a scalable framework that automates the discovery of minimal, high-impact fault scenarios.<n>RIFT transforms the complex search for worst-case faults into a sequential decision-making problem.
arXiv Detail & Related papers (2025-12-10T17:07:19Z) - SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping [30.85025293160079]
High-frequency components, or later steps, in the generation process contribute disproportionately to inference latency.<n>We identify two primary sources of inefficiency: step redundancy and unconditional branch redundancy.<n>We propose an automatic step-skipping strategy that selectively omits unnecessary generation steps to improve efficiency.
arXiv Detail & Related papers (2025-06-10T15:35:29Z) - Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation [18.670626228472877]
DIFFT redefines Feature Transformation as a reward-guided generative task.<n>It produces structured, discrete features, preserving intra-feature dependencies while allowing parallel inter-feature generation.<n>It consistently outperforms state-of-the-art baselines in predictive accuracy and robustness, with significantly lower training and inference times.
arXiv Detail & Related papers (2025-05-21T06:18:42Z) - The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks [56.37880529653111]
The demand for large computation model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications.<n>In this paper, we investigate the LAIM-inference scheme, where a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment.
arXiv Detail & Related papers (2025-05-14T08:18:55Z) - A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning [0.6906005491572401]
We propose a method for allocating expert numbers based on parameter sensitivity LoRA-SMoE.<n> Experimental results demonstrate that our LoRA-SMoE approach can enhance model performance while reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-05-06T13:22:46Z) - PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning [54.99373314906667]
Self-supervised representation learning for point cloud has demonstrated effectiveness in improving pre-trained model performance across diverse tasks.
As pre-trained models grow in complexity, fully fine-tuning them for downstream applications demands substantial computational and storage resources.
We propose PointLoRA, a simple yet effective method that combines low-rank adaptation (LoRA) with multi-scale token selection to efficiently fine-tune point cloud models.
arXiv Detail & Related papers (2025-04-22T16:41:21Z) - AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution [50.584551250242235]
AdaptSR is a low-rank adaptation framework that efficiently repurposes bi-cubic-trained SR models for real-world tasks.
Our experiments demonstrate that AdaptSR outperforms GAN and diffusion-based SR methods by up to 4 dB in PSNR and 2% in perceptual scores on real SR benchmarks.
arXiv Detail & Related papers (2025-03-10T18:03:18Z) - Fractional Correspondence Framework in Detection Transformer [13.388933240897492]
The Detection Transformer (DETR) has significantly simplified the matching process in object detection tasks.
This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training.
We propose a flexible matching strategy that captures the cost of aligning predictions with ground truths to find the most accurate correspondences.
arXiv Detail & Related papers (2025-03-06T05:29:20Z) - Hyper-parameter Optimization for Federated Learning with Step-wise Adaptive Mechanism [0.48342038441006796]
Federated Learning (FL) is a decentralized learning approach that protects sensitive information by utilizing local model parameters rather than sharing clients' raw datasets.
This paper investigates the deployment and integration of two lightweight Hyper- Optimization (HPO) tools, Raytune and Optuna, within the context of FL settings.
To this end, both local and global feedback mechanisms are integrated to limit the search space and expedite the HPO process.
arXiv Detail & Related papers (2024-11-19T05:49:00Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often under performs when compared to full- parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures [21.18741772731095]
Zeroth-order (ZO) algorithms offer a promising alternative by approximating gradients using finite differences of function values.
Existing ZO methods struggle to capture the low-rank gradient structure common in LLM fine-tuning, leading to suboptimal performance.
This paper proposes a low-rank ZO algorithm (LOZO) that effectively captures this structure in LLMs.
arXiv Detail & Related papers (2024-10-10T08:10:53Z) - Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape [52.98187034726091]
Low-Rank Adaptation (LoRA) is an efficient way to fine-tune models by optimizing only a low-rank matrix.
A solution that appears flat in the LoRA space may exist sharp directions in the full parameter space, potentially harming generalization performance.
We propose Flat-LoRA, an efficient approach that seeks a low-rank adaptation located in a flat region of the full parameter space.
arXiv Detail & Related papers (2024-09-22T11:24:10Z) - Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning [65.31677646659895]
This paper focuses on the concept of task-specific directions (TSDs)-critical for transitioning large models from pretrained states to task-specific enhancements in PEFT.
We introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during the fine-tuning process, thereby enhancing model performance on targeted tasks.
arXiv Detail & Related papers (2024-09-02T08:10:51Z) - LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models [7.926974917872204]
LoRA-SP is a novel approach utilizing randomized half-selective parameter freezing.
LoRA-SP significantly reduces computational and memory requirements without compromising model performance.
arXiv Detail & Related papers (2024-02-28T06:50:10Z) - Low-Rank Representations Meets Deep Unfolding: A Generalized and
Interpretable Network for Hyperspectral Anomaly Detection [41.50904949744355]
Current hyperspectral anomaly detection (HAD) benchmark datasets suffer from low resolution, simple background, and small size of the detection data.
These factors also limit the performance of the well-known low-rank representation (LRR) models in terms of robustness.
We build a new set of HAD benchmark datasets for improving the robustness of the HAD algorithm in complex scenarios, AIR-HAD for short.
arXiv Detail & Related papers (2024-02-23T14:15:58Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer [60.31021888394358]
Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR)
We propose a SOurce-free Domain Adaptation framework for image SR (SODA-SR) to address this issue, i.e., adapt a source-trained model to a target domain with only unlabeled target data.
arXiv Detail & Related papers (2023-03-31T03:14:44Z) - Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z) - Fast Distributionally Robust Learning with Variance Reduced Min-Max
Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
arXiv Detail & Related papers (2021-04-27T16:56:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.