Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization
- URL: http://arxiv.org/abs/2602.11171v1
- Date: Mon, 19 Jan 2026 08:48:03 GMT
- Title: Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization
- Authors: Baek Seong-Eun, Lee Jung-Mok, Kim Sung-Bin, Tae-Hyun Oh
- Abstract summary: Fine-tuning Large Language Models (LLMs) with Low-Rank Adaptation (LoRA) enables resource-efficient personalization or specialization. We propose a framework that integrates the domain knowledge of pre-trained LLMs into Bayesian Optimization.
- Score: 27.47526031899076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning Large Language Models (LLMs) with Low-Rank Adaptation (LoRA) enables resource-efficient personalization or specialization, but it comes at the expense of additional hyperparameter tuning. Although LoRA makes fine-tuning efficient, it is highly sensitive to the choice of hyperparameters, and exhaustive hyperparameter search remains computationally demanding. To address these challenges, we propose a framework that integrates the domain knowledge of pre-trained LLMs into Bayesian Optimization (BO) to efficiently search for LoRA hyperparameters. To leverage the informed knowledge of LLMs, we repurpose an LLM as a discrete-to-continuous mapping that links hyperparameters and their domain knowledge to a continuous vector space in which BO is conducted. We design and control this mapping through language prompting: a domain-aware textual prompt describes the relationships among hyperparameters and their respective roles, thereby explicitly injecting domain knowledge about LoRA into the LLM in natural language. We also model residual information that is hard to describe linguistically in the prompt with an additional learnable token, which helps BO sample higher-performing hyperparameters. In addition, leveraging the observation that, in LoRA training regimes, performance obtained from training on the full dataset correlates strongly with performance obtained from a data subset, we introduce proxy training and evaluation on a subset, which further increases the efficiency of our method. We demonstrate that the hyperparameters found by our method in only about 30 iterations achieve more than a 20% performance improvement over standard hyperparameters selected from roughly 45,000 candidate combinations.
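Below is a minimal, illustrative sketch of the pipeline the abstract describes: a pre-trained text encoder maps each discrete LoRA configuration, phrased inside a domain-aware prompt, into a continuous vector space, and a Gaussian-process BO loop with Expected Improvement picks the next candidate, which is scored by a cheap proxy evaluation on a data subset. This is not the authors' code: the encoder choice (sentence-transformers), the prompt wording, the candidate grid, and the synthetic proxy_eval stand-in are all assumptions, and the paper's learnable residual token is omitted.
```python
# Sketch only: language-aided BO for LoRA hyperparameters under the
# assumptions stated above (encoder, prompt, grid, and proxy are illustrative).
import itertools
import numpy as np
from scipy.stats import norm
from sentence_transformers import SentenceTransformer          # assumed encoder
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

PROMPT = ("LoRA fine-tuning: rank controls adapter capacity, alpha scales the "
          "update, dropout regularizes, learning rate sets step size. Config: ")

def describe(cfg):
    # The discrete-to-continuous mapping is steered purely by language.
    return PROMPT + (f"rank={cfg['rank']}, alpha={cfg['alpha']}, "
                     f"dropout={cfg['dropout']}, lr={cfg['lr']}")

def proxy_eval(cfg):
    # Stand-in for the paper's proxy: train LoRA on a small data subset and
    # return a validation score. Replaced by a synthetic score so the sketch
    # runs end to end; plug in real subset training here.
    return -abs(np.log10(cfg["lr"]) + 4) - abs(cfg["rank"] - 16) / 16.0

def expected_improvement(mu, sigma, best):
    z = (mu - best) / np.maximum(sigma, 1e-9)
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

# Toy candidate grid of LoRA hyperparameters (ranges are assumptions).
grid = [dict(rank=r, alpha=a, dropout=d, lr=lr)
        for r, a, d, lr in itertools.product(
            [4, 8, 16, 64], [8, 16, 32], [0.0, 0.05, 0.1], [1e-5, 1e-4, 5e-4])]

encoder = SentenceTransformer("all-MiniLM-L6-v2")               # assumed model
X_all = encoder.encode([describe(c) for c in grid])             # continuous space for BO

observed_idx = [0, len(grid) // 2, len(grid) - 1]               # small warm start
scores = [proxy_eval(grid[i]) for i in observed_idx]

for _ in range(30):                                             # ~30 BO iterations, as in the abstract
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X_all[observed_idx], scores)
    mu, sigma = gp.predict(X_all, return_std=True)
    ei = expected_improvement(mu, sigma, max(scores))
    ei[observed_idx] = -np.inf                                  # never re-query observed points
    nxt = int(np.argmax(ei))
    observed_idx.append(nxt)
    scores.append(proxy_eval(grid[nxt]))

best = grid[observed_idx[int(np.argmax(scores))]]
print("best LoRA hyperparameters found:", best)
```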
Related papers
- SHINE: A Scalable In-Context Hypernetwork for Mapping Context to LoRA in a Single Pass [55.28352410490407]
SHINE is a scalable hypernetwork that maps diverse, meaningful contexts into high-quality LoRA adapters for large language models (LLMs). We introduce a pretraining and instruction fine-tuning pipeline and train our hypernetwork to generate high-quality LoRA adapters in a single forward pass. Our work achieves outstanding results on various tasks, greatly reduces time, computation, and memory costs compared to SFT-based LLM adaptation, and shows great potential for scaling.
arXiv Detail & Related papers (2026-02-06T03:40:31Z) - HyperAdaLoRA: Accelerating LoRA Rank Allocation During Training via Hypernetworks without Sacrificing Performance [27.391727025825546]
Low-Rank Adaptation (LoRA) has emerged as a promising approach to fine-tuning large language models. We propose HyperAdaLoRA, a novel framework that accelerates the convergence of AdaLoRA by leveraging a hypernetwork. Our method achieves faster convergence without sacrificing performance.
arXiv Detail & Related papers (2025-10-03T00:15:59Z) - Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights [75.83625828306839]
Drag-and-Drop LLMs (DnD) eliminates per-task training by mapping a handful of unlabeled task prompts directly to LoRA weight updates. A lightweight text encoder distills each prompt batch into condition embeddings, which are then transformed by a cascaded hyper-convolutional decoder into the full set of LoRA matrices.
arXiv Detail & Related papers (2025-06-19T15:38:21Z) - A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning [0.6906005491572401]
We propose LoRA-SMoE, a method for allocating expert numbers based on parameter sensitivity. Experimental results demonstrate that our LoRA-SMoE approach can enhance model performance while reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-05-06T13:22:46Z) - PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning [54.99373314906667]
Self-supervised representation learning for point clouds has demonstrated effectiveness in improving pre-trained model performance across diverse tasks. As pre-trained models grow in complexity, fully fine-tuning them for downstream applications demands substantial computational and storage resources. We propose PointLoRA, a simple yet effective method that combines low-rank adaptation (LoRA) with multi-scale token selection to efficiently fine-tune point cloud models.
arXiv Detail & Related papers (2025-04-22T16:41:21Z) - Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning? [45.58422897857411]
This work explores the use of large language models (LLMs) for hyperparameter optimization by fine-tuning a parameter-efficient version of Code Llama using LoRA. Our approach achieves competitive or superior Root Mean Square Error (RMSE) while substantially reducing computational overhead. Results demonstrate that LLM-based optimization not only rivals established Bayesian methods like Tree-structured Parzen Estimators (TPE) but also accelerates tuning for real-world applications requiring perceptual quality and low-latency processing.
arXiv Detail & Related papers (2025-04-08T13:15:47Z) - In-Context Meta LoRA Generation [61.690065588534296]
Low-rank Adaptation (LoRA) has demonstrated remarkable capabilities for task-specific fine-tuning. We propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models. ICM-LoRA enables more accurate LoRA parameter reconstruction than current parameter reconstruction methods.
arXiv Detail & Related papers (2025-01-29T13:12:01Z) - MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks (a rough parameter-count sketch of this trade-off follows after this list).
arXiv Detail & Related papers (2024-05-20T15:48:32Z) - Hyperparameter Optimization for Large Language Model Instruction-Tuning [6.743825167463901]
We study the whole pipeline of performing fine-tuning and validation on a pre-trained LLM as a blackbox.
We efficiently explore the space of hyperparameters with the NOMAD algorithm, achieving a boost in performance and human alignment of the tuned model.
arXiv Detail & Related papers (2023-12-01T22:03:12Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3×-30×.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
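For the MoRA entry above, the following is a rough, illustrative sketch of the trade-off its summary describes: at an equal trainable-parameter budget, a square matrix admits a much higher-rank update than LoRA's low-rank factors. The hidden size, the rank, and the omission of MoRA's compression/decompression operators are all simplifying assumptions, not details taken from the paper.
```python
# Sketch only: compare LoRA's low-rank factors with a square matrix of
# roughly equal parameter count, as in the MoRA summary above.
import math

d_in = d_out = 4096        # assumed hidden size of an LLM layer
r = 8                      # assumed LoRA rank

lora_params = r * (d_in + d_out)       # A: r x d_in, B: d_out x r
side = math.isqrt(lora_params)         # side of a square matrix with ~equal params
mora_params = side * side

print(f"LoRA : {lora_params} params, update rank <= {r}")
print(f"MoRA : {mora_params} params, update rank <= {side}")
# Same budget, but the square matrix permits a much higher-rank update,
# which the MoRA summary credits for gains on memory-intensive tasks.
```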