LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
- URL: http://arxiv.org/abs/2505.03793v3
- Date: Sat, 31 May 2025 16:36:42 GMT
- Title: LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
- Authors: Xinyue Zeng, Haohui Wang, Junhong Lin, Jun Wu, Tyler Cody, Dawei Zhou,
- Abstract summary: Open-sourced Large Language Models (LLMs) and diverse downstream tasks require efficient model selection.<n>We propose a novel theoretical framework that provides a proper lens to assess the generalization capabilities of LLMs.<n>In particular, we first derive a PAC-Bayesian Generalization Bound that unveils fine-tuning dynamics of LLMs.<n>We then introduce LENSLLM, a Neural Tangent Kernel (NTK)-based Rectified Scaling Model that enables accurate performance predictions.
- Score: 11.353302879735862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of open-sourced Large Language Models (LLMs) and diverse downstream tasks necessitates efficient model selection, given the impracticality of fine-tuning all candidates due to computational constraints. Despite the recent advances in LLM selection, a fundamental research question largely remains nascent: how can we model the dynamic behaviors of LLMs during fine-tuning, thereby enhancing our understanding of their generalization performance across diverse downstream tasks? In this work, we propose a novel theoretical framework that provides a proper lens to assess the generalization capabilities of LLMs, thereby enabling accurate and efficient LLM selection for downstream applications. In particular, we first derive a PAC-Bayesian Generalization Bound that unveils fine-tuning dynamics of LLMs and then introduce LENSLLM, a Neural Tangent Kernel (NTK)-based Rectified Scaling Model that enables accurate performance predictions across diverse tasks while maintaining computational efficiency. Extensive empirical results on 3 large-scale benchmarks demonstrate that our model achieves up to 91.1% accuracy and reduces up to 88.5% computational cost in LLM selection, outperforming 5 state-of-the-art methods. We open-source our proposed LENSLLM model and corresponding results at LensLLM.io.
Related papers
- Efficient Model Selection for Time Series Forecasting via LLMs [52.31535714387368]
We propose to leverage Large Language Models (LLMs) as a lightweight alternative for model selection.<n>Our method eliminates the need for explicit performance matrices by utilizing the inherent knowledge and reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-04-02T20:33:27Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.<n>LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.<n>Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - LLM-Powered Preference Elicitation in Combinatorial Assignment [17.367432304040662]
We study the potential of large language models (LLMs) as proxies for humans to simplify preference elicitation (PE) in assignment.<n>We propose a framework for LLM proxies that can work in tandem with SOTA ML-powered preference elicitation schemes.<n>We experimentally evaluate the efficiency of LLM proxies against human queries in the well-studied course allocation domain.
arXiv Detail & Related papers (2025-02-14T17:12:20Z) - Rational Tuning of LLM Cascades via Probabilistic Modeling [0.9208007322096532]
We present a probabilistic model for the joint performance distribution of a sequence of large language models (LLMs)<n>Compared to selecting confidence thresholds using grid search, our model significantly improves runtime scaling with respect to the length of the cascade and the desired resolution of the cost-error curve.
arXiv Detail & Related papers (2025-01-16T07:58:33Z) - Densing Law of LLMs [81.06644243978101]
Large Language Models (LLMs) have emerged as a milestone in artificial intelligence, and their performance can improve as the model size increases.<n>This paper introduces the concept of textitcapacity density'' as a new metric to evaluate the quality of the LLMs across different scales.
arXiv Detail & Related papers (2024-12-05T16:31:13Z) - EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z) - The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities [0.35998666903987897]
This report examines the fine-tuning of Large Language Models (LLMs)
It outlines the historical evolution of LLMs from traditional Natural Language Processing (NLP) models to their pivotal role in AI.
The report introduces a structured seven-stage pipeline for fine-tuning LLMs.
arXiv Detail & Related papers (2024-08-23T14:48:02Z) - SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models [8.558834738072363]
Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications.<n>These individual LLMs show limitations in generalization and performance on complex tasks due to inherent training biases, model size constraints, and the quality or diversity of pre-training datasets.<n>We introduce SelectLLM, which efficiently directs input queries to the most suitable subset of LLMs from a large pool.
arXiv Detail & Related papers (2024-08-16T06:11:21Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z) - Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN)
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.