LoBaSS: Gauging Learnability in Supervised Fine-tuning Data
- URL: http://arxiv.org/abs/2310.13008v1
- Date: Mon, 16 Oct 2023 07:26:24 GMT
- Title: LoBaSS: Gauging Learnability in Supervised Fine-tuning Data
- Authors: Haotian Zhou, Tingkai Liu, Qianli Ma, Jianbo Yuan, Pengfei Liu, Yang
You and Hongxia Yang
- Abstract summary: Supervised Fine-Tuning (SFT) serves as a crucial phase in aligning Large Language Models (LLMs) to specific task prerequisites.
We introduce a new dimension in SFT data selection: learnability.
We present the Loss Based SFT Data Selection (LoBaSS) method, utilizing data learnability as the principal criterion for the selection SFT data.
- Score: 64.27898739929734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised Fine-Tuning (SFT) serves as a crucial phase in aligning Large
Language Models (LLMs) to specific task prerequisites. The selection of
fine-tuning data profoundly influences the model's performance, whose principle
is traditionally grounded in data quality and distribution. In this paper, we
introduce a new dimension in SFT data selection: learnability. This new
dimension is motivated by the intuition that SFT unlocks capabilities acquired
by a LLM during the pretraining phase. Given that different pretrained models
have disparate capabilities, the SFT data appropriate for one may not suit
another. Thus, we introduce the term learnability to define the suitability of
data for effective learning by the model. We present the Loss Based SFT Data
Selection (LoBaSS) method, utilizing data learnability as the principal
criterion for the selection SFT data. This method provides a nuanced approach,
allowing the alignment of data selection with inherent model capabilities,
ensuring optimal compatibility and learning efficiency. In experimental
comparisons involving 7B and 13B models, our LoBaSS method is able to surpass
full-data fine-tuning at merely 6% of the total training data. When employing
16.7% of the data, LoBaSS harmonizes the model's capabilities across
conversational and mathematical domains, proving its efficacy and adaptability.
Related papers
- MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [16.654859430784825]
We introduce model-aware data selection with data influence models (MATES)
We fine-tune a small data influence model to approximate oracle data preference signals collected by locally probing the pretraining model.
Experiments on Pythia and the C4 dataset demonstrate that MATES significantly outperforms random data selection on extensive downstream tasks.
arXiv Detail & Related papers (2024-06-10T06:27:42Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment [65.15914284008973]
State-of-the-art techniques such as Reinforcement Learning from Human Feedback (RLHF) often consist of two stages.
1) supervised fine-tuning (SFT), where the model is fine-tuned by learning from human demonstration data.
2) Preference learning, where preference data is used to learn a reward model, which is in turn used by a reinforcement learning step to fine-tune the model.
arXiv Detail & Related papers (2024-05-28T07:11:05Z) - Comparative Analysis of Different Efficient Fine Tuning Methods of Large Language Models (LLMs) in Low-Resource Setting [0.0]
We try to push the understanding of different fine-tuning strategies for large language models (LLMs)
We compare state-of-the-art methods like vanilla fine-tuning and Pattern-Based Fine-Tuning (PBFT) on pre-trained models across two datasets, COLA and MNLI.
Our findings suggest that these alternative strategies can exhibit out-of-domain generalization comparable to that of vanilla FT and PBFT.
arXiv Detail & Related papers (2024-05-21T20:08:52Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition [64.86360698067764]
This study focuses on the interplay of data composition between mathematical reasoning, code generation, and general human-aligning abilities during supervised fine-tuning.
Our experiments reveal that distinct capabilities scale differently and larger models generally show superior performance with same amount of data.
Our findings suggest the amount of composition data influences performance more than the composition ratio.
arXiv Detail & Related papers (2023-10-09T07:56:16Z) - SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in high heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z) - Diverse Data Augmentation with Diffusions for Effective Test-time Prompt
Tuning [73.75282761503581]
We propose DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data.
Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13%.
arXiv Detail & Related papers (2023-08-11T09:36:31Z) - SHiFT: An Efficient, Flexible Search Engine for Transfer Learning [16.289623977712086]
Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch.
We propose SHiFT, the first downstream task-aware, flexible, and efficient model search engine for transfer learning.
arXiv Detail & Related papers (2022-04-04T13:16:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.