Related papers: Natural Language Fine-Tuning

URL: http://arxiv.org/abs/2412.20382v1
Date: Sun, 29 Dec 2024 07:02:45 GMT
Title: Natural Language Fine-Tuning
Authors: Jia Liu, Yue Wang, Zhiqi Lin, Min Chen, Yixue Hao, Long Hu
Abstract summary: We introduce Natural Language Fine-Tuning (NLFT), which utilizes natural language for fine-tuning for the first time. Since linguistic information is effectively utilized in NLFT, our proposed method significantly reduces training costs. It markedly enhances training efficiency, comprehensively outperforming reinforcement fine-tuning algorithms in accuracy, time-saving, and resource conservation.
Score: 13.143016409660484
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large language model fine-tuning techniques typically depend on extensive labeled data, external guidance, and feedback, such as human alignment, scalar rewards, and demonstration. However, in practical application, the scarcity of specific knowledge poses unprecedented challenges to existing fine-tuning techniques. In this paper, focusing on fine-tuning tasks in specific domains with limited data, we introduce Natural Language Fine-Tuning (NLFT), which utilizes natural language for fine-tuning for the first time. By leveraging the strong language comprehension capability of the target LM, NLFT attaches the guidance of natural language to the token-level outputs. Then, saliency tokens are identified with calculated probabilities. Since linguistic information is effectively utilized in NLFT, our proposed method significantly reduces training costs. It markedly enhances training efficiency, comprehensively outperforming reinforcement fine-tuning algorithms in accuracy, time-saving, and resource conservation. Additionally, on the macro level, NLFT can be viewed as a token-level fine-grained optimization of SFT, thereby efficiently replacing the SFT process without the need for warm-up (as opposed to ReFT requiring multiple rounds of warm-up with SFT). Compared to SFT, NLFT does not increase the algorithmic complexity, maintaining O(n). Extensive experiments on the GSM8K dataset demonstrate that NLFT, with only 50 data instances, achieves an accuracy increase that exceeds SFT by 219%. Compared to ReFT, the time complexity and space complexity of NLFT are reduced by 78.27% and 92.24%, respectively. The superior technique of NLFT is paving the way for the deployment of various innovative LLM fine-tuning applications when resources are limited at network edges. Our code has been released at https://github.com/Julia-LiuJ/NLFT.

Related papers

Reinforcement Fine-Tuning Enables MLLMs Learning Novel Tasks Stably [80.36077974826865]
Post-training algorithms such as Supervised Fine-Tuning (SFT) and Reinforcement Fine-Tuning (RFT) are widely used to adapt multimodal large language models to downstream tasks. We study the behavior of SFT and RFT on an open-source multimodal model, Qwen2.5-VL. Our experiments reveal a sharp trade-off: SFT enables rapid task acquisition but leads to catastrophic forgetting, whereas RFT learns more slowly on novel tasks but maintains prior knowledge.
arXiv Detail & Related papers (2025-06-30T04:15:01Z)
SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models [7.44035983292392]
We propose a self-learning framework for large language models (LLMs) inspired by human learning patterns. This framework takes a supervised fine-tuning (SFT) dataset in a specific domain as input. We show that our method substantially reduces training time while achieving comparable improvements to those attained with full dataset fine-tuning.
arXiv Detail & Related papers (2025-05-23T04:50:54Z)
DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer [26.0360791797671]
We introduce DeFT-X, a novel composable SFT approach that denoises the weight matrices of a pretrained model before magnitude pruning. We evaluate DeFT-X on a diverse set of extremely low-resource languages for sentiment classification (NusaX) and natural language inference (AmericasNLI).
arXiv Detail & Related papers (2025-05-21T04:20:30Z)
Discriminative Finetuning of Generative Large Language Models without Reward Models and Preference Data [61.463946150106054]
Supervised fine-tuning (SFT) followed by preference optimization (PO) has become the standard for improving pretrained large language models (LLMs). We introduce Discriminative Fine-Tuning (DFT), a novel approach that eliminates the need for preference data. Our contributions include: (i) a discriminative probabilistic framework for fine-tuning LLMs by explicitly modeling the discriminative likelihood of an answer; (ii) efficient algorithms to optimize this discriminative likelihood; and (iii) extensive experiments demonstrating DFT's effectiveness, achieving performance better than SFT and comparable to if not
arXiv Detail & Related papers (2025-02-25T22:38:55Z)
Optimizing Language Models for Grammatical Acceptability: A Comparative Study of Fine-Tuning Techniques [0.0]
This study explores the fine-tuning (FT) of the Open Pre-trained Transformer (OPT-125M) for grammatical tasks using the CoLA dataset.
arXiv Detail & Related papers (2025-01-14T05:41:09Z)
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models [12.500777267361102]
We introduce a novel preference-oriented supervised fine-tuning approach, namely PoFT. The intuition is to boost SFT by imposing a particular preference: favoring the target model over aligned LLMs on the same SFT data. PoFT achieves stable and consistent improvements over the SFT baselines across different training datasets and base models.
arXiv Detail & Related papers (2024-12-17T12:49:14Z)
UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function [18.54945183526789]
We introduce Unified Fine-Tuning (UFT), which integrates SFT and alignment into a single training stage. Our experimental results demonstrate that UFT outperforms SFT on instruction-tuning data alone. When combining instruction-tuning data with alignment data, UFT effectively prevents catastrophic forgetting.
arXiv Detail & Related papers (2024-10-28T18:34:25Z)
SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low Computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead. To optimize the pruning process itself, only thresholds are communicated between a server and clients instead of parameters. Global thresholds are used to update model parameters by extracting aggregated parameter importance.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process [26.196705232699884]
We introduce Intuitive Fine-Tuning (IFT) to integrate SFT and Preference Optimization into a single process. IFT performs comparably to, or even better than, sequential recipes of SFT and some typical Preference Optimization methods. An explainable Frozen Lake game further validates the effectiveness of IFT for obtaining a competitive policy.
arXiv Detail & Related papers (2024-05-20T08:23:28Z)
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model [50.339632513018934]
Supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language models (LLMs) to specific preferences. We critically examine this hypothesis within the scope of cross-lingual generation tasks. We introduce a novel training-free alignment method named PreTTY, which employs minimal task-related prior tokens.
arXiv Detail & Related papers (2024-04-25T17:19:36Z)
Learning to Compress Prompt in Natural Language Formats [54.06967020905763]
Large language models (LLMs) are great at processing multiple natural language processing tasks. LLMs are constrained by inferior performance with long context, slow inference speed, and the high cost of computing the results. This work aims to compress lengthy prompts in the form of natural language with LLM transferability.
arXiv Detail & Related papers (2024-02-28T20:41:21Z)
LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models [14.087415157225715]
Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired answers. This paper introduces an alternative to SFT called Natural Language Feedback for Finetuning LLMs (LaFFi).
arXiv Detail & Related papers (2023-12-31T21:18:16Z)
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions. FedKSeed employs zeroth-order optimization with a finite set of random seeds. It significantly reduces transmission requirements between the server and clients to just a few random seeds.
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters. This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
AsySQN: Faster Vertical Federated Learning Algorithms with Better Computation Resource Utilization [159.75564904944707]
We propose an asynchronous quasi-Newton (AsySQN) framework for vertical federated learning (VFL). The proposed algorithms make descent steps scaled by approximate Hessian information, without calculating the inverse Hessian matrix explicitly. We show that the adopted asynchronous computation can make better use of the computation resource.
arXiv Detail & Related papers (2021-09-26T07:56:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.