GPTA: Generative Prompt Tuning Assistant for Synergistic Downstream Neural Network Enhancement with LLMs
- URL: http://arxiv.org/abs/2404.00189v1
- Date: Fri, 29 Mar 2024 23:04:04 GMT
- Title: GPTA: Generative Prompt Tuning Assistant for Synergistic Downstream Neural Network Enhancement with LLMs
- Authors: Xiao Liu, Jiawei Zhang
- Abstract summary: This study introduces GPTA, a Large Language Model (LLM) assistance training framework that enhances the training of downstream task models via prefix prompts.
By minimizing data exposure to the LLM, the framework addresses the security and legal challenges of applying LLMs in downstream task model training.
- Score: 11.572835837392867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study introduces GPTA, a Large Language Model (LLM) assistance training framework that enhances the training of downstream task models via prefix prompts. By minimizing data exposure to the LLM, the framework addresses the security and legal challenges of applying LLMs in downstream task model training. GPTA utilizes a new synergistic training approach, optimizing the downstream models with parameter gradients and the LLM with the novel ``dialogue gradient''. The framework not only demonstrates significant improvements in model performance across six NLP benchmark datasets, but also effectively reduces overfitting in low-resource scenarios. The detailed analyses further validate that our pioneering framework provides a cost-efficient and adaptive method for downstream task model training with LLM support.
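The synergistic loop described in the abstract can be made concrete with a short sketch. The code below is a hypothetical reconstruction, not the authors' released implementation: `query_llm`, `ToyDownstreamModel`, and the `tokenize` helper are placeholders introduced for illustration. The downstream model is updated with ordinary parameter gradients, while the LLM receives only aggregate textual feedback standing in for the paper's ``dialogue gradient'', so raw training data is never exposed to it.

```python
# Minimal sketch of a GPTA-style synergistic training loop (illustrative only).
import torch
import torch.nn as nn


def query_llm(instruction: str) -> str:
    """Hypothetical LLM call that returns a prefix prompt string.
    Only the instruction and aggregate feedback are sent, never raw data."""
    return "Classify the sentiment of the following review:"


class ToyDownstreamModel(nn.Module):
    def __init__(self, vocab_size: int = 1000, num_classes: int = 2):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, 64)
        self.head = nn.Linear(64, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(token_ids))


def tokenize(prefix: str, text: str, vocab_size: int = 1000) -> torch.Tensor:
    # Stand-in tokenizer: hash words into a fixed vocabulary.
    ids = [hash(w) % vocab_size for w in (prefix + " " + text).split()]
    return torch.tensor([ids])


model = ToyDownstreamModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

data = [("great movie, loved it", 1), ("terrible plot, boring", 0)]
feedback = ""  # accumulated "dialogue gradient" in textual form

for epoch in range(3):
    # 1) Ask the LLM for a prefix prompt, conditioned only on prior feedback.
    prefix = query_llm("Propose a task prefix for sentiment classification. " + feedback)

    # 2) Ordinary parameter-gradient updates for the downstream model.
    total_loss = 0.0
    for text, label in data:
        logits = model(tokenize(prefix, text))
        loss = loss_fn(logits, torch.tensor([label]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    # 3) Summarize performance as text and feed it back to the LLM,
    #    playing the role of the paper's "dialogue gradient".
    feedback = f"Previous prefix gave mean loss {total_loss / len(data):.3f}; refine it."
```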
Related papers
- Meta-Learning Adaptable Foundation Models [37.458141335750696]
We introduce a meta-learning framework infused with PEFT in this intermediate retraining stage to learn a model that can be easily adapted to unseen tasks.
In this setting, we demonstrate the suboptimality of standard retraining for finding an adaptable set of parameters.
We then apply these theoretical insights to retraining the RoBERTa model to predict the continuation of conversations within the ConvAI2 dataset.
arXiv Detail & Related papers (2024-10-29T17:24:18Z)
- Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and their theoretical optimum (a minimal generic LoRA sketch appears after this list).
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
- A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs [74.35290684163718]
A primary challenge in large language model (LLM) development is their onerous pre-training cost.
This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by leveraging a small language model (SLM).
arXiv Detail & Related papers (2024-10-24T14:31:52Z)
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities [0.35998666903987897]
This report examines the fine-tuning of Large Language Models (LLMs).
It outlines the historical evolution of LLMs from traditional Natural Language Processing (NLP) models to their pivotal role in AI.
The report introduces a structured seven-stage pipeline for fine-tuning LLMs.
arXiv Detail & Related papers (2024-08-23T14:48:02Z)
- CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models [68.64605538559312]
In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives.
Inspired by our findings, we propose a measurement to quantitatively evaluate the learning balance.
In addition, we introduce an auxiliary loss regularization method to promote updating of the generation distribution of MLLMs.
arXiv Detail & Related papers (2024-07-29T23:18:55Z)
- PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning [17.73193523921637]
Large language models (LLMs) have shown remarkable abilities in diverse natural language processing (NLP) tasks.
LLMs generally undergo supervised fine-tuning (SFT) followed by preference alignment to be usable in downstream applications.
This paper introduces PAFT, a new PArallel training paradigm for effective LLM Fine-Tuning.
arXiv Detail & Related papers (2024-06-25T20:11:37Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
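Several of the entries above build on Low-Rank Adaptation (LoRA). As background for those summaries, here is a minimal, generic rank-1 LoRA layer; it is illustrative only and is not the XGBoost-LoRA ensemble or any other specific method proposed in the papers listed. The class name, rank, and scaling choices are assumptions made for the sketch.

```python
# Minimal generic LoRA-style layer (illustrative; not a specific method above).
# A frozen base weight W is augmented with a trainable low-rank update B @ A,
# scaled by alpha / r.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 1, alpha: float = 1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weight stays frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_features, r))        # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scale * x A^T B^T
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


layer = LoRALinear(16, 8, r=1)           # rank-1 adapter
print(layer(torch.randn(4, 16)).shape)   # torch.Size([4, 8])
```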