Generative Context Distillation
- URL: http://arxiv.org/abs/2411.15927v1
- Date: Sun, 24 Nov 2024 17:32:20 GMT
- Title: Generative Context Distillation
- Authors: Haebin Shin, Lei Ji, Yeyun Gong, Sungdong Kim, Eunbi Choi, Minjoon Seo
- Abstract summary: Generative Context Distillation (GCD) is a lightweight prompt internalization method that employs a joint training approach.
We demonstrate that our approach effectively internalizes complex prompts across various agent-based application scenarios.
- Score: 48.91617280112579
- License:
- Abstract: Prompts used in recent large language model based applications are often fixed and lengthy, leading to significant computational overhead. To address this challenge, we propose Generative Context Distillation (GCD), a lightweight prompt internalization method that employs a joint training approach. This method not only replicates the behavior of models with prompt inputs but also generates the content of the prompt along with reasons for why the model's behavior should change accordingly. We demonstrate that our approach effectively internalizes complex prompts across various agent-based application scenarios. For effective training without interaction with dedicated environments, we introduce a data synthesis technique that autonomously collects conversational datasets by swapping the roles of the agent and environment. This method is especially useful in scenarios where only a predefined prompt is available without a corresponding training dataset. By internalizing complex prompts, Generative Context Distillation enables high-performance and efficient inference without the need for explicit prompts.
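The joint training idea from the abstract can be illustrated with a short sketch: a prompt-free student is trained both to reconstruct the internalized prompt (plus a rationale for the resulting behavior) and to imitate the prompted model's responses. This is a minimal sketch, not the authors' released implementation; the base model name, the example fields (query, prompt, rationale, teacher_response), and the loss weights alpha/beta are illustrative assumptions.

```python
# Minimal sketch of GCD-style joint training (illustrative assumptions only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
student = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

def lm_loss(context: str, target: str) -> torch.Tensor:
    """Cross-entropy on the target tokens given the context (context tokens masked out)."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    tgt_ids = tokenizer(target, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, tgt_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : ctx_ids.shape[1]] = -100  # ignore loss on context positions
    return student(input_ids=input_ids, labels=labels).loss

def gcd_step(example, alpha=1.0, beta=1.0):
    # (a) prompt-generation loss: reconstruct the internalized prompt and a
    #     rationale for the prompted behavior, conditioned only on the query.
    gen_loss = lm_loss(example["query"],
                       example["prompt"] + "\n" + example["rationale"])
    # (b) behavior-imitation loss: reproduce the prompted teacher's response
    #     without seeing the prompt at inference time.
    beh_loss = lm_loss(example["query"], example["teacher_response"])
    loss = alpha * gen_loss + beta * beh_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the training examples would come from the role-swapping data synthesis described in the abstract, i.e., the same model alternately playing the agent and the environment to produce conversational data when no task-specific dataset exists.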
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Enabling Real-Time Conversations with Minimal Training Costs [61.80370154101649]
This paper presents a new duplex decoding approach that enhances large language models with duplex ability, requiring minimal training.
Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.
arXiv Detail & Related papers (2024-09-18T06:27:26Z)
- Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests [6.33281463741573]
Indirect User Requests (IURs) are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener.
While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints.
arXiv Detail & Related papers (2024-06-12T01:18:04Z)
- Integrating LLMs and Decision Transformers for Language Grounded Generative Quality-Diversity [0.0]
Quality-Diversity is a branch of optimization that is often applied to problems from the Reinforcement Learning and control domains.
We propose using a Large Language Model to augment the repertoire with natural language descriptions of trajectories.
We also propose an LLM-based approach to evaluating the performance of such generative agents.
arXiv Detail & Related papers (2023-08-25T10:00:06Z)
- Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction [5.448684866061922]
Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests.
Large language models have found success automating these dialogues in constrained environments, but their widespread deployment is limited by the substantial quantities of task-specific data required for training.
This paper presents a data-efficient solution to constructing dialogue systems, leveraging explicit instructions derived from agent guidelines.
arXiv Detail & Related papers (2023-06-06T18:42:08Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Generative Prompt Tuning for Relation Classification [21.027631157115135]
We propose a novel generative prompt tuning method to reformulate relation classification as an infilling problem.
In addition, we design entity-guided decoding and discriminative relation scoring to generate and align relations effectively and efficiently during inference.
arXiv Detail & Related papers (2022-10-22T12:40:23Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- IDPG: An Instance-Dependent Prompt Generation Method [58.45110542003139]
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
We propose a conditional prompt generation method to generate prompts for each input instance.
arXiv Detail & Related papers (2022-04-09T15:45:27Z)
- LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization [2.78632567955797]
Reinforcement learning can enable task-oriented dialogue systems to steer the conversation towards successful task completion.
In an end-to-end setting, a response can be constructed in a word-level sequential decision making process with the entire system vocabulary as action space.
Current approaches use an uninformed prior for training and optimize the latent distribution solely on the context.
It is therefore unclear whether the latent representation truly encodes the characteristics of different actions.
arXiv Detail & Related papers (2020-11-18T16:23:30Z)