ICL Markup: Structuring In-Context Learning using Soft-Token Tags
- URL: http://arxiv.org/abs/2312.07405v1
- Date: Tue, 12 Dec 2023 16:25:05 GMT
- Title: ICL Markup: Structuring In-Context Learning using Soft-Token Tags
- Authors: Marc-Etienne Brunet, Ashton Anderson, Richard Zemel
- Abstract summary: Large pretrained language models (LLMs) can be rapidly adapted to a wide variety of tasks via a text-to-text approach.
Inspired by markup languages like HTML, we contribute a method of using soft-token tags to compose prompt templates.
Our method is a form of meta-learning for ICL; it learns these tags in advance during a parameter-efficient fine-tuning warm-up'' process.
- Score: 8.211752085441923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large pretrained language models (LLMs) can be rapidly adapted to a wide
variety of tasks via a text-to-text approach, where the instruction and input
are fed to the model in natural language. Combined with in-context learning
(ICL), this paradigm is impressively flexible and powerful. However, it also
burdens users with an overwhelming number of choices, many of them arbitrary.
Inspired by markup languages like HTML, we contribute a method of using
soft-token tags to compose prompt templates. This approach reduces arbitrary
decisions and streamlines the application of ICL. Our method is a form of
meta-learning for ICL; it learns these tags in advance during a
parameter-efficient fine-tuning ``warm-up'' process. The tags can subsequently
be used in templates for ICL on new, unseen tasks without any additional
fine-tuning. Our experiments with this approach yield promising initial
results, improving LLM performance on important enterprise applications such as
few-shot and open-world intent detection, as well as text classification in
news and legal domains.
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
arXiv Detail & Related papers (2024-10-22T17:45:47Z) - Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning [0.08795040582681389]
Large Language Models (LLMs) are gaining significant popularity in recent years for specialized tasks using prompts.
We propose a novel method called Semantic Knowledge Tuning (SK-Tuning) for prompt and prefix tuning that employs meaningful words instead of random tokens.
Our experimental results show that SK-Tuning exhibits faster training times, fewer parameters, and superior performance on tasks such as text classification and understanding.
arXiv Detail & Related papers (2024-10-11T07:55:09Z) - Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data [84.01401439030265]
Recent end-to-end speech language models (SLMs) have expanded upon the capabilities of large language models (LLMs)
We present a simple yet effective automatic process for creating speech-text pair data.
Our model demonstrates general capabilities for speech-related tasks without the need for speech instruction-tuning data.
arXiv Detail & Related papers (2024-09-30T07:01:21Z) - Optimising Hard Prompts with Few-Shot Meta-Prompting [0.0]
Contextual prompts include context in the form of a document or dialogue along with the natural language instructions to the Large Language Model (LLM)
Masking the context, it acts as template for prompts.
In this paper, we present an iterative method to generate better templates using an LLM from an existing set of prompt templates without revealing the context to the LLM.
arXiv Detail & Related papers (2024-07-09T07:02:57Z) - Soft Prompting for Unlearning in Large Language Models [11.504012974208466]
This work focuses on investigating machine unlearning for Large Language Models motivated by data protection regulations.
We propose a framework textbfSoft textbfPrompting for textbfUntextbflearning (SPUL)
We conduct a rigorous evaluation of the proposed method and our results indicate that SPUL can significantly improve the trade-off between utility and forgetting.
arXiv Detail & Related papers (2024-06-17T19:11:40Z) - One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs)
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z) - Identifying and Analyzing Task-Encoding Tokens in Large Language Models [55.03191279766383]
In this paper, we identify and analyze task-encoding tokens on whose representations the task performance depends.
We show that template and stopword tokens are the most prone to be task-encoding.
Our work sheds light on how large language models (LLMs) learn to perform a task from demonstrations, deepens our understanding of the varied roles different types of tokens play in LLMs, and provides insights for avoiding instability from improperly utilizing task-encoding tokens.
arXiv Detail & Related papers (2024-01-20T20:55:21Z) - L-TUNING: Synchronized Label Tuning for Prompt and Prefix in LLMs [0.0]
This paper introduces L-Tuning, an efficient fine-tuning approach for classification tasks within the Natural Language Inference (NLI) framework.
L-Tuning focuses on the fine-tuning of label tokens processed through a pre-trained Large Language Models (LLMs)
Our experimental results indicate a significant improvement in training efficiency and classification accuracy with L-Tuning compared to traditional approaches.
arXiv Detail & Related papers (2023-12-21T01:47:49Z) - kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest
Neighbor In-Context Learning [50.40636157214161]
Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language.
LLMs have achieved impressive performance in computer programs based on a natural language prompt.
This paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks.
arXiv Detail & Related papers (2023-12-17T17:26:50Z) - LabelPrompt: Effective Prompt-based Learning for Relation Classification [31.291466190218912]
This paper presents a novel prompt-based learning method, namely LabelPrompt, for the relation classification task.
Motivated by the intuition to GIVE MODEL CHOICES!'', we first define additional tokens to represent relation labels, which regard these tokens as the verbaliser with semantic initialisation.
Then, to mitigate inconsistency between predicted relations and given entities, we implement an entity-aware module with contrastive learning.
arXiv Detail & Related papers (2023-02-16T04:06:25Z) - Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.