Extensible Prompts for Language Models on Zero-shot Language Style
Customization
- URL: http://arxiv.org/abs/2212.00616v2
- Date: Thu, 30 Nov 2023 20:11:14 GMT
- Title: Extensible Prompts for Language Models on Zero-shot Language Style
Customization
- Authors: Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing
Chen, Furu Wei
- Abstract summary: X-Prompt instructs a large language model (LLM) beyond natural language (NL).
Registering new imaginary words allows us to instruct the LLM to comprehend concepts that are difficult to describe with NL words.
These imaginary words are designed to be out-of-distribution robust so that they can be (re)used like NL words in various prompts.
- Score: 89.1622516945109
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose eXtensible Prompt (X-Prompt) for prompting a large language model
(LLM) beyond natural language (NL). X-Prompt instructs an LLM with not only NL
but also an extensible vocabulary of imaginary words. Registering new imaginary
words allows us to instruct the LLM to comprehend concepts that are difficult
to describe with NL words, thereby making a prompt more descriptive. Also,
these imaginary words are designed to be out-of-distribution (OOD) robust so
that they can be (re)used like NL words in various prompts, distinguishing
X-Prompt from soft prompts, which are designed to fit in-distribution data. We
propose context-augmented learning (CAL) to learn imaginary words for general
usability, enabling them to work properly in OOD (unseen) prompts. We
experiment with X-Prompt on zero-shot language style customization as a case
study.
The promising results of X-Prompt demonstrate its potential to facilitate
advanced interaction beyond the natural language interface, bridging the
communication gap between humans and LLMs.
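The abstract gives no implementation details, so the following is only a minimal sketch of how an imaginary word might be realized in practice, assuming a PyTorch / Hugging Face setup: one extra trainable embedding is registered in a frozen causal LM's vocabulary, and only that row receives gradient updates. The model name, the token string, and the single training step are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (assumptions, not the paper's code): register an "imaginary
# word" as one extra trainable embedding row on top of a frozen causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM could play the role of the LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Register a new imaginary word and grow the embedding table accordingly.
tokenizer.add_tokens(["<style_x>"])  # hypothetical imaginary word
model.resize_token_embeddings(len(tokenizer))
new_id = tokenizer.convert_tokens_to_ids("<style_x>")

# Freeze the LLM; keep gradients only for the embedding matrix, and mask them
# so that just the imaginary word's row is updated.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings()
emb.weight.requires_grad = True

def keep_only_new_row(grad):
    mask = torch.zeros_like(grad)
    mask[new_id] = 1.0
    return grad * mask

emb.weight.register_hook(keep_only_new_row)

# A prompt mixing NL words with the imaginary word, as X-Prompt proposes.
prompt = "Rewrite this in the voice of <style_x>: Good morning, everyone."
batch = tokenizer(prompt, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()  # only the <style_x> embedding row accumulates gradient
```

Context-augmented learning (CAL), as described in the abstract, would then repeat such updates over many varied prompt contexts so that the learned embedding remains usable in unseen (OOD) prompts; that outer loop is omitted from the sketch.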
Related papers
- ExpressivityArena: Can LLMs Express Information Implicitly? [5.93216512770653]
Large Language Models (LLMs) have demonstrated remarkable performance in certain dimensions.
Their ability to express the implicit language cues that humans use for effective communication remains unclear.
This paper presents ExpressivityArena, a Python library for measuring the implicit communication abilities of LLMs.
arXiv Detail & Related papers (2024-11-12T18:35:28Z) - Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by LSP (the proposed LLM-based Symbolic Program) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z) - AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
AlignedCoT is an in-context learning technique for prompting Large Language Models.
It yields consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z) - The language of prompting: What linguistic properties make a prompt
successful? [13.034603322224548]
LLMs can be prompted to achieve impressive zero-shot or few-shot performance in many NLP tasks.
Yet, we still lack a systematic understanding of how linguistic properties of prompts correlate with task performance.
We investigate both grammatical properties such as mood, tense, aspect and modality, as well as lexico-semantic variation through the use of synonyms.
arXiv Detail & Related papers (2023-11-03T15:03:36Z) - Establishing Vocabulary Tests as a Benchmark for Evaluating Large
Language Models [2.7013338932521416]
We advocate for the revival of vocabulary tests as a valuable tool for assessing the performance of Large Language Models (LLMs).
We evaluate seven LLMs using two vocabulary test formats across two languages and uncover surprising gaps in their lexical knowledge.
arXiv Detail & Related papers (2023-10-23T08:45:12Z) - Translate to Disambiguate: Zero-shot Multilingual Word Sense
Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT).
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z) - Revisiting Language Encoding in Learning Multilingual Representations [70.01772581545103]
We propose a new approach called Cross-lingual Language Projection (XLP) to replace language embedding.
XLP projects the word embeddings into a language-specific semantic space, and the projected embeddings are then fed into the Transformer model (a rough sketch follows this list).
Experiments show that XLP can freely and significantly boost the model performance on extensive multilingual benchmark datasets.
arXiv Detail & Related papers (2021-02-16T18:47:10Z)
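As a rough illustration of the language-projection idea summarized in the XLP entry above, the sketch below replaces an additive language embedding with a per-language linear projection of word embeddings before the Transformer. The class name, dimensions, and per-language linear maps are assumptions, not the XLP authors' implementation.

```python
# Rough sketch (assumption, not the XLP implementation): project word
# embeddings into a language-specific space instead of adding a language
# embedding, then feed the result to a Transformer.
import torch
import torch.nn as nn

class LanguageProjection(nn.Module):
    def __init__(self, vocab_size: int, d_model: int, num_languages: int):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        # One projection per language maps shared word embeddings into that
        # language's semantic space.
        self.lang_proj = nn.ModuleList(
            nn.Linear(d_model, d_model, bias=False) for _ in range(num_languages)
        )

    def forward(self, token_ids: torch.Tensor, lang_id: int) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> (batch, seq_len, d_model)
        return self.lang_proj[lang_id](self.word_emb(token_ids))

# Toy usage with made-up sizes; the output would feed a Transformer encoder.
layer = LanguageProjection(vocab_size=1000, d_model=64, num_languages=3)
tokens = torch.randint(0, 1000, (2, 8))
hidden = layer(tokens, lang_id=1)  # shape: (2, 8, 64)
```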