Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style Understanding
- URL: http://arxiv.org/abs/2305.14592v2
- Date: Thu, 6 Jun 2024 03:20:45 GMT
- Title: Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style Understanding
- Authors: Ruohao Guo, Wei Xu, Alan Ritter
- Abstract summary: We show that current large language models struggle to capture some language styles without fine-tuning.
We investigate whether LLMs can be meta-trained based on representative lexicons to recognize new styles they have not been fine-tuned on.
- Score: 24.355564722047244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language style is often used by writers to convey their intentions, identities, and mastery of language. In this paper, we show that current large language models struggle to capture some language styles without fine-tuning. To address this challenge, we investigate whether LLMs can be meta-trained based on representative lexicons to recognize new styles they have not been fine-tuned on. Experiments on 13 established style classification tasks, as well as 63 novel tasks generated using LLMs, demonstrate that meta-training with style lexicons consistently improves zero-shot transfer across styles. We release the code and data at http://github.com/octaviaguo/Style-LLM .
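The released repository contains the authors' exact prompts and training data; as a rough illustration of the core idea of conditioning a style judgment on a representative lexicon, a minimal zero-shot sketch is given below. The toy lexicon, prompt wording, and small placeholder model are illustrative assumptions, not the paper's template.

```python
# Illustrative sketch only: a zero-shot style judgment conditioned on a
# representative lexicon, in the spirit of the paper. The toy lexicon, the
# prompt wording, and the small placeholder model are assumptions.
from transformers import pipeline

politeness_lexicon = ["please", "kindly", "would you mind", "thank you"]

def build_prompt(style: str, lexicon: list, text: str) -> str:
    """Format a lexicon-conditioned yes/no style-classification prompt."""
    return (
        f"Words and phrases typical of the '{style}' style: {', '.join(lexicon)}.\n"
        f"Sentence: {text}\n"
        f"Does the sentence exhibit the '{style}' style? Answer yes or no:"
    )

generator = pipeline("text-generation", model="gpt2")
prompt = build_prompt("politeness", politeness_lexicon,
                      "Could you please send the file when you get a chance?")
output = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
print(output[len(prompt):].strip())
```

Per the abstract, meta-training fine-tunes the model on many such lexicon-conditioned examples across seen styles, so that a new style can be recognized from its lexicon alone at test time.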
Related papers
- Leveraging the Power of MLLMs for Gloss-Free Sign Language Translation [6.688680877428467]
We propose a novel gloss-free Multimodal Sign Language Translation framework.
We generate detailed textual descriptions of sign language components using multimodal large language models.
Our approach achieves state-of-the-art performance on benchmark datasets PHOENIX14T and CSL-Daily.
arXiv Detail & Related papers (2024-11-25T09:01:41Z)
- Using Prompts to Guide Large Language Models in Imitating a Real Person's Language Style [8.653992214883726]
This study compares the language style imitation ability of three different large language models under the guidance of the same zero-shot prompt.
It also compares the imitation ability of the same large language model when guided by three different prompts.
By applying a Tree-of-Thoughts (ToT) prompting method to Llama 3, the study creates a conversational AI that reproduces the language style of a real person.
arXiv Detail & Related papers (2024-10-04T18:30:34Z)
- CUTE: Measuring LLMs' Understanding of Their Tokens [54.70665106141121]
Large Language Models (LLMs) show remarkable performance on a wide variety of tasks.
This raises the question: To what extent can LLMs learn orthographic information?
We propose a new benchmark, which features a collection of tasks designed to test the orthographic knowledge of LLMs.
arXiv Detail & Related papers (2024-09-23T18:27:03Z)
- Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning [24.263699489328427]
One-size-fits-all large language models (LLMs) are increasingly being used to help people with their writing.
This paper explores whether parameter-efficient finetuning (PEFT) with Low-Rank Adaptation can effectively guide the style of LLM generations.
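The entry above does not spell out a training recipe; as a generic sketch of how Low-Rank Adaptation is typically applied with the Hugging Face `peft` library (the hyperparameters and `gpt2` base model are placeholders, not the paper's configuration):

```python
# Generic LoRA sketch with Hugging Face peft; values are illustrative,
# not the configuration used in the paper.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # GPT-2 attention projection layer
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
# ...fine-tune `model` on text written in the target style, then generate.
```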
arXiv Detail & Related papers (2024-09-06T19:25:18Z)
- Exploring the Role of Transliteration in In-Context Learning for Low-resource Languages Written in Non-Latin Scripts [50.40191599304911]
We investigate whether transliteration is also effective in improving LLMs' performance for low-resource languages written in non-Latin scripts.
We propose three prompt templates, where the target-language text is represented in (1) its original script, (2) Latin script, or (3) both.
Our findings show that the effectiveness of transliteration varies by task type and model size.
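A minimal sketch of the three template variants just described; the instruction wording is an illustrative assumption, and `unidecode` merely stands in for whatever romanization tool is actually used.

```python
# Sketch of the three prompt variants: target-language text in its original
# script, in Latin script, or both. Wording and the transliteration helper
# are illustrative assumptions, not the paper's templates.
from unidecode import unidecode

def build_prompts(text: str, task_instruction: str) -> dict:
    latin = unidecode(text)  # rough Latin-script transliteration
    return {
        "original": f"{task_instruction}\nText: {text}\nAnswer:",
        "latin": f"{task_instruction}\nText (romanized): {latin}\nAnswer:",
        "both": f"{task_instruction}\nText: {text}\nRomanized: {latin}\nAnswer:",
    }

prompts = build_prompts("नमस्ते दुनिया", "Label the sentiment of the following text.")
for variant, prompt in prompts.items():
    print(f"--- {variant} ---\n{prompt}\n")
```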
arXiv Detail & Related papers (2024-07-02T14:51:20Z)
- Learning to Prompt with Text Only Supervision for Vision-Language Models [107.282881515667]
One branch of methods adapts CLIP by learning prompts using visual information.
An alternative approach resorts to training-free methods by generating class descriptions from large language models.
We propose to combine the strengths of both streams by learning prompts using only text data.
arXiv Detail & Related papers (2024-01-04T18:59:49Z)
- ICL Markup: Structuring In-Context Learning using Soft-Token Tags [8.211752085441923]
Large pretrained language models (LLMs) can be rapidly adapted to a wide variety of tasks via a text-to-text approach.
Inspired by markup languages like HTML, we contribute a method of using soft-token tags to compose prompt templates.
Our method is a form of meta-learning for ICL; it learns these tags in advance during a parameter-efficient fine-tuning "warm-up" process.
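A generic sketch of the soft-token-tag pattern, assuming a Hugging Face causal LM: new tag tokens are added and only their embedding rows are trained during a warm-up. The tag names and base model are hypothetical, not the paper's setup.

```python
# Generic soft-token-tag sketch: add new tag tokens and train only their
# embedding rows during a warm-up phase. Tag names and the base model are
# illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tags = ["<|task|>", "<|example|>", "<|query|>"]  # hypothetical tag names
num_added = tokenizer.add_tokens(tags, special_tokens=True)
model.resize_token_embeddings(len(tokenizer))

# Freeze everything, then re-enable the embedding matrix and mask out
# gradients for all rows except the newly added tag rows.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings().weight
emb.requires_grad = True
new_rows = torch.arange(len(tokenizer) - num_added, len(tokenizer))

def keep_only_new_rows(grad):
    mask = torch.zeros_like(grad)
    mask[new_rows] = 1.0
    return grad * mask

emb.register_hook(keep_only_new_rows)
# ...warm-up training on prompts such as "<|task|> classify <|example|> ... <|query|> ..."
```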
arXiv Detail & Related papers (2023-12-12T16:25:05Z)
- The Ups and Downs of Large Language Model Inference with Vocabulary Trimming by Language Heuristics [74.99898531299148]
This research examines vocabulary trimming (VT), which restricts embedding entries to the language of interest to improve time and memory efficiency.
We apply two language heuristics to trim the full vocabulary - Unicode-based script filtering and corpus-based selection - across different language families and model sizes.
VT is found to reduce the memory usage of small models by nearly 50%, with an upper bound of a 25% improvement in generation speed.
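A rough sketch of the Unicode-based script-filtering heuristic, assuming a Hugging Face model: identify tokens whose characters belong to the target script (Latin here) and slice the embedding matrix down to those rows. A full implementation would also remap token ids in the tokenizer and output head; the filter below is only illustrative.

```python
# Rough sketch of Unicode-script-based vocabulary trimming: keep only tokens
# whose alphabetic characters are Latin-script, then slice the embedding
# matrix down to those rows. Illustrative only; id remapping is omitted.
import unicodedata
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def is_latin(token_text: str) -> bool:
    """True if every alphabetic character in the token is Latin-script."""
    return all(
        "LATIN" in unicodedata.name(ch, "") for ch in token_text if ch.isalpha()
    )

keep_ids = sorted(
    idx for tok, idx in tokenizer.get_vocab().items()
    if is_latin(tokenizer.convert_tokens_to_string([tok]))
)
old_emb = model.get_input_embeddings().weight.data
trimmed_emb = old_emb[torch.tensor(keep_ids)]  # rows for kept tokens only
print(f"kept {len(keep_ids)} of {old_emb.shape[0]} tokens")
```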
arXiv Detail & Related papers (2023-11-16T09:35:50Z)
- InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning [66.31509106146605]
Large language models (LLMs) that are tuned with instructions have demonstrated remarkable capabilities in various tasks and languages.
However, their ability to generalize to underrepresented languages is limited due to the scarcity of available data.
We propose InstructAlign which uses continual crosslingual instruction tuning to enable LLMs to align new unseen languages with previously learned high-resource languages.
arXiv Detail & Related papers (2023-05-23T02:51:34Z)
- Word Embeddings Are Steers for Language Models [57.83026781380927]
We name such steers LM-Steers and find that they exist in LMs of all sizes.
On tasks such as language model detoxification and sentiment control, LM-Steers can achieve comparable or superior performance.
An LM-Steer is transferable between different language models via an explicit-form calculation.
arXiv Detail & Related papers (2023-05-22T07:52:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.