Distilling Text Style Transfer With Self-Explanation From LLMs
- URL: http://arxiv.org/abs/2403.01106v2
- Date: Sat, 4 May 2024 17:23:21 GMT
- Title: Distilling Text Style Transfer With Self-Explanation From LLMs
- Authors: Chiyu Zhang, Honglong Cai, Yuezhang Li, Yuexin Wu, Le Hou, Muhammad Abdul-Mageed
- Abstract summary: Text Style Transfer (TST) seeks to alter the style of text while retaining its core content.
We propose CoTeX, a framework that leverages large language models (LLMs) alongside chain-of-thought (CoT) prompting.
CoTeX is shown to surpass traditional supervised fine-tuning and knowledge distillation methods.
- Score: 28.595450029172124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text Style Transfer (TST) seeks to alter the style of text while retaining its core content. Given the constraints of limited parallel datasets for TST, we propose CoTeX, a framework that leverages large language models (LLMs) alongside chain-of-thought (CoT) prompting to facilitate TST. CoTeX distills the complex rewriting and reasoning capabilities of LLMs into more streamlined models capable of working with both non-parallel and parallel data. Through experimentation across four TST datasets, CoTeX is shown to surpass traditional supervised fine-tuning and knowledge distillation methods, particularly in low-resource settings. We conduct a comprehensive evaluation, comparing CoTeX against current unsupervised, supervised, in-context learning (ICL) techniques, and instruction-tuned LLMs. Furthermore, CoTeX distinguishes itself by offering transparent explanations for its style transfer process.
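The abstract describes a distillation recipe but includes no code; the following is a minimal sketch of that idea, assuming a hypothetical `teacher_generate` stand-in for an LLM API and an illustrative prompt template that is not necessarily the paper's actual one:

```python
# Minimal sketch of CoT-based distillation for text style transfer:
# query a teacher LLM for a rationale plus rewrite, then package the
# result as a fine-tuning pair for a smaller student model.

from typing import Callable

COT_PROMPT = (
    "Rewrite the sentence in a {style} style.\n"
    "Sentence: {source}\n"
    "First explain, step by step, which words carry the current style and "
    "how to change them, then give the rewrite as 'Rewrite: <text>'."
)

def build_distillation_example(
    source: str,
    target_style: str,
    teacher_generate: Callable[[str], str],  # hypothetical LLM call
) -> dict:
    """Query the teacher once and return an (input, target) training pair."""
    completion = teacher_generate(
        COT_PROMPT.format(style=target_style, source=source)
    )
    return {
        # The student sees only the task, not the teacher's prompt scaffolding.
        "input": f"transfer to {target_style}: {source}",
        # Training the student to emit the rationale *and* the rewrite is
        # what makes its style transfer self-explaining.
        "target": completion,
    }

if __name__ == "__main__":
    fake_teacher = lambda prompt: (
        "'gonna' and 'stuff' are informal... "
        "Rewrite: We are going to review the material."
    )
    print(build_distillation_example(
        "We're gonna go over the stuff.", "formal", fake_teacher))
```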
Related papers
- CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought [33.32415197728357]
Speech Language Models (SLMs) have demonstrated impressive performance on speech translation tasks.
We introduce a three-stage training framework designed to activate the chain-of-thought capabilities of SLMs.
We propose CoT-ST, a speech translation model that utilizes multimodal CoT to decompose speech translation into sequential steps of speech recognition and translation.
arXiv Detail & Related papers (2024-09-29T01:48:09Z)
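A toy sketch of the two-step decomposition in the CoT-ST entry above; `slm_generate` is a hypothetical stub for a speech language model call, not a real API:

```python
# Decompose speech translation into two chain-of-thought steps:
# first transcribe the audio, then translate the transcript.

def cot_speech_translate(audio_features, slm_generate):
    transcript = slm_generate(audio_features, task="transcribe")
    translation = slm_generate(transcript, task="translate")
    # Returning both steps keeps the intermediate reasoning inspectable.
    return {"transcript": transcript, "translation": translation}

if __name__ == "__main__":
    stub = lambda x, task: "guten Tag" if task == "translate" else "good day"
    print(cot_speech_translate([0.1, 0.2], stub))
```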
- AnyTrans: Translate AnyText in the Image with Large Scale Models [88.5887934499388]
This paper introduces AnyTrans, an all-encompassing framework for the task of Translate AnyText in the Image (TATI).
Our framework incorporates contextual cues from both textual and visual elements during translation.
We have meticulously compiled a test dataset called MTIT6, which consists of multilingual text image translation data from six language pairs.
arXiv Detail & Related papers (2024-06-17T11:37:48Z)
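A rough sketch of folding contextual cues into an in-image translation prompt, in the spirit of the AnyTrans entry above; the prompt format and the idea of passing neighboring OCR fragments are illustrative assumptions, not the paper's actual interface:

```python
# Build a translation prompt that includes surrounding text fragments
# from the same image as context for the LLM.

def build_contextual_prompt(target_text: str, neighbor_texts: list[str],
                            src_lang: str, tgt_lang: str) -> str:
    context = "; ".join(neighbor_texts)
    return (
        f"The image contains these other text fragments: {context}.\n"
        f"Using that context, translate from {src_lang} to {tgt_lang}: {target_text}"
    )

if __name__ == "__main__":
    print(build_contextual_prompt("本日限定", ["ラーメン", "500円"],
                                  "Japanese", "English"))
```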
- Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction [36.915250638481986]
We introduce LiveSum, a new benchmark dataset for generating summary tables of competitions based on real-time commentary texts.
We evaluate the performances of state-of-the-art Large Language Models on this task in both fine-tuning and zero-shot settings.
We additionally propose a novel pipeline called $T3$ (Text-Tuple-Table) to improve their performance.
arXiv Detail & Related papers (2024-04-22T14:31:28Z)
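A toy version of the text-to-tuple-to-table pipeline named in the entry above; the regex and the (player, event) schema are illustrative assumptions only:

```python
# Pull (player, event) tuples out of commentary lines, then aggregate
# them globally into a per-player counts table.

import re
from collections import Counter

TUPLE_PATTERN = re.compile(r"(?P<player>[A-Z][a-z]+) (?P<event>scores|fouls)")

def text_to_tuples(commentary: list[str]) -> list[tuple[str, str]]:
    tuples = []
    for line in commentary:
        for m in TUPLE_PATTERN.finditer(line):
            tuples.append((m.group("player"), m.group("event")))
    return tuples

def tuples_to_table(tuples: list[tuple[str, str]]) -> dict:
    # Global aggregation over all extracted tuples, one row per player.
    table = {}
    for player, event in tuples:
        table.setdefault(player, Counter())[event] += 1
    return table

if __name__ == "__main__":
    commentary = ["Smith scores early.", "Jones fouls Smith.", "Smith scores again!"]
    print(tuples_to_table(text_to_tuples(commentary)))
```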
- Deja vu: Contrastive Historical Modeling with Prefix-tuning for Temporal Knowledge Graph Reasoning [16.408149489677154]
ChapTER is a Contrastive historical modeling framework with prefix-tuning for TEmporal Reasoning.
We evaluate ChapTER on four transductive and three few-shot inductive TKGR benchmarks.
arXiv Detail & Related papers (2024-03-25T17:25:40Z)
- Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions [18.64326057581588]
Unsupervised Text Style Transfer (UTST) has emerged as a critical task within the domain of Natural Language Processing (NLP).
We propose four ways of interaction, including a pipeline framework with tuned orders, knowledge distillation from Large Language Models (LLMs) to an attention-masking model, and in-context learning with constructed parallel examples.
We empirically show that these multi-way interactions can improve over the baselines in terms of style strength, content preservation, and text fluency.
arXiv Detail & Related papers (2024-02-21T09:28:02Z)
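A rough sketch of one interaction from the UTST entry above, masking style-bearing tokens and asking an LLM to fill the blanks; the fixed marker set (standing in for attention scores) and the `llm_fill` stub are assumptions for illustration:

```python
# Mask tokens flagged as style-bearing, then prompt an LLM to fill
# each blank in the target style.

STYLE_MARKERS = {"awesome", "gonna", "totally"}  # stand-in for attention scores

def mask_style_tokens(sentence: str) -> str:
    return " ".join("[MASK]" if w.lower().strip(".,!") in STYLE_MARKERS else w
                    for w in sentence.split())

def transfer(sentence: str, style: str, llm_fill) -> str:
    masked = mask_style_tokens(sentence)
    return llm_fill(f"Fill each [MASK] so the sentence reads as {style}: {masked}")

if __name__ == "__main__":
    stub = lambda prompt: "This movie is excellent."
    print(transfer("This movie is totally awesome.", "formal", stub))
```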
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
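A minimal sketch of the contextualization step described in the entry above, expanding a compact triplet into a descriptive passage; `llm_describe` and the prompt wording are hypothetical:

```python
# Ask an LLM to turn a structural (head, relation, tail) triplet into a
# context-rich passage that can augment KGC training text.

def contextualize_triplet(triplet: tuple[str, str, str], llm_describe) -> str:
    head, relation, tail = triplet
    prompt = (
        f"Write a short factual paragraph describing the relation "
        f"'{relation}' between '{head}' and '{tail}'."
    )
    return llm_describe(prompt)

if __name__ == "__main__":
    stub = lambda p: "Marie Curie was born in Warsaw, the capital of Poland..."
    print(contextualize_triplet(("Marie Curie", "born_in", "Warsaw"), stub))
```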
- TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve our task.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z)
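A sketch of turning an expert-annotated (table, text, question, answer) record into an instruction-tuning example, loosely following the TAT-LLM entry above; the exact template is an assumption, not the paper's format:

```python
# Format one expert-annotated record as a (instruction, input, output)
# example for fine-tuning a LLaMA-style model.

def format_example(table_rows: list[list[str]], passage: str,
                   question: str, derivation: str, answer: str) -> dict:
    table_str = "\n".join(" | ".join(row) for row in table_rows)
    return {
        "instruction": "Answer the question using the table and text. Show your steps.",
        "input": f"Table:\n{table_str}\nText: {passage}\nQuestion: {question}",
        # Supervising the derivation, not just the answer, targets discrete reasoning.
        "output": f"Steps: {derivation}\nAnswer: {answer}",
    }

if __name__ == "__main__":
    ex = format_example([["Year", "Revenue"], ["2020", "10"], ["2021", "14"]],
                        "Revenue is in millions.", "How much did revenue grow?",
                        "14 - 10 = 4", "4 million")
    print(ex["output"])
```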
- Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing [72.56219471145232]
We propose a ST/MT multi-tasking framework with hard parameter sharing.
Our method reduces the speech-text modality gap via a pre-processing stage.
We show that our framework improves attentional encoder-decoder, Connectionist Temporal Classification (CTC), transducer, and joint CTC/attention models by an average of +0.5 BLEU.
arXiv Detail & Related papers (2023-09-27T17:48:14Z)
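A minimal PyTorch sketch of hard parameter sharing as summarized above: both modalities pass through the same encoder after modality-specific input layers. The dimensions and two-layer encoder are illustrative assumptions:

```python
# ST/MT multi-tasking with a shared Transformer encoder and
# modality-specific input adapters.

import torch
import torch.nn as nn

class SharedSTMT(nn.Module):
    def __init__(self, vocab=1000, d=256, speech_dim=80):
        super().__init__()
        self.speech_in = nn.Linear(speech_dim, d)   # speech-specific adapter
        self.text_in = nn.Embedding(vocab, d)       # text-specific adapter
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # shared

    def forward(self, x, modality: str):
        h = self.speech_in(x) if modality == "speech" else self.text_in(x)
        return self.encoder(h)

if __name__ == "__main__":
    model = SharedSTMT()
    print(model(torch.randn(2, 50, 80), "speech").shape)          # ST branch
    print(model(torch.randint(0, 1000, (2, 20)), "text").shape)   # MT branch
```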
- Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
This paper surveys research works in the quickly advancing field of instruction tuning (IT).
In this paper, unless specified otherwise, instruction tuning (IT) is treated as equivalent to supervised fine-tuning (SFT).
arXiv Detail & Related papers (2023-08-21T15:35:16Z)
- VAE based Text Style Transfer with Pivot Words Enhancement Learning [5.717913255287939]
We propose a VAE based Text Style Transfer with pivOt Words Enhancement leaRning (VT-STOWER) method.
We introduce pivot words learning, which is applied to learn decisive words for a specific style.
The proposed VT-STOWER can be scaled to different TST scenarios with a novel and flexible style strength control mechanism.
arXiv Detail & Related papers (2021-12-06T16:41:26Z)
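A NumPy sketch of the flexible style-strength control mentioned in the VT-STOWER entry above; the additive composition of content and style latents is an assumption for illustration:

```python
# Scale a style vector before adding it to the VAE content latent;
# strength=0 keeps the content, larger values push toward the style.

import numpy as np

def apply_style(content_z: np.ndarray, style_z: np.ndarray,
                strength: float) -> np.ndarray:
    return content_z + strength * style_z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_content, z_style = rng.normal(size=8), rng.normal(size=8)
    for s in (0.0, 0.5, 1.0):
        print(s, np.round(apply_style(z_content, z_style, s), 2)[:3])
```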
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
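A toy simulation of the insertion-based, coarse-to-fine generation described in the POINTER entry above: start from hard keyword constraints and repeatedly insert tokens between existing ones until the predictor declines everywhere. The fixed rule table stands in for a learned insertion model:

```python
# Progressive insertion-based generation: each pass scans every gap
# between tokens and inserts a new token where the predictor offers one.

def one_pass(tokens, predict):
    out = []
    for i in range(len(tokens) + 1):
        left = tokens[i - 1] if i > 0 else "<s>"
        right = tokens[i] if i < len(tokens) else "</s>"
        new = predict(left, right)  # a token, or None for "no insertion"
        if new is not None:
            out.append(new)
        if i < len(tokens):
            out.append(tokens[i])
    return out

def generate(keywords, predict):
    tokens = list(keywords)
    while True:
        nxt = one_pass(tokens, predict)
        if nxt == tokens:  # no gap accepted an insertion: generation done
            return tokens
        tokens = nxt

if __name__ == "__main__":
    RULES = {("<s>", "cat"): "the", ("cat", "mat"): "sat",
             ("sat", "mat"): "on", ("on", "mat"): "the"}
    print(" ".join(generate(["cat", "mat"], lambda l, r: RULES.get((l, r)))))
    # coarse-to-fine: cat mat -> the cat sat mat -> ... -> the cat sat on the mat
```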