On Learning Text Style Transfer with Direct Rewards
- URL: http://arxiv.org/abs/2010.12771v2
- Date: Thu, 13 May 2021 15:00:38 GMT
- Title: On Learning Text Style Transfer with Direct Rewards
- Authors: Yixin Liu, Graham Neubig, John Wieting
- Abstract summary: Lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models.
Our model provides significant gains in both automatic and human evaluation over strong baselines.
- Score: 101.97136885111037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In most cases, the lack of parallel corpora makes it impossible to directly
train supervised models for the text style transfer task. In this paper, we
explore training algorithms that instead optimize reward functions that
explicitly consider different aspects of the style-transferred outputs. In
particular, we leverage semantic similarity metrics originally used for
fine-tuning neural machine translation models to explicitly assess the
preservation of content between system outputs and input texts. We also
investigate the potential weaknesses of the existing automatic metrics and
propose efficient strategies of using these metrics for training. The
experimental results show that our model provides significant gains in both
automatic and human evaluation over strong baselines, indicating the
effectiveness of our proposed methods and training strategies.
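The abstract describes training against reward functions that score different aspects of the output instead of using parallel data. The sketch below illustrates that general idea only and is not the authors' implementation. The abstract names content preservation; the style and fluency terms, the weighted sum, the REINFORCE-style surrogate loss, and every function name here are assumptions made for illustration.

```python
import math

def combined_reward(content_sim, style_prob, fluency, weights=(1.0, 1.0, 1.0)):
    """Hypothetical weighted sum of per-aspect rewards, each assumed to lie in [0, 1]."""
    w_content, w_style, w_fluency = weights
    return w_content * content_sim + w_style * style_prob + w_fluency * fluency

def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE-style surrogate loss for one sampled output.

    Minimizing it raises the log-probability of samples whose reward
    exceeds the baseline and lowers it for samples below the baseline.
    """
    return -(reward - baseline) * sum(token_log_probs)

# Toy sampled output of two tokens with probabilities 0.5 and 0.25 (made-up values).
sample_log_probs = [math.log(0.5), math.log(0.25)]
reward = combined_reward(content_sim=0.8, style_prob=0.9, fluency=0.7)
loss = reinforce_loss(sample_log_probs, reward, baseline=0.4)
```

In a real system the per-aspect scores would come from learned models (for example, a semantic similarity metric for content preservation), and the loss would be backpropagated through the generator's token log-probabilities.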
Related papers
- Mind the Gap: A Generalized Approach for Cross-Modal Embedding Alignment [0.0]
Retrieval-Augmented Generation (RAG) systems often struggle to retrieve context across different text modalities due to semantic gaps.
We introduce a generalized projection-based method, inspired by adapter modules in transfer learning, that efficiently bridges these gaps.
Our approach emphasizes speed, accuracy, and data efficiency, requiring minimal resources for training and inference.
arXiv Detail & Related papers (2024-10-30T20:28:10Z)
- Adjusting Pretrained Backbones for Performativity [34.390793811659556]
We propose a novel technique to adjust pretrained backbones for performativity in a modular way.
We show how it leads to smaller loss along the retraining trajectory and enables us to effectively select among candidate models to anticipate performance degradations.
arXiv Detail & Related papers (2024-10-06T14:41:13Z)
- Style Transfer with Multi-iteration Preference Optimization [27.5647739554034]
We consider the relationship between reinforcement learning and preference optimization.
Inspired by these techniques from the past, we improve upon established preference optimization approaches.
We evaluate our model on two commonly used text style transfer datasets.
arXiv Detail & Related papers (2024-06-17T14:20:53Z)
- DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning [75.68193159293425]
In-context learning (ICL) allows transformer-based language models to learn a specific task with a few "task demonstrations" without updating their parameters.
We propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL.
We experimentally prove the wide applicability of DETAIL by showing our attribution scores obtained on white-box models are transferable to black-box models in improving model performance.
arXiv Detail & Related papers (2024-05-22T15:52:52Z)
- Unsupervised 3D registration through optimization-guided cyclical self-training [71.75057371518093]
State-of-the-art deep learning-based registration methods employ three different learning strategies.
We propose a novel self-supervised learning paradigm for unsupervised registration, relying on self-training.
We evaluate the method for abdomen and lung registration, consistently surpassing metric-based supervision and outperforming diverse state-of-the-art competitors.
arXiv Detail & Related papers (2023-06-29T14:54:10Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Empirical Evaluation of Supervision Signals for Style Transfer Models [44.39622949370144]
In this work we empirically compare the dominant optimization paradigms which provide supervision signals during training.
We find that backtranslation has model-specific limitations, which inhibits training style transfer models.
We also experiment with Minimum Risk Training, a popular technique in the machine translation community, which, to our knowledge, has not been empirically evaluated in the task of style transfer.
arXiv Detail & Related papers (2021-01-15T15:33:30Z)
- Guiding Attention for Self-Supervised Learning with Transformers [24.785500242464646]
We propose a technique to allow for efficient self-supervised learning with bi-directional Transformers.
Our approach is motivated by recent studies demonstrating that self-attention patterns in trained models contain a majority of non-linguistic regularities.
arXiv Detail & Related papers (2020-10-06T00:04:08Z)
- A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies on sampling consistency and thus adds little computational overhead.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
- Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.