Parallel Data Augmentation for Formality Style Transfer
- URL: http://arxiv.org/abs/2005.07522v1
- Date: Thu, 14 May 2020 04:05:29 GMT
- Title: Parallel Data Augmentation for Formality Style Transfer
- Authors: Yi Zhang, Tao Ge, Xu Sun
- Abstract summary: In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task.
Experiments demonstrate that our augmented parallel data substantially improves formality style transfer when used to pre-train the model.
- Score: 27.557690344637034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main barrier to progress in the task of Formality Style Transfer is the
inadequacy of training data. In this paper, we study how to augment parallel
data and propose novel and simple data augmentation methods for this task to
obtain useful sentence pairs with easily accessible models and systems.
Experiments demonstrate that our augmented parallel data substantially improves
formality style transfer when it is used to pre-train the model, leading to
state-of-the-art results on the GYAFC benchmark dataset.
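The abstract does not spell out the augmentation procedures, so the sketch below is only a generic illustration of the overall idea: creating pseudo-parallel (informal, formal) sentence pairs by applying cheap rule-based corruptions to formal sentences, which can then be used to pre-train a transfer model before fine-tuning on GYAFC. The corruption rules and helper names are assumptions for illustration, not the methods proposed in the paper.

```python
# Hypothetical sketch: build pseudo-parallel (informal -> formal) pairs by
# corrupting formal sentences with simple rules. These rules are illustrative
# assumptions, not the augmentation methods proposed in the paper.
import random
import re

CONTRACTIONS = {
    "do not": "don't",
    "cannot": "can't",
    "it is": "it's",
    "I am": "I'm",
}

def informalize(sentence: str) -> str:
    """Apply cheap, rule-based corruptions to a formal sentence."""
    s = sentence
    for formal, informal in CONTRACTIONS.items():
        s = re.sub(re.escape(formal), informal, s, flags=re.IGNORECASE)
    s = s.lower().rstrip(".")  # drop capitalization and the final period
    tokens = s.split()
    if len(tokens) > 3 and random.random() < 0.3:
        tokens.pop(random.randrange(len(tokens)))  # randomly drop a word as noise
        s = " ".join(tokens)
    return s

def build_pseudo_parallel(formal_corpus):
    """Yield (informal, formal) pairs for seq2seq pre-training."""
    for formal in formal_corpus:
        yield informalize(formal), formal

if __name__ == "__main__":
    corpus = [
        "I do not believe that this proposal is feasible.",
        "It is important to submit the report on time.",
    ]
    for informal, formal in build_pseudo_parallel(corpus):
        print(f"{informal}\t{formal}")
```

In the pre-train-then-fine-tune setting the abstract describes, pairs like these would feed an initial seq2seq training phase, after which the model is fine-tuned on the human-annotated GYAFC pairs.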
Related papers
- Latent mixed-effect models for high-dimensional longitudinal data [6.103940626659986]
We propose LMM-VAE, a scalable, interpretable and identifiable model for longitudinal data.
We highlight theoretical connections between it and GP-based techniques, providing a unified framework for this class of methods.
arXiv Detail & Related papers (2024-09-17T09:16:38Z) - Encapsulating Knowledge in One Prompt [56.31088116526825]
KiOP encapsulates knowledge from various models into a solitary prompt without altering the original models or requiring access to the training data.
From a practicality standpoint, this paradigm proves the effectiveness of Visual Prompt in data inaccessible contexts.
Experiments across various datasets and models demonstrate the efficacy of the proposed KiOP knowledge transfer paradigm.
arXiv Detail & Related papers (2024-07-16T16:35:23Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - Non-Parallel Text Style Transfer with Self-Parallel Supervision [19.441780035577352]
We propose LaMer, a novel text style transfer framework based on large-scale language models.
LaMer first mines the roughly parallel expressions in the non-parallel datasets with scene graphs, and then employs MLE training, followed by imitation learning refinement, to leverage the intrinsic parallelism within the data.
On two benchmark tasks (sentiment & formality transfer) and a newly proposed challenging task (political stance transfer), our model achieves qualitative advances in transfer accuracy, content preservation, and fluency.
arXiv Detail & Related papers (2022-04-18T01:38:35Z) - Semi-Supervised Formality Style Transfer with Consistency Training [14.837655109835769]
We propose a semi-supervised framework to better utilize source-side unlabeled sentences.
Specifically, our approach augments pseudo-parallel data obtained from a source-side informal sentence.
Our approach can achieve state-of-the-art results, even with less than 40% of the parallel data.
arXiv Detail & Related papers (2022-03-25T12:40:36Z) - How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z) - Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data.
arXiv Detail & Related papers (2021-04-07T17:59:06Z) - Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization [52.867459839641526]
Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences.
We propose a semi-supervised formality style transfer model that utilizes a language model-based discriminator to maximize the likelihood of the output sentence being formal.
Experiments showed that our model outperformed previous state-of-the-art baselines significantly in terms of both automated metrics and human judgement.
arXiv Detail & Related papers (2020-10-10T21:05:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.