Semi-Supervised Formality Style Transfer with Consistency Training
- URL: http://arxiv.org/abs/2203.13620v1
- Date: Fri, 25 Mar 2022 12:40:36 GMT
- Title: Semi-Supervised Formality Style Transfer with Consistency Training
- Authors: Ao Liu, An Wang, Naoaki Okazaki
- Abstract summary: We propose a semi-supervised framework to better utilize source-side unlabeled sentences.
Specifically, our approach augments pseudo-parallel data obtained from a source-side informal sentence by enforcing the model to generate similar outputs for its perturbed version.
Our approach can achieve state-of-the-art results, even with less than 40% of the parallel data.
- Score: 14.837655109835769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Formality style transfer (FST) is a task that involves paraphrasing an
informal sentence into a formal one without altering its meaning. To address
the data-scarcity problem of existing parallel datasets, previous studies tend
to adopt a cycle-reconstruction scheme to utilize additional unlabeled data,
where the FST model mainly benefits from target-side unlabeled sentences. In
this work, we propose a simple yet effective semi-supervised framework to
better utilize source-side unlabeled sentences based on consistency training.
Specifically, our approach augments pseudo-parallel data obtained from a
source-side informal sentence by enforcing the model to generate similar
outputs for its perturbed version. Moreover, we empirically examine the
effects of various data perturbation methods and propose effective data
filtering strategies to improve our framework. Experimental results on the
GYAFC benchmark demonstrate that our approach can achieve state-of-the-art
results, even with less than 40% of the parallel data.
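For intuition, here is a minimal sketch of one way to realize such a consistency objective in PyTorch. It is not the authors' code: the FST model, the perturbation method, and the data filtering are abstracted away, and only a loss tying the model's outputs for an original and a perturbed source sentence together is shown.

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_orig, logits_pert, pad_mask):
    """Symmetric token-level KL divergence between the output
    distributions the model predicts for an unlabeled informal sentence
    and for a perturbed copy of it (one common consistency-training
    loss; the paper's exact formulation may differ).
    logits_*: (batch, seq_len, vocab); pad_mask: (batch, seq_len),
    1 for real tokens, 0 for padding."""
    log_p = F.log_softmax(logits_orig, dim=-1)
    log_q = F.log_softmax(logits_pert, dim=-1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="none").sum(-1)  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="none").sum(-1)  # KL(q || p)
    per_token = 0.5 * (kl_pq + kl_qp)
    return (per_token * pad_mask).sum() / pad_mask.sum()
```

In a full semi-supervised run, a term like this would be added, with a tunable weight, to the ordinary cross-entropy on the labeled parallel pairs, after filtering strategies have discarded unreliable pseudo-parallel pairs.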
Related papers
- T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data [0.0]
Self-supervised learning (SSL) generally involves generating different views of the same sample and thus requires data augmentations.
In the present work, we propose a novel augmentation-free SSL method for structured data.
Our approach, T-JEPA, relies on a Joint Embedding Predictive Architecture (JEPA) and is akin to mask reconstruction in the latent space.
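As a rough illustration of the JEPA idea applied to tabular rows, the sketch below (hypothetical layer sizes and masking rate; not the actual T-JEPA architecture) predicts, from a masked view of a row, the latent code that a target encoder assigns to the full row:

```python
import torch
import torch.nn as nn

class TabularJEPASketch(nn.Module):
    """Illustrative JEPA-style objective for tabular data. A context
    encoder sees a randomly masked view of the features; a predictor
    regresses the latent that a target encoder assigns to the full row.
    In practice the target encoder is usually an EMA copy of the
    context encoder rather than a separately trained network."""
    def __init__(self, n_features, dim=64):
        super().__init__()
        self.context_enc = nn.Sequential(
            nn.Linear(n_features, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.target_enc = nn.Sequential(
            nn.Linear(n_features, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.predictor = nn.Linear(dim, dim)

    def forward(self, x):
        mask = (torch.rand_like(x) > 0.3).float()  # hide ~30% of features
        z_ctx = self.context_enc(x * mask)
        with torch.no_grad():  # no gradient through the target branch
            z_tgt = self.target_enc(x)
        return ((self.predictor(z_ctx) - z_tgt) ** 2).mean()
```

Because the regression target lives in latent space rather than input space, no handcrafted augmentations are needed.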
arXiv Detail & Related papers (2024-10-07T13:15:07Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
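For context, the standard split conformal procedure that such work builds on fits in a few lines of NumPy; the version below assumes exchangeable data, which is precisely the assumption the paper relaxes:

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Plain split conformal prediction for classification.
    cal_probs/test_probs: (n, k) predicted class probabilities;
    cal_labels: (n,) integer labels. Nonconformity score is
    1 - probability assigned to the true class."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, level, method="higher")
    # A class enters the prediction set if its score is within the threshold.
    return [np.flatnonzero(1.0 - p <= q) for p in test_probs]
```

Under exchangeability, the returned sets contain the true label with probability at least 1 - alpha; restoring a guarantee of this kind for heterogeneous data is what the paper addresses.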
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
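A minimal sketch of the feature-alignment idea (illustrative only; CAFE's published objective is more involved) matches batch statistics of real and synthetic activations layer by layer:

```python
import torch

def feature_alignment_loss(feats_real, feats_syn):
    """Align mean activations of real and synthetic batches at every
    layer/scale of the network. feats_real and feats_syn are lists of
    (batch, dim_l) feature tensors, one entry per layer."""
    loss = torch.zeros(())
    for f_r, f_s in zip(feats_real, feats_syn):
        loss = loss + ((f_r.mean(dim=0) - f_s.mean(dim=0)) ** 2).sum()
    return loss
```

Minimizing such a distance with respect to the synthetic images pushes the condensed set to induce the same intermediate representations as the real data.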
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z)
- Dataset Condensation with Contrastive Signals [41.195453119305746]
Gradient matching-based dataset condensation (DC) methods can achieve state-of-the-art performance when applied to data-efficient learning tasks.
In this study, we prove that the existing DC methods can perform worse than the random selection method when task-irrelevant information forms a significant part of the training dataset.
We propose dataset condensation with Contrastive signals (DCC) by modifying the loss function to enable the DC methods to effectively capture the differences between classes.
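For reference, the gradient-matching step shared by DC-style methods looks roughly as follows (a sketch of the baseline objective the entry refers to, not of DCC's class-contrastive modification):

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, x_real, y_real, x_syn, y_syn):
    """Distance between the gradients a network produces on a real and
    a synthetic batch. The synthetic batch is the variable being
    optimized, hence create_graph=True on its side only."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_real = torch.autograd.grad(
        F.cross_entropy(model(x_real), y_real), params)
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(x_syn), y_syn), params, create_graph=True)
    # Sum of cosine distances between corresponding gradient tensors.
    return sum(1.0 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
               for a, b in zip(g_real, g_syn))
```

DCC's argument is that matching gradients in this way can miss class-discriminative signal, which their modified loss is designed to restore.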
arXiv Detail & Related papers (2022-02-07T03:05:32Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation (NDA) samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
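A sketch of such a discriminator objective (illustrative weighting and loss form; the paper's formulation may differ) treats negatively augmented real samples, e.g. jigsaw-shuffled images, as extra fakes:

```python
import torch
import torch.nn.functional as F

def nda_discriminator_loss(d_real, d_gen, d_nda, lam=0.25):
    """GAN discriminator loss with negative data augmentation (NDA):
    logits for real images (d_real) are pushed toward 'real', while
    logits for generated images (d_gen) and for negatively augmented
    real images (d_nda) are both pushed toward 'fake'."""
    loss = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
    loss += (1 - lam) * F.binary_cross_entropy_with_logits(
        d_gen, torch.zeros_like(d_gen))
    loss += lam * F.binary_cross_entropy_with_logits(
        d_nda, torch.zeros_like(d_nda))
    return loss
```

Since the NDA samples lie just outside the data support, penalizing them steers the generator away from producing such off-manifold outputs.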
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
Cutoff relies on sampling consistency and thus adds little computational overhead.
Cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
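As a rough sketch, a span-level cutoff view can be produced by zeroing a contiguous stretch of token embeddings (illustrative; the method also has token- and feature-level variants), with a consistency loss then tying predictions on the original and augmented views together:

```python
import torch

def span_cutoff(token_embeddings, ratio=0.15):
    """Zero out one contiguous span of each sequence's token embeddings
    to create an augmented view. token_embeddings: (batch, seq_len, dim).
    The span length is a fixed fraction of the sequence length."""
    out = token_embeddings.clone()
    batch, seq_len, _ = out.shape
    span = max(1, int(seq_len * ratio))
    for i in range(batch):
        start = int(torch.randint(0, seq_len - span + 1, (1,)))
        out[i, start:start + span] = 0.0
    return out
```

Because the augmentation is a simple masking of the input representation, it adds almost no computational overhead, consistent with the claim above.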
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
- Parallel Data Augmentation for Formality Style Transfer [27.557690344637034]
In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task.
Experiments demonstrate that our augmented parallel data largely helps improve formality style transfer when it is used to pre-train the model.
arXiv Detail & Related papers (2020-05-14T04:05:29Z)