Target Conditioning for One-to-Many Generation
- URL: http://arxiv.org/abs/2009.09758v1
- Date: Mon, 21 Sep 2020 11:01:14 GMT
- Title: Target Conditioning for One-to-Many Generation
- Authors: Marie-Anne Lachaux, Armand Joulin, Guillaume Lample
- Abstract summary: We propose to explicitly model this one-to-many mapping by conditioning the decoder of an NMT model on a latent variable that represents the domain of target sentences.
At inference, we can generate diverse translations by decoding with different domains.
We assess the quality and diversity of translations generated by our model with several metrics, on three different datasets.
- Score: 30.402378832810697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Machine Translation (NMT) models often lack diversity in their
generated translations, even when paired with a search algorithm like beam
search. A challenge is that the diversity in translations is caused by the
variability in the target language, and cannot be inferred from the source
sentence alone. In this paper, we propose to explicitly model this one-to-many
mapping by conditioning the decoder of an NMT model on a latent variable that
represents the domain of target sentences. The domain is a discrete variable
generated by a target encoder that is jointly trained with the NMT model. The
predicted domain of the target sentence is given as input to the decoder during
training. At inference, we can generate diverse translations by decoding with
different domains. Unlike our strongest baseline (Shen et al., 2019), our
method can scale to any number of domains without affecting the performance or
the training time. We assess the quality and diversity of translations
generated by our model with several metrics, on three different datasets.
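Below is a minimal, illustrative sketch (in PyTorch) of the mechanism described in the abstract: a target encoder assigns each target sentence to one of K discrete domains, and the decoder receives the chosen domain as an extra input embedding; at inference, sweeping over domain ids yields diverse translations. Module names, tensor shapes, and the Gumbel-softmax choice are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch (not the authors' code): condition an NMT decoder on a
# discrete "domain" latent inferred from the target sentence during training.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetDomainEncoder(nn.Module):
    """Maps a target-sentence representation to one of K discrete domains."""
    def __init__(self, hidden_dim: int, num_domains: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_domains)

    def forward(self, tgt_repr: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        logits = self.classifier(tgt_repr)                   # (batch, K)
        # Gumbel-softmax gives a differentiable, (near-)one-hot domain choice.
        return F.gumbel_softmax(logits, tau=tau, hard=True)  # (batch, K)

class DomainConditionedDecoderInput(nn.Module):
    """Prepends a learned domain embedding to the decoder input sequence."""
    def __init__(self, hidden_dim: int, num_domains: int):
        super().__init__()
        self.domain_emb = nn.Embedding(num_domains, hidden_dim)

    def forward(self, dec_inputs: torch.Tensor, domain_onehot: torch.Tensor):
        # dec_inputs: (batch, tgt_len, hidden); domain_onehot: (batch, K)
        dom = domain_onehot @ self.domain_emb.weight         # (batch, hidden)
        return torch.cat([dom.unsqueeze(1), dec_inputs], dim=1)

# At inference there is no target sentence: sweep over domain ids to obtain
# diverse translations, e.g.
#   for d in range(num_domains):
#       onehot = F.one_hot(torch.tensor([d]), num_domains).float()
#       inputs = conditioner(dec_inputs, onehot)
#       ...decode as usual...
```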
Related papers
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT achieves substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding [5.1282202633907]
We propose a novel Transformer-based language model named VarMAE for domain-adaptive language understanding.
Under the masked autoencoding objective, we design a context uncertainty learning module to encode the token's context into a smooth latent distribution.
Experiments on science- and finance-domain NLU tasks demonstrate that VarMAE can be efficiently adapted to new domains with limited resources.
arXiv Detail & Related papers (2022-11-01T12:51:51Z)
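The "context uncertainty learning" idea summarized above, encoding a token's context into a smooth latent distribution, can be roughly illustrated with a diagonal Gaussian head trained with the reparameterization trick and a KL regularizer. This is a generic sketch under that assumption, not VarMAE's actual implementation.

```python
# Generic sketch: encode a token's context as a Gaussian latent
# (reparameterization + KL regularizer); not VarMAE's actual code.
import torch
import torch.nn as nn

class ContextUncertaintyHead(nn.Module):
    def __init__(self, hidden_dim: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, context: torch.Tensor):
        # context: (batch, seq_len, hidden) contextual token representations
        mu, logvar = self.mu(context), self.logvar(context)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)           # reparameterization trick
        # KL(q(z|x) || N(0, I)) keeps the latent distribution smooth.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl
```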
- QAGAN: Adversarial Approach To Learning Domain Invariant Language Features [0.76146285961466]
We explore an adversarial training approach to learning domain-invariant features.
We achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on an out-of-domain validation dataset.
arXiv Detail & Related papers (2022-06-24T17:42:18Z)
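A common way to implement adversarial learning of domain-invariant features is a domain discriminator trained through a gradient reversal layer; the sketch below shows that generic pattern and may differ from QAGAN's exact formulation.

```python
# Generic gradient-reversal sketch for domain-invariant features
# (a common realization of adversarial domain adaptation; not QAGAN's code).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradients push the feature encoder toward domain invariance.
        return -ctx.lambd * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, hidden_dim: int, num_domains: int = 2, lambd: float = 1.0):
        super().__init__()
        self.lambd = lambd
        self.net = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, num_domains))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(GradReverse.apply(features, self.lambd))
```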
- Beyond Deterministic Translation for Unsupervised Domain Adaptation [19.358300726820943]
In this work we challenge the common approach of using a one-to-one mapping ("translation") between the source and target domains in unsupervised domain adaptation (UDA).
Instead, we rely on translation to capture inherent ambiguities between the source and target domains.
We report improvements over strong recent baselines, leading to state-of-the-art UDA results on two challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-02-15T23:03:33Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
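The token-level k-nearest-neighbor retrieval mentioned above is usually realized with a datastore mapping decoder hidden states to their gold next tokens, whose retrieval distribution is interpolated with the NMT model's distribution. The sketch below illustrates that generic kNN-MT step; the datastore layout and hyperparameters are assumptions, not the paper's exact setup.

```python
# Generic kNN-MT-style interpolation sketch (assumed layout, not the paper's code):
# the datastore maps decoder hidden states ("keys") to their gold next tokens ("values").
import torch
import torch.nn.functional as F

def knn_interpolate(query, keys, values, model_log_probs, vocab_size,
                    k=8, temperature=10.0, lam=0.5):
    # query: (hidden,)  keys: (N, hidden)  values: (N,) int64 token ids
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)       # (N,)
    knn_dists, idx = torch.topk(dists, k, largest=False)
    weights = F.softmax(-knn_dists / temperature, dim=0)           # closer = heavier
    knn_probs = torch.zeros(vocab_size).scatter_add_(0, values[idx], weights)
    # Interpolate the retrieval distribution with the NMT model's distribution.
    return lam * knn_probs + (1 - lam) * model_log_probs.exp()
```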
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
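As a rough illustration of objectives that produce inputs resembling real sentences by reordering and replacing words rather than masking, here is a toy noising function; the specific noise rates and strategies are assumptions, not the paper's recipe.

```python
# Toy noising sketch: perturb a sentence by locally reordering and replacing
# words, as an alternative to masking (rates and strategy are illustrative only).
import random

def noise_sentence(tokens, vocab, swap_prob=0.1, replace_prob=0.1, seed=None):
    rng = random.Random(seed)
    out = list(tokens)
    # Local reordering: occasionally swap adjacent words.
    for i in range(len(out) - 1):
        if rng.random() < swap_prob:
            out[i], out[i + 1] = out[i + 1], out[i]
    # Replacement: occasionally substitute a word with another vocabulary item.
    for i in range(len(out)):
        if rng.random() < replace_prob:
            out[i] = rng.choice(vocab)
    return out

# Example: noise_sentence("the cat sat on the mat".split(), vocab=["dog", "ran", "a"])
```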
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
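Dynamic Blocking, as summarized above, discourages verbatim copying at decoding time; one common reading is that when the last generated token also appears in the source, the source token that immediately follows it is blocked at the next step. The sketch below follows that reading and is not the authors' implementation.

```python
# Simplified sketch of a Dynamic-Blocking-style constraint (our reading of the
# idea, not the authors' implementation): if the last generated token appears
# in the source, down-weight the source token that immediately follows it.
import torch

def apply_dynamic_blocking(next_token_logits, source_ids, last_token_id,
                           block_value=float("-inf"), block_prob=1.0):
    # next_token_logits: (vocab,)  source_ids: list[int]  last_token_id: int
    logits = next_token_logits.clone()
    for i, tok in enumerate(source_ids[:-1]):
        if tok == last_token_id and torch.rand(()) < block_prob:
            logits[source_ids[i + 1]] = block_value  # block the copy continuation
    return logits
```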
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.