Encouraging Neural Machine Translation to Satisfy Terminology
Constraints
- URL: http://arxiv.org/abs/2106.03730v1
- Date: Mon, 7 Jun 2021 15:46:07 GMT
- Title: Encouraging Neural Machine Translation to Satisfy Terminology
Constraints
- Authors: Melissa Ailem, Jingshu Liu, Raheel Qader
- Abstract summary: We present a new approach to encourage neural machine translation to satisfy lexical constraints.
Our method acts at the training step, thereby avoiding any extra computational overhead at the inference step.
- Score: 3.3108924994485096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a new approach to encourage neural machine translation to satisfy
lexical constraints. Our method acts at the training step, thereby avoiding
any extra computational overhead at the inference step. The proposed method
combines three main ingredients. The first consists in augmenting the training
data to specify the constraints; intuitively, this encourages the model to
learn a copy behavior when it encounters constraint terms. Compared to
previous work, we use a simplified augmentation strategy without source
factors. The second ingredient is constraint token masking, which makes it
even easier for the model to learn the copy behavior and generalize better.
The third is a modification of the standard cross-entropy loss that biases the
model towards assigning high probabilities to constraint words. Empirical
results show that our method improves upon related baselines in terms of both
BLEU score and the percentage of generated constraint terms.
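The three ingredients lend themselves to a compact illustration. Below is a minimal PyTorch-style sketch, not the authors' implementation: the `<sep>` and `<mask>` tokens, the function names, and the `constraint_weight` factor are assumptions made for illustration; the paper's actual augmentation format, masking scheme, and loss weighting may differ.

```python
import torch
import torch.nn.functional as F

def augment_source(src_tokens, constraint_pairs, sep="<sep>"):
    """Ingredient 1 (sketch): append the target-side constraint terms to the
    source sentence so the model can learn to copy them into its output.
    No source factors are used, only extra tokens."""
    augmented = list(src_tokens)
    for _src_term, tgt_term in constraint_pairs:
        augmented += [sep] + list(tgt_term)
    return augmented

def mask_constraints(src_tokens, constraint_pairs, mask="<mask>"):
    """Ingredient 2 (sketch): replace source-side occurrences of constraint
    terms with a mask token, pushing the model to rely on the appended
    target-side constraint (i.e. to copy it)."""
    constraint_src = {tok for src_term, _ in constraint_pairs for tok in src_term}
    return [mask if tok in constraint_src else tok for tok in src_tokens]

def constraint_biased_loss(logits, targets, constraint_mask, constraint_weight=2.0):
    """Ingredient 3 (sketch): cross entropy with up-weighted positions whose
    reference token belongs to a constraint term, biasing the model towards
    assigning those words high probability.

    logits:          (batch, seq_len, vocab)
    targets:         (batch, seq_len) reference token ids
    constraint_mask: (batch, seq_len) 1 where the reference token is a constraint word
    """
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)), targets.view(-1), reduction="none"
    )
    weights = torch.where(
        constraint_mask.view(-1).bool(),
        torch.full_like(per_token, constraint_weight),
        torch.ones_like(per_token),
    )
    return (weights * per_token).mean()
```

Since all three steps operate on the training data and the training objective, nothing changes in the decoder at test time, which is why the approach adds no inference-time overhead.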
Related papers
- Understanding the Double Descent Phenomenon in Deep Learning [49.1574468325115]
This tutorial sets out the classical statistical learning framework and introduces the double descent phenomenon.
By looking at a number of examples, Section 2 introduces inductive biases that appear to play a key role in double descent.
Section 3 explores double descent with two linear models and gives other points of view from recent related works.
arXiv Detail & Related papers (2024-03-15T16:51:24Z) - A Pseudo-Semantic Loss for Autoregressive Models with Logical
Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t. the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z) - Negative Lexical Constraints in Neural Machine Translation [1.3124513975412255]
Negative lexical constraining is used to prohibit certain words or expressions in the translation produced by the neural translation model.
We compare various methods based on modifying either the decoding process or the training data.
We demonstrate that our method improves the constraining, although the problem still persists in many cases.
arXiv Detail & Related papers (2023-08-07T14:04:15Z) - Confident Adaptive Language Modeling [95.45272377648773]
CALM is a framework for dynamically allocating different amounts of compute per input and generation timestep.
We demonstrate the efficacy of our framework in reducing compute -- a potential speedup of up to $\times 3$ -- while provably maintaining high performance.
arXiv Detail & Related papers (2022-07-14T17:00:19Z) - Integrated Training for Sequence-to-Sequence Models Using
Non-Autoregressive Transformer [49.897891031932545]
We propose a cascaded model based on the non-autoregressive Transformer that enables end-to-end training without the need for an explicit intermediate representation.
We conduct an evaluation on two pivot-based machine translation tasks, namely French-German and German-Czech.
arXiv Detail & Related papers (2021-09-27T11:04:09Z) - On the Reproducibility of Neural Network Predictions [52.47827424679645]
We study the problem of churn, identify factors that cause it, and propose two simple means of mitigating it.
We first demonstrate that churn is indeed an issue, even for standard image classification tasks.
We propose using minimum entropy regularizers to increase prediction confidences.
We present empirical results showing the effectiveness of both techniques in reducing churn while improving the accuracy of the underlying model.
arXiv Detail & Related papers (2021-02-05T18:51:01Z) - A Simple but Tough-to-Beat Data Augmentation Approach for Natural
Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
Cutoff relies on sampling consistency and thus adds little computational overhead.
Cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z) - Lexically Constrained Neural Machine Translation with Levenshtein
Transformer [8.831954614241234]
This paper proposes a simple and effective algorithm for incorporating lexical constraints in neural machine translation.
Our method injects terminology constraints at inference time without any impact on decoding speed.
arXiv Detail & Related papers (2020-04-27T09:59:27Z)