Neighbors Are Not Strangers: Improving Non-Autoregressive Translation
under Low-Frequency Lexical Constraints
- URL: http://arxiv.org/abs/2204.13355v1
- Date: Thu, 28 Apr 2022 08:57:47 GMT
- Title: Neighbors Are Not Strangers: Improving Non-Autoregressive Translation
under Low-Frequency Lexical Constraints
- Authors: Chun Zeng, Jiangjie Chen, Tianyi Zhuang, Rui Xu, Hao Yang, Ying Qin,
Shimin Tao, Yanghua Xiao
- Abstract summary: We focus on non-autoregressive translation (NAT) for this problem for its efficiency advantage.
We identify that current constrained NAT models, which are based on iterative editing, do not handle low-frequency constraints well.
We propose a plug-in algorithm for this line of work, i.e., Aligned Constrained Training (ACT), which alleviates this problem by familiarizing the model with the source-side context.
- Score: 33.74298014783385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: However, current autoregressive approaches suffer from high latency. In this
paper, we focus on non-autoregressive translation (NAT) for this problem for
its efficiency advantage. We identify that current constrained NAT models,
which are based on iterative editing, do not handle low-frequency constraints
well. To this end, we propose a plug-in algorithm for this line of work, i.e.,
Aligned Constrained Training (ACT), which alleviates this problem by
familiarizing the model with the source-side context of the constraints.
Experiments on the general and domain datasets show that our model improves
over the backbone constrained NAT model in constraint preservation and
translation quality, especially for rare constraints.
Related papers
- Intertwining CP and NLP: The Generation of Unreasonably Constrained Sentences [49.86129209397701]
An approach for generating constrained sentences in CP has been proposed in (Bonlarron et al., 2023).
This paper introduces a novel, more generic approach to tackle many of these previously intractable problems.
Thanks to the CP-based approach, strongly constrained sentences have been successfully generated.
arXiv Detail & Related papers (2024-06-15T17:40:49Z)
- Achieving Constraints in Neural Networks: A Stochastic Augmented Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ the Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
arXiv Detail & Related papers (2023-10-25T13:55:35Z)
- Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order.
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs)
GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation [18.205288788056787]
Non-autoregressive translation (NAT) reduces the decoding latency but suffers from performance degradation due to the multi-modality problem.
In this paper, we hold the view that all paths in the graph are fuzzily aligned with the reference sentence.
We do not require the exact alignment but train the model to maximize a fuzzy alignment score between the graph and reference, which takes translations captured in all modalities into account.
arXiv Detail & Related papers (2023-03-12T13:51:38Z)
- Symmetric Tensor Networks for Generative Modeling and Constrained Combinatorial Optimization [72.41480594026815]
Constrained optimization problems abound in industry, from portfolio optimization to logistics.
One of the major roadblocks in solving these problems is the presence of non-trivial hard constraints which limit the valid search space.
In this work, we encode arbitrary integer-valued equality constraints of the form Ax=b directly into U(1) symmetric tensor networks (TNs) and leverage their applicability as quantum-inspired generative models.
arXiv Detail & Related papers (2022-11-16T18:59:54Z)
- Modeling Coverage for Non-Autoregressive Neural Machine Translation [9.173385214565451]
We propose a novel Coverage-NAT to model the coverage information directly by a token-level coverage iterative refinement mechanism and a sentence-level coverage agreement.
Experimental results on WMT14 En-De and WMT16 En-Ro translation tasks show that our method can alleviate those errors and achieve strong improvements over the baseline system.
arXiv Detail & Related papers (2021-04-24T07:33:23Z)
- Understanding and Improving Lexical Choice in Non-Autoregressive Translation [98.11249019844281]
We propose to expose the raw data to NAT models to restore the useful information of low-frequency words.
Our approach pushes the SOTA NAT performance on the WMT14 English-German and WMT16 Romanian-English datasets up to 27.8 and 33.8 BLEU points, respectively.
arXiv Detail & Related papers (2020-12-29T03:18:50Z)
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer [8.831954614241234]
This paper proposes a simple and effective algorithm for incorporating lexical constraints in neural machine translation.
Our method injects terminology constraints at inference time without any impact on decoding speed.
arXiv Detail & Related papers (2020-04-27T09:59:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.