Shared Latent Space by Both Languages in Non-Autoregressive Neural
Machine Translation
- URL: http://arxiv.org/abs/2305.03511v1
- Date: Tue, 2 May 2023 15:33:09 GMT
- Title: Shared Latent Space by Both Languages in Non-Autoregressive Neural
Machine Translation
- Authors: DongNyeong Heo and Heeyoul Choi
- Abstract summary: We propose a new latent variable model based on a dual reconstruction perspective and an advanced hierarchical latent modeling approach.
Our proposed method, LadderNMT, shares a latent space across both languages, which hypothetically alleviates or solves the above disadvantages.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Latent variable modeling in non-autoregressive neural machine translation
(NAT) is a promising approach to mitigating the multimodality problem. Previous
works added an auxiliary model to estimate the posterior distribution of the
latent variable conditioned on the source and target sentences. However, this
causes several disadvantages, such as redundant information extraction in the
latent variable, an increased number of parameters, and a tendency to ignore
part of the information from the inputs. In this paper, we propose a new latent
variable model based on a dual reconstruction perspective and an advanced
hierarchical latent modeling approach. Our proposed method, LadderNMT, shares a
latent space across both languages so that it hypothetically alleviates or
solves the above disadvantages. Experimental results quantitatively and
qualitatively demonstrate that our proposed latent variable modeling learns an
advantageous latent space and significantly improves translation quality on WMT
translation tasks.
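To make the shared-latent-space idea concrete, here is a minimal, hypothetical sketch (illustrative only, not the LadderNMT implementation; the class name, dimensions, and the bag-of-words reconstruction heads are all assumptions) of a VAE-style NMT model in which posteriors inferred from the source and from the target land in one shared latent space, and a latent drawn from either side must reconstruct both languages, i.e. the dual-reconstruction view:

```python
# Minimal, hypothetical sketch of a shared latent space with dual
# reconstruction (illustrative only; NOT the LadderNMT implementation).
import torch
import torch.nn as nn

class SharedLatentSketch(nn.Module):
    def __init__(self, vocab_src, vocab_tgt, d_model=256, d_latent=64):
        super().__init__()
        self.emb_src = nn.Embedding(vocab_src, d_model)
        self.emb_tgt = nn.Embedding(vocab_tgt, d_model)
        # One posterior head used for both languages -> a single shared latent
        # space, instead of an auxiliary posterior network over (source, target).
        self.to_mu = nn.Linear(d_model, d_latent)
        self.to_logvar = nn.Linear(d_model, d_latent)
        # Bag-of-words style reconstruction heads (kept trivially simple here).
        self.dec_src = nn.Linear(d_latent, vocab_src)
        self.dec_tgt = nn.Linear(d_latent, vocab_tgt)

    def posterior(self, emb, tokens):
        h = emb(tokens).mean(dim=1)                            # crude sentence encoding
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return z, mu, logvar

    def forward(self, src, tgt):
        # Posteriors inferred from either language land in the same latent space.
        z_src, mu_s, lv_s = self.posterior(self.emb_src, src)
        z_tgt, mu_t, lv_t = self.posterior(self.emb_tgt, tgt)
        # Dual reconstruction: a latent from either side should explain both sentences.
        logits = {
            "src_from_src": self.dec_src(z_src), "tgt_from_src": self.dec_tgt(z_src),
            "src_from_tgt": self.dec_src(z_tgt), "tgt_from_tgt": self.dec_tgt(z_tgt),
        }
        # KL of each diagonal-Gaussian posterior against a standard normal prior.
        kl = 0.5 * ((mu_s ** 2 + lv_s.exp() - lv_s - 1).sum(-1)
                    + (mu_t ** 2 + lv_t.exp() - lv_t - 1).sum(-1)).mean()
        return logits, kl

# Toy usage with random token ids.
model = SharedLatentSketch(vocab_src=1000, vocab_tgt=1200)
src = torch.randint(0, 1000, (4, 12))
tgt = torch.randint(0, 1200, (4, 14))
logits, kl = model(src, tgt)
```

The only point of this sketch is that both languages are encoded into, and reconstructed from, a single latent space, so no separate auxiliary posterior network over the (source, target) pair is required; the actual hierarchical latent construction used by LadderNMT is described in the paper.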
Related papers
- Mitigating Data Imbalance and Representation Degeneration in
Multilingual Machine Translation [103.90963418039473]
Bi-ACL is a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model.
We show that Bi-ACL is more effective both in long-tail languages and in high-resource languages.
arXiv Detail & Related papers (2023-05-22T07:31:08Z)
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply total variation distance (TVD) to language generation.
We introduce the TaiLr objective, which balances the tradeoff in estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
However, they suffer from two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
- Variational Neural Machine Translation with Normalizing Flows [13.537869825364718]
Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations.
We propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows (see the flow-posterior sketch after this list).
arXiv Detail & Related papers (2020-05-28T13:30:53Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When coupled with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach that enforces implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Discrete Variational Attention Models for Language Generation [51.88612022940496]
We propose a discrete variational attention model that places a categorical distribution over the attention mechanism, motivated by the discrete nature of language.
Thanks to this discreteness, training does not suffer from posterior collapse.
arXiv Detail & Related papers (2020-04-21T05:49:04Z)
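The following is a minimal, hypothetical sketch of the idea behind flow-based approximate posteriors, referenced in the "Variational Neural Machine Translation with Normalizing Flows" entry above: a Gaussian posterior sample is pushed through a stack of planar flows, and the accumulated log-determinant terms enter the ELBO. This is not that paper's implementation; parameter names and dimensions are illustrative, and the usual invertibility constraint relating u and w is omitted for brevity.

```python
# Hypothetical planar-flow posterior sketch (illustrative only).
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar transform: z' = z + u * tanh(w.z + b)."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                              # (batch,)
        z_new = z + self.u * torch.tanh(lin).unsqueeze(-1)
        # log|det dz'/dz| = log|1 + u . (1 - tanh^2(w.z + b)) w|
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return z_new, log_det

# Usage: sample from the base Gaussian posterior, then push the sample through
# a stack of flows, accumulating the log-determinant corrections for the ELBO.
flows = nn.ModuleList([PlanarFlow(64) for _ in range(4)])
z = torch.randn(8, 64)                                         # base posterior sample
log_det_total = torch.zeros(8)
for f in flows:
    z, log_det = f(z)
    log_det_total = log_det_total + log_det
```

Stacking several such transforms is what gives the approximate posterior extra flexibility beyond a diagonal Gaussian; richer flow families follow the same sample-and-accumulate-log-det pattern.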