Towards Multimodal Simultaneous Neural Machine Translation
- URL: http://arxiv.org/abs/2004.03180v2
- Date: Fri, 23 Oct 2020 04:38:38 GMT
- Title: Towards Multimodal Simultaneous Neural Machine Translation
- Authors: Aizhan Imankulova, Masahiro Kaneko, Tosho Hirasawa and Mamoru Komachi
- Abstract summary: Simultaneous translation involves translating a sentence before the speaker's utterance is completed in order to realize real-time understanding.
This task is significantly more challenging than general full-sentence translation because of the shortage of input information during decoding.
We propose multimodal simultaneous neural machine translation (MSNMT), which leverages visual information as an additional modality.
- Score: 28.536262015508722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simultaneous translation involves translating a sentence before the speaker's
utterance is completed in order to realize real-time understanding in multiple
languages. This task is significantly more challenging than general
full-sentence translation because of the shortage of input information during
decoding. To alleviate this shortage, we propose multimodal simultaneous neural
machine translation (MSNMT), which leverages visual information as an
additional modality. Our experiments with the Multi30k dataset showed that
MSNMT significantly outperforms its text-only counterpart in more timely
translation situations with low latency. Furthermore, we verified the
importance of visual information during decoding by performing an adversarial
evaluation of MSNMT, in which we studied how models behaved with an incongruent
input modality and analyzed the effect of word-order differences between the
source and target languages.
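As a rough illustration of the decoding setup the abstract describes, here is a minimal sketch of greedy wait-k simultaneous decoding with an image feature injected into the decoder state. All names (`TinyMSNMT`, `translate_wait_k`, the GRU architecture, the 2048-dimensional image feature) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of wait-k simultaneous decoding with a visual context
# vector. Architecture and names are assumptions, not the paper's model.
import torch
import torch.nn as nn

class TinyMSNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=64, img_dim=2048):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.decoder = nn.GRUCell(d_model, d_model)
        self.img_proj = nn.Linear(img_dim, d_model)  # project the image feature
        self.out = nn.Linear(d_model, tgt_vocab)

    @torch.no_grad()
    def translate_wait_k(self, src_ids, img_feat, k=3, bos=1, eos=2, max_len=30):
        """Greedy wait-k decoding: read k source tokens before the first write,
        then alternate read/write. The image feature initializes the decoder
        state, so visual context is available from the very first step."""
        h = self.img_proj(img_feat)        # decoder state starts from the image
        enc_state, read, outputs = None, 0, []
        y = torch.tensor([bos])
        for t in range(max_len):
            # READ: consume source tokens until k + t have been read (or input ends)
            while read < min(k + t, src_ids.size(0)):
                _, enc_state = self.encoder(
                    self.src_emb(src_ids[read]).view(1, 1, -1), enc_state)
                read += 1
            # WRITE: emit one target token conditioned on decoder + source state
            ctx = enc_state.squeeze(0) if enc_state is not None else torch.zeros_like(h)
            h = self.decoder(self.tgt_emb(y), h + ctx)  # crude additive fusion (sketch)
            y = self.out(h).argmax(-1)
            if y.item() == eos:
                break
            outputs.append(y.item())
        return outputs

model = TinyMSNMT(src_vocab=100, tgt_vocab=100)
src = torch.randint(3, 100, (7,))   # dummy source sentence
img = torch.randn(1, 2048)          # dummy ResNet-style image feature
print(model.translate_wait_k(src, img, k=3))
```

Smaller k trades quality for lower latency; the point of the sketch is that the image is available before any source tokens arrive, which is where the low-latency gains reported above would come from.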
Related papers
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages [96.8603701943286]
Tri-Modal Translation (TMT) model translates between arbitrary modalities spanning speech, image, and text.
We tokenize speech and image data into discrete tokens, which provide a unified interface across modalities.
TMT consistently outperforms its single-model counterparts.
arXiv Detail & Related papers (2024-02-25T07:46:57Z)
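The "discrete tokens as a unified interface" idea from the TMT summary can be illustrated with a toy mapping into disjoint ranges of one shared vocabulary. The offsets, sizes, and function name below are assumptions for illustration, not the TMT implementation.

```python
# Toy illustration of treating modalities as "languages" over one shared
# discrete vocabulary. Vocabulary sizes and offsets are made-up examples.
TEXT_VOCAB, SPEECH_CODES, IMAGE_CODES = 32000, 1024, 8192

def to_shared_ids(tokens, modality):
    """Map modality-specific token ids into disjoint ranges of a single
    vocabulary, so one translation model can consume any modality."""
    offset = {"text": 0,
              "speech": TEXT_VOCAB,
              "image": TEXT_VOCAB + SPEECH_CODES}[modality]
    return [offset + t for t in tokens]

# e.g. units from a hypothetical acoustic tokenizer and a hypothetical
# VQ image codebook:
print(to_shared_ids([5, 17, 9], "speech"))   # -> [32005, 32017, 32009]
print(to_shared_ids([3, 3, 120], "image"))   # -> [33027, 33027, 33144]
```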
- Extending Multilingual Machine Translation through Imitation Learning [60.15671816513614]
Imit-MNMT treats the task as an imitation learning process that mimics the behavior of an expert.
We show that our approach significantly improves the translation performance between the new and the original languages.
We also demonstrate that our approach is capable of solving copy and off-target problems.
arXiv Detail & Related papers (2023-11-14T21:04:03Z)
- Understanding and Bridging the Modality Gap for Speech Translation [11.13240570688547]
Multi-task learning is an effective way to share knowledge between machine translation (MT) and end-to-end speech translation (ST).
However, due to the differences between speech and text, there is always a gap between ST and MT.
In this paper, we first aim to understand this modality gap from the target-side representation differences, and link the modality gap to another well-known problem in neural machine translation: exposure bias.
arXiv Detail & Related papers (2023-05-15T15:09:18Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages [5.367993194110256]
DivEMT is the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages.
We assess the impact on translation productivity of two state-of-the-art NMT systems, namely Google Translate and the open-source multilingual model mBART50.
arXiv Detail & Related papers (2022-05-24T17:22:52Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Simultaneous Machine Translation with Visual Context [42.88121241096681]
Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.
We analyse the impact of different multimodal approaches and visual features on state-of-the-art SiMT frameworks.
arXiv Detail & Related papers (2020-09-15T18:19:11Z)
- Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting [105.5303416210736]
Unsupervised machine translation (MT) has recently achieved impressive results with monolingual corpora only.
It is still challenging to associate source-target sentences in the latent space.
As people who speak different languages biologically share similar visual systems, the potential of achieving better alignment through visual content is promising.
arXiv Detail & Related papers (2020-05-06T20:11:46Z)
- It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)
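The XMI metric admits a compact definition. As a hedged reconstruction from the summary above (with $H_{q_{\mathrm{LM}}}$ and $H_{q_{\mathrm{MT}}}$ denoting the cross-entropies of a target-side language model and of the translation model, respectively), it can be written as:

```latex
% Cross-mutual information: how much conditioning on the source X reduces
% the modeled uncertainty about the target Y (asymmetric in X and Y).
\mathrm{XMI}(X \to Y) \;=\; H_{q_{\mathrm{LM}}}(Y) \;-\; H_{q_{\mathrm{MT}}}(Y \mid X)
```

Because both terms are cross-entropies under learned models rather than true entropies, XMI measures translation difficulty as experienced by the systems themselves, which is what makes the metric asymmetric across translation directions.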
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.