Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features
- URL: http://arxiv.org/abs/2408.01394v1
- Date: Fri, 2 Aug 2024 17:10:12 GMT
- Title: Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features
- Authors: Mengyu Bu, Shuhao Gu, Yang Feng
- Abstract summary: We propose to exploit both semantic and linguistic features across multiple languages to enhance multilingual translation.
On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features.
On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features to assist in target language generation.
- Score: 18.76505158652759
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Many-to-many multilingual neural machine translation can be regarded as the process of integrating semantic features from the source sentences with linguistic features from the target sentences. To enhance zero-shot translation, models need to share knowledge across languages, which can be achieved through auxiliary tasks that learn a universal representation or a cross-lingual mapping. To this end, we propose to exploit both semantic and linguistic features across multiple languages to enhance multilingual translation. On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features, thus facilitating knowledge transfer while preserving complete information. On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features to assist in target language generation. Experimental results on multilingual datasets demonstrate significant improvements in zero-shot translation over the baseline system, while maintaining performance in supervised translation. Further analysis validates the effectiveness of our method in leveraging both semantic and linguistic features. The code is available at https://github.com/ictnlp/SemLing-MNMT.
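To make the encoder-side objective concrete, here is a minimal PyTorch sketch of a disentangling loss of this kind: pooled encoder states are projected into a semantic and a linguistic subspace, the semantic parts of a parallel pair are aligned, and a reconstruction term preserves the complete information. The module `DisentangleHead`, the mean-pooling, and the 0.1 loss weight are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangleHead(nn.Module):
    """Projects pooled encoder states into semantic and linguistic subspaces."""
    def __init__(self, d_model: int):
        super().__init__()
        self.sem_proj = nn.Linear(d_model, d_model)    # semantic subspace
        self.lin_proj = nn.Linear(d_model, d_model)    # linguistic subspace
        self.recon = nn.Linear(2 * d_model, d_model)   # information preservation

    def forward(self, enc_out: torch.Tensor):
        # enc_out: (batch, seq_len, d_model) -> mean-pool over tokens
        pooled = enc_out.mean(dim=1)
        sem, lin = self.sem_proj(pooled), self.lin_proj(pooled)
        recon = self.recon(torch.cat([sem, lin], dim=-1))
        return sem, lin, recon, pooled

def disentangling_loss(head, src_enc, tgt_enc):
    """Align semantic features of a parallel pair, push apart the linguistic
    ones, and require both parts together to reconstruct the pooled state."""
    s_sem, s_lin, s_rec, s_pool = head(src_enc)
    t_sem, t_lin, t_rec, t_pool = head(tgt_enc)
    align = 1 - F.cosine_similarity(s_sem, t_sem, dim=-1).mean()
    keep = F.mse_loss(s_rec, s_pool) + F.mse_loss(t_rec, t_pool)
    separate = F.cosine_similarity(s_lin, t_lin, dim=-1).abs().mean()
    return align + keep + 0.1 * separate  # 0.1 is an illustrative weight
```

In training, an auxiliary loss like this would be added to the usual cross-entropy translation loss.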
Related papers
- UniPSDA: Unsupervised Pseudo Semantic Data Augmentation for Zero-Shot Cross-Lingual Natural Language Understanding [31.272603877215733]
Cross-lingual representation learning transfers knowledge from resource-rich languages to resource-scarce ones to improve semantic understanding across languages.
We propose an Unsupervised Pseudo Semantic Data Augmentation (UniPSDA) mechanism for cross-lingual natural language understanding that enriches the training data without human intervention.
arXiv Detail & Related papers (2024-06-24T07:27:01Z) - Towards a Deep Understanding of Multilingual End-to-End Speech
Translation [52.26739715012842]
We analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.
We derive three major findings from our analysis.
arXiv Detail & Related papers (2023-10-31T13:50:55Z) - Informative Language Representation Learning for Massively Multilingual
Neural Machine Translation [47.19129812325682]
In a multilingual neural machine translation model, an artificial language token is usually used to guide translation into the desired target language.
Recent studies show that prepending language tokens sometimes fails to steer multilingual neural machine translation models into the right translation directions.
We propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations that channel translation into the right directions.
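As a concrete illustration of language-aware attention, here is a hedged PyTorch sketch in which a learned language embedding conditions the attention queries; this query-bias formulation and the class name `LanguageAwareAttention` are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class LanguageAwareAttention(nn.Module):
    """Multi-head attention whose queries are biased by a language embedding."""
    def __init__(self, d_model: int, n_heads: int, n_langs: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.lang_emb = nn.Embedding(n_langs, d_model)

    def forward(self, x: torch.Tensor, lang_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); lang_id: (batch,) target-language indices
        lang = self.lang_emb(lang_id).unsqueeze(1)   # (batch, 1, d_model)
        q = x + lang                                 # language-conditioned queries
        out, _ = self.attn(q, x, x, need_weights=False)
        return out
```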
arXiv Detail & Related papers (2022-09-04T04:27:17Z) - Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence
Encoders [85.80950708769923]
We probe multilingual sentence encoders for the amount of cross-lingual lexical knowledge stored in their parameters and compare them against the original multilingual language models.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
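A common way to probe for cross-lingual lexical knowledge is bilingual lexicon induction: retrieve each source word's translation by cosine nearest neighbour over word embeddings. The sketch below illustrates that style of probe, assuming the embeddings come from some multilingual encoder; it is not the paper's exact protocol.

```python
import numpy as np

def translation_accuracy(src_emb: np.ndarray, tgt_emb: np.ndarray) -> float:
    """src_emb[i] and tgt_emb[i] embed a gold word-translation pair.
    Score: how often the gold target is the cosine nearest neighbour."""
    s = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    t = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    pred = (s @ t.T).argmax(axis=1)   # nearest target per source word
    return float((pred == np.arange(len(s))).mean())
```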
arXiv Detail & Related papers (2022-04-30T13:23:16Z) - VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively prevents the model from degenerating into predicting masked words conditioned only on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
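The following is a hedged sketch of what plugging a cross-attention module into a Transformer encoder layer can look like: when a parallel sentence is available, each language's states also attend to the other's, so masked-word prediction is not conditioned only on same-language context. The layer wiring is an assumption, not VECO's released code.

```python
import torch
import torch.nn as nn

class CrossLingualEncoderLayer(nn.Module):
    """Transformer encoder layer with an optional cross-attention step."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, other=None):
        # x: (batch, seq, d_model) states of one language;
        # other: states of its parallel sentence, when parallel data exists.
        x = self.norm1(x + self.self_attn(x, x, x, need_weights=False)[0])
        if other is not None:
            x = self.norm2(x + self.cross_attn(x, other, other, need_weights=False)[0])
        return self.norm3(x + self.ffn(x))
```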
arXiv Detail & Related papers (2020-10-30T03:41:38Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language.
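As a sketch of a KL-divergence self-teaching loss of this kind, the model's own predictions on one view serve as soft pseudo-labels for the other view; the temperature and the detach of the teacher logits below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def self_teaching_kl(student_logits: torch.Tensor,
                     teacher_logits: torch.Tensor,
                     temperature: float = 1.0) -> torch.Tensor:
    """KL divergence between teacher and student class distributions.

    teacher_logits: the model's predictions on one view (auto-generated
    soft pseudo-labels); student_logits: its predictions on the other view.
    """
    teacher = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log, teacher, reduction="batchmean")
```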
arXiv Detail & Related papers (2020-09-10T22:42:15Z) - CSTNet: Contrastive Speech Translation Network for Self-Supervised
Speech Representation Learning [11.552745999302905]
More than half of the 7,000 languages in the world are in imminent danger of going extinct.
It is relatively easy to obtain textual translations corresponding to speech.
We construct a convolutional neural network audio encoder capable of extracting linguistic representations from speech.
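For a sense of what such an audio encoder looks like, here is a hedged sketch of a small 1-D convolutional encoder over raw waveforms; the kernel sizes, strides, and channel widths are illustrative, not CSTNet's exact configuration.

```python
import torch
import torch.nn as nn

class ConvAudioEncoder(nn.Module):
    """Stack of strided 1-D convolutions mapping a waveform to frame embeddings."""
    def __init__(self, d_out: int = 256):
        super().__init__()
        layers, in_ch = [], 1
        # Each (channels, kernel, stride) step downsamples the waveform.
        for out_ch, k, s in [(64, 10, 5), (128, 8, 4), (256, 4, 2), (d_out, 4, 2)]:
            layers += [nn.Conv1d(in_ch, out_ch, k, stride=s), nn.GELU()]
            in_ch = out_ch
        self.net = nn.Sequential(*layers)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) -> (batch, frames, d_out)
        return self.net(wav.unsqueeze(1)).transpose(1, 2)

# Example: one second of 16 kHz audio -> a short sequence of frame embeddings.
frames = ConvAudioEncoder()(torch.randn(2, 16000))
```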
arXiv Detail & Related papers (2020-06-04T12:21:48Z) - Bridging Linguistic Typology and Multilingual Machine Translation with
Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis (SVCCA) to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
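SVCCA first projects each view of the representations onto its top singular vectors and then computes canonical correlations between the reduced views. Below is a minimal NumPy sketch; the matrix sizes and the `keep` parameter are illustrative assumptions.

```python
import numpy as np

def svcca(X: np.ndarray, Y: np.ndarray, keep: int = 20) -> np.ndarray:
    """Canonical correlations between two views (rows = languages)."""
    def top_sv(A, k):
        A = A - A.mean(axis=0)
        U, S, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, :k] * S[:k]          # project onto top singular vectors

    def basis(A):
        # Orthonormal basis of the column space (whitening step).
        U, _, _ = np.linalg.svd(A, full_matrices=False)
        return U

    Xr, Yr = top_sv(X, keep), top_sv(Y, keep)
    # Singular values of the product of the whitened bases are the
    # canonical correlations (cosines of the principal angles).
    corr = np.linalg.svd(basis(Xr).T @ basis(Yr), compute_uv=False)
    return np.clip(corr, 0.0, 1.0)

# Example: 100 languages, two representation sources of different widths.
rho = svcca(np.random.randn(100, 50), np.random.randn(100, 60))
print(rho.mean())  # mean canonical correlation between the two views
```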
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.