Do Explicit Alignments Robustly Improve Multilingual Encoders?
- URL: http://arxiv.org/abs/2010.02537v1
- Date: Tue, 6 Oct 2020 07:43:17 GMT
- Title: Do Explicit Alignments Robustly Improve Multilingual Encoders?
- Authors: Shijie Wu, Mark Dredze
- Abstract summary: multilingual encoders can effectively learn cross-lingual representation.
Explicit alignment objectives based on bitexts like Europarl or MultiUN have been shown to further improve these representations.
We propose a new contrastive alignment objective that can better utilize such signal.
- Score: 22.954688396858085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual BERT (mBERT), XLM-RoBERTa (XLMR) and other unsupervised
multilingual encoders can effectively learn cross-lingual representation.
Explicit alignment objectives based on bitexts like Europarl or MultiUN have
been shown to further improve these representations. However, word-level
alignments are often suboptimal and such bitexts are unavailable for many
languages. In this paper, we propose a new contrastive alignment objective that
can better utilize such signal, and examine whether these previous alignment
methods can be adapted to noisier sources of aligned data: a randomly sampled 1
million pair subset of the OPUS collection. Additionally, rather than report
results on a single dataset with a single model run, we report the mean and
standard derivation of multiple runs with different seeds, on four datasets and
tasks. Our more extensive analysis finds that, while our new objective
outperforms previous work, overall these methods do not improve performance
with a more robust evaluation framework. Furthermore, the gains from using a
better underlying model eclipse any benefits from alignment training. These
negative results dictate more care in evaluating these methods and suggest
limitations in applying explicit alignment objectives.
Related papers
- SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization [8.971234046933349]
Cross-lingual summarization (XLS) generates summaries in a language different from that of the input documents.
We propose revisiting the summarize-and-translate pipeline, where the summarization and translation tasks are performed in a sequence.
This approach allows reusing the many, publicly-available resources for monolingual summarization and translation, obtaining a very competitive zero-shot performance.
arXiv Detail & Related papers (2024-03-20T02:04:42Z) - Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z) - Multilingual Sentence Transformer as A Multilingual Word Aligner [15.689680887384847]
We investigate whether multilingual sentence Transformer LaBSE is a strong multilingual word aligner.
Experiment results on seven language pairs show that our best aligner outperforms previous state-of-the-art models of all varieties.
Our aligner supports different language pairs in a single model, and even achieves new state-of-the-art on zero-shot language pairs that does not appear in the finetuning process.
arXiv Detail & Related papers (2023-01-28T09:28:55Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z) - Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for Multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z) - Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs but fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing
arXiv Detail & Related papers (2021-01-20T17:54:47Z) - Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR)
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.