Using Optimal Transport as Alignment Objective for fine-tuning
Multilingual Contextualized Embeddings
- URL: http://arxiv.org/abs/2110.02887v1
- Date: Wed, 6 Oct 2021 16:13:45 GMT
- Title: Using Optimal Transport as Alignment Objective for fine-tuning
Multilingual Contextualized Embeddings
- Authors: Sawsan Alqahtani, Garima Lalwani, Yi Zhang, Salvatore Romeo, Saab
Mansour
- Abstract summary: We propose using Optimal Transport (OT) as an alignment objective during fine-tuning to improve multilingual contextualized representations.
This approach does not require word-alignment pairs prior to fine-tuning and instead learns the word alignments within context in an unsupervised manner.
- Score: 7.026476782041066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have proposed different methods to improve multilingual word
representations in contextualized settings including techniques that align
between source and target embedding spaces. For contextualized embeddings,
alignment becomes more complex as we additionally take context into
consideration. In this work, we propose using Optimal Transport (OT) as an
alignment objective during fine-tuning to further improve multilingual
contextualized representations for downstream cross-lingual transfer. This
approach does not require word-alignment pairs prior to fine-tuning, which may
lead to sub-optimal matching, and instead learns the word alignments within
context in an unsupervised manner. It also allows different types of mappings
due to soft matching between source and target sentences. We benchmark our
proposed method on two tasks (XNLI and XQuAD) and achieve improvements over
baselines as well as competitive results compared to similar recent works.
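As a rough illustration of the idea (a minimal sketch, not the authors' released implementation), the snippet below computes an entropy-regularized Optimal Transport loss between the contextual token embeddings of a source/target sentence pair. The transport plan it produces is a soft, unsupervised word alignment of the kind the abstract describes; the Sinkhorn solver, cosine cost, uniform marginals, and all function names are assumptions made here for illustration.

```python
# Hypothetical sketch of an OT alignment loss for fine-tuning; not the paper's code.
import torch
import torch.nn.functional as F

def sinkhorn(cost, a, b, eps=0.1, n_iters=50):
    """Entropy-regularized OT: soft transport plan for a pairwise cost matrix.

    cost: (n, m) cost between source and target token embeddings
    a, b: (n,), (m,) marginal weights (e.g., uniform over non-pad tokens)
    """
    K = torch.exp(-cost / eps)                  # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                    # Sinkhorn-Knopp iterations
        v = b / (K.t() @ u + 1e-9)
        u = a / (K @ v + 1e-9)
    return u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan P, shape (n, m)

def ot_alignment_loss(src_emb, tgt_emb):
    """OT distance between contextual embeddings of a translation pair.

    src_emb: (n, d) source-sentence token embeddings
    tgt_emb: (m, d) target-sentence token embeddings
    The plan P acts as a soft, in-context word alignment.
    """
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    cost = 1.0 - src @ tgt.t()                  # cosine distance matrix
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n, device=cost.device)
    b = torch.full((m,), 1.0 / m, device=cost.device)
    P = sinkhorn(cost, a, b)
    return (P * cost).sum()                     # <P, C>: OT alignment cost
```

During fine-tuning, such a term would typically be added to the downstream task loss, e.g. `loss = task_loss + lam * ot_alignment_loss(src_emb, tgt_emb)`, where `lam` is a hypothetical weighting hyperparameter.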
Related papers
- How Transliterations Improve Crosslingual Alignment [48.929677368744606]
Recent studies have shown that post-aligning multilingual pretrained language models (mPLMs) using alignment objectives can improve crosslingual alignment.
This paper attempts to explicitly evaluate the crosslingual alignment and identify the key elements in transliteration-based approaches that contribute to better performance.
arXiv Detail & Related papers (2024-09-25T20:05:45Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment [63.0407314271459]
Experiments show that the proposed Cross-Align achieves state-of-the-art (SOTA) performance on four out of five language pairs.
arXiv Detail & Related papers (2022-10-09T02:24:35Z)
- Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs but fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing methods to effectively extract alignments from these fine-tuned models.
arXiv Detail & Related papers (2021-01-20T17:54:47Z)
- Subword Sampling for Low Resource Word Alignment [4.663577299263155]
We propose subword sampling-based alignment of text units.
We show that the subword sampling method consistently outperforms word-level alignment on six language pairs.
arXiv Detail & Related papers (2020-12-21T19:47:04Z)
- Unsupervised Word Translation Pairing using Refinement based Point Set Registration [8.568050813210823]
Cross-lingual alignment of word embeddings plays an important role in knowledge transfer across languages.
Current unsupervised approaches rely on similarities in geometric structure of word embedding spaces across languages.
This paper proposes BioSpere, a novel framework for unsupervised mapping of bi-lingual word embeddings onto a shared vector space.
arXiv Detail & Related papers (2020-11-26T09:51:29Z)
- Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study [2.101267270902429]
We analyse how different forms of cross-lingual supervision and various alignment methods influence the transfer capability of mBERT in the zero-shot setting.
We find that supervision from parallel corpora is generally superior to dictionary alignments.
arXiv Detail & Related papers (2020-09-29T20:56:57Z)
- Multilingual Alignment of Contextual Word Representations [49.42244463346612]
The aligned BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model.
We introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer.
These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
arXiv Detail & Related papers (2020-02-10T03:27:21Z)
- Robust Cross-lingual Embeddings from Parallel Sentences [65.85468628136927]
We propose a bilingual extension of the CBOW method which leverages sentence-aligned corpora to obtain robust cross-lingual word representations.
Our approach significantly improves cross-lingual sentence retrieval performance over all other approaches.
It also achieves parity with a deep RNN method on a zero-shot cross-lingual document classification task.
arXiv Detail & Related papers (2019-12-28T16:18:33Z)