Combining Static Word Embeddings and Contextual Representations for
Bilingual Lexicon Induction
- URL: http://arxiv.org/abs/2106.03084v2
- Date: Thu, 10 Jun 2021 06:48:05 GMT
- Title: Combining Static Word Embeddings and Contextual Representations for
Bilingual Lexicon Induction
- Authors: Jinpeng Zhang, Baijun Ji, Nini Xiao, Xiangyu Duan, Min Zhang, Yangbin
Shi, Weihua Luo
- Abstract summary: We propose a simple yet effective mechanism to combine the static word embeddings and the contextual representations.
We test the combination mechanism on various language pairs under the supervised and unsupervised BLI benchmark settings.
- Score: 19.375597786174197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilingual Lexicon Induction (BLI) aims to map words in one language to their
translations in another, typically by learning linear projections that align
monolingual word representation spaces. Two classes of word representations have
been explored for BLI: static word embeddings and contextual representations, but
no prior study has combined the two. In this paper, we propose a simple yet
effective mechanism to combine static word embeddings and contextual
representations so as to exploit the advantages of both paradigms. We test the
combination mechanism on various language pairs under the supervised and
unsupervised BLI benchmark settings. Experiments show that our mechanism
consistently improves performance over robust BLI baselines on all language
pairs, with average gains of 3.2 points in the supervised setting and 3.1 points
in the unsupervised setting.
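As a rough illustration of the general setup the abstract describes (not the authors' specific combination mechanism), the sketch below concatenates L2-normalized static and contextual vectors for each word, then learns an orthogonal linear projection between the two languages with the standard Procrustes solution over a seed dictionary. The combination step, the toy data, and all names in the snippet are illustrative assumptions.

```python
import numpy as np

def l2_normalize(m):
    # Row-wise L2 normalization so cosine similarity becomes a dot product.
    return m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-12)

def combine(static_emb, contextual_emb):
    # Illustrative combination: normalize each view and concatenate.
    # (The paper's actual combination mechanism may differ.)
    return np.hstack([l2_normalize(static_emb), l2_normalize(contextual_emb)])

def procrustes(x_src, y_tgt):
    # Orthogonal W minimizing ||x_src @ W - y_tgt||_F over seed translation pairs.
    u, _, vt = np.linalg.svd(x_src.T @ y_tgt)
    return u @ vt

# Toy random vectors standing in for real embeddings of a seed dictionary.
rng = np.random.default_rng(0)
n_pairs, d_static, d_ctx = 1000, 300, 768
src = combine(rng.normal(size=(n_pairs, d_static)),
              rng.normal(size=(n_pairs, d_ctx)))
tgt = combine(rng.normal(size=(n_pairs, d_static)),
              rng.normal(size=(n_pairs, d_ctx)))

W = procrustes(src, tgt)                    # learned linear projection
scores = l2_normalize(src @ W) @ l2_normalize(tgt).T
predictions = scores.argmax(axis=1)         # nearest-neighbour retrieval
```

In practice the seed pairs would come from a bilingual dictionary and retrieval would typically use CSLS rather than plain nearest neighbours; both are omitted here for brevity.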
Related papers
- On Bilingual Lexicon Induction with Large Language Models [81.6546357879259]
We examine the potential of the latest generation of Large Language Models for the development of bilingual lexicons.
We study 1) zero-shot prompting for unsupervised BLI and 2) few-shot in-context prompting with a set of seed translation pairs.
Our work is the first to demonstrate strong BLI capabilities of text-to-text mLLMs.
arXiv Detail & Related papers (2023-10-21T12:43:27Z)
- Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment [27.405171616881322]
We develop a novel UWT method dubbed Word Alignment using Language-Image Pretraining (WALIP).
WALIP uses visual observations via the shared embedding space of images and texts provided by CLIP models.
Our experiments show that WALIP improves upon the state-of-the-art performance of bilingual word alignment for a few language pairs.
arXiv Detail & Related papers (2022-05-23T20:29:26Z)
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual sentence encoders for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- Improving Word Translation via Two-Stage Contrastive Learning [46.71404992627519]
We propose a robust and effective two-stage contrastive learning framework for the BLI task.
Comprehensive experiments on standard BLI datasets for diverse languages show substantial gains achieved by our framework.
arXiv Detail & Related papers (2022-03-15T22:51:22Z)
- Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs but fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing methods to effectively extract alignments from these fine-tuned models.
arXiv Detail & Related papers (2021-01-20T17:54:47Z)
- Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment [49.3253280592705]
We show it is possible to produce much higher quality lexicons with methods that combine bitext mining and unsupervised word alignment.
Our final model outperforms the state of the art on the BUCC 2020 shared task by 14 $F_1$ points averaged over 12 language pairs.
arXiv Detail & Related papers (2021-01-01T03:12:42Z)
- Beyond Offline Mapping: Learning Cross Lingual Word Embeddings through Context Anchoring [41.77270308094212]
We propose an alternative to offline mapping approaches for aligning word embeddings of languages other than English.
Rather than aligning two fixed embedding spaces, our method works by fixing the target language embeddings, and learning a new set of embeddings for the source language that are aligned with them.
Our approach outperforms conventional mapping methods on bilingual lexicon induction, and obtains competitive results in the downstream XNLI task.
arXiv Detail & Related papers (2020-12-31T17:10:14Z)
- On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z)
- Multilingual Alignment of Contextual Word Representations [49.42244463346612]
After the proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model.
We introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer.
These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
arXiv Detail & Related papers (2020-02-10T03:27:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.