A Discriminative Latent-Variable Model for Bilingual Lexicon Induction
- URL: http://arxiv.org/abs/1808.09334v3
- Date: Sun, 10 Mar 2024 18:35:27 GMT
- Title: A Discriminative Latent-Variable Model for Bilingual Lexicon Induction
- Authors: Sebastian Ruder, Ryan Cotterell, Yova Kementchedjhieva, Anders Søgaard
- Abstract summary: We introduce a novel discriminative latent variable model for bilingual lexicon induction.
Our model combines the bipartite matching dictionary prior of Haghighi et al. (2008) with a representation-based approach.
- Score: 100.76471407472599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel discriminative latent variable model for bilingual
lexicon induction. Our model combines the bipartite matching dictionary prior
of Haghighi et al. (2008) with a representation-based approach (Artetxe et al.,
2017). To train the model, we derive an efficient Viterbi EM algorithm. We
provide empirical results on six language pairs under two metrics and show that
the prior improves the induced bilingual lexicons. We also demonstrate how
previous work may be viewed as a similarly fashioned latent-variable model,
albeit with a different prior.
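The abstract's Viterbi EM can be pictured as alternating a hard matching step with a mapping update. Below is a minimal sketch under illustrative assumptions: pre-trained, length-normalized monolingual embeddings `X_src` and `Y_tgt`, a one-to-one matching solved with `scipy.optimize.linear_sum_assignment` standing in for the bipartite matching dictionary prior, and an orthogonal Procrustes update as the representation-based M-step. This is not the authors' implementation.

```python
# Minimal sketch of Viterbi (hard) EM for bilingual lexicon induction with a
# bipartite matching prior. X_src and Y_tgt are row-normalized monolingual
# word embeddings; names and library choices are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def viterbi_em(X_src, Y_tgt, n_iter=5):
    d = X_src.shape[1]
    W = np.eye(d)  # orthogonal map from source to target space
    for _ in range(n_iter):
        # E-step (Viterbi): one-to-one matching that maximizes total cosine
        # similarity, i.e. the bipartite matching dictionary prior.
        scores = (X_src @ W) @ Y_tgt.T
        src_idx, tgt_idx = linear_sum_assignment(-scores)  # negate to maximize
        # M-step: orthogonal Procrustes fit to the matched pairs.
        M = X_src[src_idx].T @ Y_tgt[tgt_idx]
        U, _, Vt = np.linalg.svd(M)
        W = U @ Vt
    return W, list(zip(src_idx, tgt_idx))

# Example with random embeddings standing in for real ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16)); X /= np.linalg.norm(X, axis=1, keepdims=True)
Y = rng.normal(size=(60, 16)); Y /= np.linalg.norm(Y, axis=1, keepdims=True)
W, induced_dictionary = viterbi_em(X, Y)
```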
Related papers
- VECO 2.0: Cross-lingual Language Model Pre-training with
Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of parallel pairs and to minimize that of non-parallel pairs.
A token-to-token alignment is integrated to draw together synonymous tokens, mined via a thesaurus dictionary, and to separate them from the remaining unpaired tokens in a bilingual instance.
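A minimal sketch of the sequence-level alignment just described, assuming pooled embeddings of parallel sentence pairs and an in-batch-negatives InfoNCE form; VECO 2.0's actual loss and pooling may differ.

```python
# Sketch of a sequence-level contrastive loss over parallel pairs: the i-th
# source sentence should match the i-th target sentence and not the others.
# The InfoNCE / in-batch-negatives form is an assumption about the objective.
import torch
import torch.nn.functional as F

def seq_contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    # src_emb, tgt_emb: (batch, dim) pooled embeddings of parallel sentence pairs
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature      # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

loss = seq_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```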
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
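For the bitext-mining evaluation mentioned above, a minimal sketch of the retrieval step, assuming plain cosine nearest-neighbor search over sentence embeddings; the paper's generative model and any margin-based scoring are not reproduced here.

```python
# Sketch of a bitext-mining style evaluation: for each source sentence,
# retrieve the nearest target sentence by cosine similarity. The embeddings
# here are random placeholders for a real multilingual encoder.
import numpy as np

def mine_bitext(src_emb, tgt_emb):
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return (src @ tgt.T).argmax(axis=1)  # index of the best target per source

pred = mine_bitext(np.random.randn(100, 64), np.random.randn(100, 64))
accuracy = (pred == np.arange(100)).mean()  # gold alignment is the identity here
```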
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Entity-Assisted Language Models for Identifying Check-worthy Sentences [23.792877053142636]
We propose a new uniform framework for text classification and ranking.
Our framework combines the semantic analysis of the sentences, with additional entity embeddings obtained through the identified entities within the sentences.
We extensively evaluate the effectiveness of our framework using two publicly available datasets from the CLEF 2019 & 2020 CheckThat! Labs.
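A minimal sketch of combining a sentence representation with embeddings of its identified entities, assuming a concatenate-then-linear classifier; the framework's actual architecture may differ.

```python
# Sketch: classify a sentence as check-worthy from its sentence embedding
# concatenated with the mean embedding of entities identified in it.
# The concatenate-then-linear design is an illustrative assumption.
import torch
import torch.nn as nn

class EntityAssistedClassifier(nn.Module):
    def __init__(self, sent_dim=768, ent_dim=100, n_classes=2):
        super().__init__()
        self.ent_dim = ent_dim
        self.classifier = nn.Linear(sent_dim + ent_dim, n_classes)

    def forward(self, sent_emb, entity_embs):
        # entity_embs: (n_entities, ent_dim) vectors of entities found in the
        # sentence; fall back to zeros when no entity was identified.
        ent = entity_embs.mean(dim=0) if entity_embs.numel() else torch.zeros(self.ent_dim)
        return self.classifier(torch.cat([sent_emb, ent]))

model = EntityAssistedClassifier()
logits = model(torch.randn(768), torch.randn(3, 100))  # 2-way check-worthiness scores
```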
arXiv Detail & Related papers (2022-11-19T12:03:30Z)
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that performance suffers because these representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
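A minimal sketch of that lexical-level initialization, assuming a shared subword vocabulary and a lookup table of pre-trained type-level cross-lingual subword vectors; the variable names and the copy-into-embedding step are illustrative, not the paper's code.

```python
# Sketch: seed a masked LM's subword embedding table with pre-trained,
# type-level cross-lingual subword embeddings so the two languages start out
# lexically aligned. The vocabulary and vectors below are stand-ins.
import torch
import torch.nn as nn

vocab = ["_the", "_das", "_house", "_haus"]           # shared subword vocabulary
cross_lingual = {w: torch.randn(128) for w in vocab}  # stand-in for real vectors

embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=128)
with torch.no_grad():
    for idx, subword in enumerate(vocab):
        embedding.weight[idx] = cross_lingual[subword]
# `embedding` would then be plugged into the bilingual masked LM before pretraining.
```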
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
- Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings [41.148892848434585]
We propose a novel framework to align contextual embeddings at the sense level by leveraging cross-lingual signal from bilingual dictionaries only.
We operationalize our framework by first proposing a novel sense-aware cross entropy loss to model word senses explicitly.
We then propose a sense alignment objective on top of the sense-aware cross entropy loss for cross-lingual model pretraining, and pretrain cross-lingual models for several language pairs.
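A minimal sketch of one way to read the sense-aware cross entropy, assuming the model predicts a sense label from a bilingual-dictionary sense inventory alongside the word, with the two terms summed; the paper's exact objective may differ.

```python
# Sketch of a sense-aware cross entropy: predict the masked word and its sense
# from the same hidden state and add the two CE terms. The two-head design and
# equal weighting are assumptions, not the paper's exact loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden_dim, vocab_size, n_senses = 256, 1000, 50
word_head = nn.Linear(hidden_dim, vocab_size)
sense_head = nn.Linear(hidden_dim, n_senses)

def sense_aware_ce(hidden, word_ids, sense_ids):
    # hidden: (batch, hidden_dim); word_ids/sense_ids: (batch,) gold labels,
    # with senses drawn from a bilingual-dictionary sense inventory.
    return (F.cross_entropy(word_head(hidden), word_ids)
            + F.cross_entropy(sense_head(hidden), sense_ids))

loss = sense_aware_ce(torch.randn(4, 256), torch.randint(0, 1000, (4,)),
                      torch.randint(0, 50, (4,)))
```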
arXiv Detail & Related papers (2021-03-11T04:55:35Z)
- Morphologically Aware Word-Level Translation [82.59379608647147]
We propose a novel morphologically aware probability model for bilingual lexicon induction.
Our model exploits the basic linguistic intuition that the lexeme is the key lexical unit of meaning.
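A minimal sketch of that lexeme-centric intuition, assuming a factorization of the word-level translation probability into a lexeme-translation term and an inflection term; the lemmatizer and probability tables are toy stand-ins, not the paper's model.

```python
# Sketch of the lexeme-centric intuition: p(target word | source word) is
# factored into a lexeme-translation term and an inflection term.
# The toy lemmatizer and probability tables are illustrative assumptions only.
def p_translation(src_word, tgt_word, lemmatize, p_lexeme, p_inflect):
    src_lex, _ = lemmatize(src_word)
    tgt_lex, tgt_tag = lemmatize(tgt_word)
    return p_lexeme.get((src_lex, tgt_lex), 0.0) * p_inflect.get((tgt_lex, tgt_tag), 0.0)

# Toy example: German "Häuser" (plural of "Haus") -> English "houses".
lemmatize = lambda w: {"Häuser": ("Haus", "PL"), "houses": ("house", "PL")}[w]
p_lexeme = {("Haus", "house"): 0.9}
p_inflect = {("house", "PL"): 0.5}
print(p_translation("Häuser", "houses", lemmatize, p_lexeme, p_inflect))  # 0.45
```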
arXiv Detail & Related papers (2020-11-15T17:54:49Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
- In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology [0.0]
We employ three different types of language embedding: dense, sigmoid, and straight-through.
We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages.
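A minimal sketch of the straight-through variant, assuming binary language features obtained by thresholding a sigmoid in the forward pass while letting gradients flow through the sigmoid; the paper's exact parameterization may differ.

```python
# Sketch of a straight-through binary language embedding: the forward pass uses
# hard 0/1 features, while gradients flow through the underlying sigmoid.
# Only the estimator itself is shown; the parameterization is illustrative.
import torch
import torch.nn as nn

class StraightThroughLanguageEmbedding(nn.Module):
    def __init__(self, n_languages, dim):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_languages, dim))

    def forward(self, lang_ids):
        probs = torch.sigmoid(self.logits[lang_ids])
        hard = (probs > 0.5).float()
        # Straight-through: hard values forward, sigmoid gradient backward.
        return hard + probs - probs.detach()

emb = StraightThroughLanguageEmbedding(n_languages=12, dim=32)
features = emb(torch.tensor([0, 3, 7]))  # (3, 32) binary-valued language features
```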
arXiv Detail & Related papers (2020-05-27T18:10:46Z)
- Overestimation of Syntactic Representation in Neural Language Models [16.765097098482286]
One popular method for assessing a model's ability to induce syntactic structure trains the model on strings generated from a template, then tests whether it can distinguish such strings from superficially similar strings with different syntax.
We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models.
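A minimal sketch of that setup and the concern it raises, assuming a toy subject-verb-object template, order-scrambled negatives, and a purely lexical bigram baseline; the templates and baselines in the paper differ, but the point that a non-syntactic model can pass the test carries over.

```python
# Sketch of the probing setup and the worry it raises: strings come from a
# template, negatives are order-perturbed variants, and even a surface bigram
# baseline with no notion of syntax can separate the two classes.
import random

random.seed(0)
SUBJ, VERB, OBJ = ["the cat", "a dog"], ["sees", "chases"], ["the ball", "a bird"]

def positive():                    # template: SUBJ VERB OBJ
    return f"{random.choice(SUBJ)} {random.choice(VERB)} {random.choice(OBJ)}"

def negative():                    # superficially similar, scrambled order
    words = positive().split()
    random.shuffle(words)
    return " ".join(words)

def bigram_score(sentence, good_bigrams):
    words = sentence.split()
    return sum((a, b) in good_bigrams for a, b in zip(words, words[1:]))

# "Train" the non-syntactic baseline: just memorize bigrams seen in positives.
good = {bg for _ in range(200) for s in [positive()]
        for bg in zip(s.split(), s.split()[1:])}
# Positives score high, scrambled negatives low, without inducing any syntax.
print(bigram_score(positive(), good), bigram_score(negative(), good))
```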
arXiv Detail & Related papers (2020-04-10T15:13:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.