Leveraging Multi-lingual Positive Instances in Contrastive Learning to Improve Sentence Embedding
- URL: http://arxiv.org/abs/2309.08929v2
- Date: Wed, 31 Jan 2024 14:25:15 GMT
- Title: Leveraging Multi-lingual Positive Instances in Contrastive Learning to Improve Sentence Embedding
- Authors: Kaiyan Zhao, Qiyu Wu, Xin-Qiang Cai, Yoshimasa Tsuruoka
- Abstract summary: We argue that leveraging multiple positives should be considered for multi-lingual sentence embeddings.
We propose a novel approach, named MPCL, to effectively utilize multiple positive instances to improve the learning of multi-lingual sentence embeddings.
- Score: 17.12010497289781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning multi-lingual sentence embeddings is a fundamental task in natural
language processing. Recent trends in learning both mono-lingual and
multi-lingual sentence embeddings are mainly based on contrastive learning (CL)
among an anchor, one positive, and multiple negative instances. In this work,
we argue that leveraging multiple positives should be considered for
multi-lingual sentence embeddings because (1) positives in a diverse set of
languages can benefit cross-lingual learning, and (2) transitive similarity
across multiple positives can provide reliable structural information for
learning. In order to investigate the impact of multiple positives in CL, we
propose a novel approach, named MPCL, to effectively utilize multiple positive
instances to improve the learning of multi-lingual sentence embeddings.
Experimental results on various backbone models and downstream tasks
demonstrate that MPCL leads to better retrieval, semantic similarity, and
classification performance than conventional CL. We also observe that
in unseen languages, sentence embedding models trained on multiple positives
show better cross-lingual transfer performance than models trained on a single
positive instance.
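As a concrete illustration of the core idea, here is a minimal sketch of a contrastive loss with multiple positives per anchor, assuming in-batch negatives; the function name, tensor shapes, and temperature are illustrative assumptions, not the authors' released MPCL implementation.

```python
import torch
import torch.nn.functional as F

def multi_positive_contrastive_loss(anchor, positives, temperature=0.05):
    """Contrastive loss with P positive translations per anchor.

    anchor:    (B, D) anchor sentence embeddings
    positives: (B, P, D) multi-lingual positives for each anchor
    Positives belonging to other anchors act as in-batch negatives.
    """
    B, P, D = positives.shape
    a = F.normalize(anchor, dim=-1)                       # (B, D)
    p = F.normalize(positives, dim=-1).reshape(B * P, D)  # (B*P, D)

    # Cosine similarity of every anchor to every positive in the batch.
    logits = a @ p.T / temperature                        # (B, B*P)

    # Columns i*P ... (i+1)*P - 1 are the true positives of anchor i.
    mask = torch.zeros_like(logits)
    for i in range(B):
        mask[i, i * P:(i + 1) * P] = 1.0

    # InfoNCE averaged over the P positives of each anchor.
    log_prob = F.log_softmax(logits, dim=-1)
    return -(mask * log_prob).sum(dim=-1).div(P).mean()
```

With P = 1 this reduces to the conventional anchor / one-positive / many-negatives objective that the abstract compares against.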
Related papers
- Improving Multi-lingual Alignment Through Soft Contrastive Learning [9.454626745893798]
We propose a novel method to align multi-lingual embeddings based on the similarity of sentences measured by a pre-trained mono-lingual embedding model.
Given translation sentence pairs, we train a multi-lingual model so that the similarity between cross-lingual embeddings follows the sentence similarity measured by the mono-lingual teacher model (a rough sketch follows below).
arXiv Detail & Related papers (2024-05-25T09:46:07Z)
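A minimal sketch of such a teacher-guided soft contrastive objective, assuming the frozen mono-lingual teacher supplies a similarity matrix over the source-side sentences of each batch; this reflects a reading of the summary, not the paper's code.

```python
import torch
import torch.nn.functional as F

def soft_alignment_loss(student_src, student_tgt, teacher_sim, temperature=0.05):
    """Match cross-lingual student similarities to mono-lingual teacher ones.

    student_src: (B, D) source-sentence embeddings from the student
    student_tgt: (B, D) embeddings of their translations from the student
    teacher_sim: (B, B) pairwise similarities among the source sentences,
                 computed by a frozen pre-trained mono-lingual teacher
    """
    s = F.normalize(student_src, dim=-1)
    t = F.normalize(student_tgt, dim=-1)
    student_logits = s @ t.T / temperature   # cross-lingual similarities

    # Soft targets: one distribution per row from the teacher similarities.
    targets = F.softmax(teacher_sim / temperature, dim=-1)

    # KL divergence pulls the student's similarity structure toward the teacher's.
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    targets, reduction="batchmean")
```

Unlike hard contrastive learning, off-diagonal pairs here receive graded rather than uniformly negative targets.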
- Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment [42.624862172666624]
We propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences.
It aligns the internal sentence representations across different languages via multilingual contrastive learning.
Experimental results show that even with less than 0.1‰ of the pre-training tokens, our alignment framework significantly boosts the cross-lingual abilities of generative language models.
arXiv Detail & Related papers (2023-11-14T11:24:08Z)
- VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model, VECO 2.0, based on contrastive learning with multi-granularity alignments.
Specifically, sequence-to-sequence alignment maximizes the similarity of parallel pairs and minimizes that of non-parallel pairs, while token-to-token alignment pulls together synonymous tokens mined via a thesaurus dictionary and contrasts them against the other, unpaired tokens in a bilingual instance (both granularities are sketched below).
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
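Reading the summary, the two granularities could be combined roughly as follows; the `pairs` format for dictionary-mined synonyms, the function names, and the temperatures are assumptions, not the released VECO 2.0 code.

```python
import torch
import torch.nn.functional as F

def sequence_alignment_loss(src, tgt, temperature=0.05):
    """Sequence-to-sequence alignment: each parallel pair is a positive,
    every other in-batch pairing is a negative (standard InfoNCE)."""
    s = F.normalize(src, dim=-1)   # (B, D) source sentence embeddings
    t = F.normalize(tgt, dim=-1)   # (B, D) target sentence embeddings
    logits = s @ t.T / temperature
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)

def token_alignment_loss(src_tokens, tgt_tokens, pairs, temperature=0.05):
    """Token-to-token alignment: pull dictionary-mined synonym pairs
    together against the remaining unpaired tokens of the instance.

    src_tokens: (Ls, D) / tgt_tokens: (Lt, D) contextual token embeddings
    pairs: list of (i, j) indices of synonymous source/target tokens
    """
    s = F.normalize(src_tokens, dim=-1)
    t = F.normalize(tgt_tokens, dim=-1)
    logits = s @ t.T / temperature                  # (Ls, Lt)
    idx_s = torch.tensor([i for i, _ in pairs])
    idx_t = torch.tensor([j for _, j in pairs])
    # For each paired source token, its synonym is the positive class.
    return F.cross_entropy(logits[idx_s], idx_t)

# A multi-granularity objective would sum the two terms, e.g.
# loss = sequence_alignment_loss(s, t) + token_alignment_loss(st, tt, pairs)
```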
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly important task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference; a generic supervised contrastive loss of the kind such frameworks build on is sketched below.
arXiv Detail & Related papers (2022-05-31T05:54:18Z)
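The summary does not spell out the objective, but a supervised contrastive (SupCon-style) loss, in which samples sharing an NLI label are mutual positives, is the standard building block for such a framework; the sketch below shows that generic loss, not MultiSCL itself.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SupCon-style loss: samples with the same label are mutual positives.

    features: (B, D) sentence-pair representations
    labels:   (B,) class ids, e.g. entailment / neutral / contradiction
    """
    z = F.normalize(features, dim=-1)
    sim = z @ z.T / temperature                    # (B, B)

    # Remove self-similarity from numerator and denominator alike.
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))

    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = F.log_softmax(sim, dim=-1)

    # Keep only the positives' log-probabilities, zero elsewhere.
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)

    # Mean log-probability over each sample's positives (skip lonely samples).
    per_sample = -pos_log_prob.sum(-1) / pos_mask.sum(-1).clamp(min=1)
    return per_sample.mean()
```

Having several positives per anchor makes this a natural companion to the multi-positive idea in the main paper above.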
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- Multi-Level Contrastive Learning for Cross-Lingual Alignment [35.33431650608965]
Cross-lingual pre-trained models such as multilingual BERT (mBERT) have achieved strong performance on various cross-lingual downstream NLP tasks.
This paper proposes a multi-level contrastive learning framework to further improve the cross-lingual ability of pre-trained models.
arXiv Detail & Related papers (2022-02-26T07:14:20Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and call each group a representation sprachbund (a toy clustering sketch follows below).
Experiments are conducted on cross-lingual benchmarks, and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
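A toy sketch of the grouping step as described: embed each language, cluster the language vectors, and treat each cluster as a candidate representation sprachbund. The language list, vector source, and cluster count are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

languages = ["en", "de", "fr", "es", "ru", "hi", "zh", "ja"]

# Stand-in for real language representations, e.g. the mean of a multilingual
# encoder's sentence embeddings over a mono-lingual corpus per language.
rng = np.random.default_rng(0)
lang_vecs = rng.standard_normal((len(languages), 768))

# Each cluster of similar language representations forms one candidate
# "representation sprachbund", which would then get its own pre-trained model.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(lang_vecs)
for cluster_id in range(kmeans.n_clusters):
    members = [l for l, c in zip(languages, kmeans.labels_) if c == cluster_id]
    print(f"sprachbund {cluster_id}: {members}")
```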
- On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment [59.995385574274785]
We show that, contrary to previous belief, negative interference also impacts low-resource languages.
We present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference; a generic meta-update of this flavor is sketched below.
arXiv Detail & Related papers (2020-10-06T20:48:58Z)
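The summary does not name the algorithm, so below is one generic first-order meta-learning (Reptile-style) update over per-language tasks, as a plausible shape for such a treatment; the model, batches, and hyperparameters are all illustrative.

```python
import copy
import torch
import torch.nn.functional as F

def reptile_step(model, language_batches, inner_lr=1e-3, meta_lr=0.1, inner_steps=3):
    """One Reptile-style meta-update across per-language tasks.

    language_batches: dict mapping language code -> (inputs, targets) tensors.
    Nudging the shared weights toward each language's adapted weights seeks
    parameters that adapt quickly everywhere, mitigating interference.
    """
    meta_weights = copy.deepcopy(model.state_dict())
    deltas = {k: torch.zeros_like(v, dtype=torch.float32)
              for k, v in meta_weights.items()}

    for lang, (x, y) in language_batches.items():
        model.load_state_dict(meta_weights)        # reset to shared weights
        opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
        for _ in range(inner_steps):               # adapt on this language only
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        for k, v in model.state_dict().items():    # direction toward adapted weights
            deltas[k] += (v - meta_weights[k]).float()

    # Meta-update: move the shared weights toward the average adapted weights.
    n = len(language_batches)
    new_state = {k: meta_weights[k] + meta_lr * deltas[k] / n for k in meta_weights}
    model.load_state_dict(new_state)
```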