Injecting Wiktionary to improve token-level contextual representations using contrastive learning
- URL: http://arxiv.org/abs/2402.07817v1
- Date: Mon, 12 Feb 2024 17:22:42 GMT
- Title: Injecting Wiktionary to improve token-level contextual representations using contrastive learning
- Authors: Anna Mosolova, Marie Candito, Carlos Ramisch
- Abstract summary: We investigate how to inject a lexicon as an alternative source of supervision, using the English Wiktionary.
We also test how dimensionality reduction impacts the resulting contextual word embeddings.
- Score: 2.761009930426063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While static word embeddings are blind to context, for lexical semantics
tasks, context is rather too present in contextual word embeddings: vectors of
same-meaning occurrences end up being too different (Ethayarajh, 2019). Fine-tuning
pre-trained language models (PLMs) using contrastive learning was proposed,
leveraging automatically self-augmented examples (Liu et al., 2021b). In this
paper, we investigate how to inject a lexicon as an alternative source of
supervision, using the English Wiktionary. We also test how dimensionality
reduction impacts the resulting contextual word embeddings. We evaluate our
approach on the Word-In-Context (WiC) task, in the unsupervised setting (not
using the training set). We achieve a new SoTA result on the original WiC test
set. We also propose two new WiC test sets for which we show that our
fine-tuning method achieves substantial improvements. We also observe
improvements, although modest, for the semantic frame induction task. Although
we experimented on English to allow comparison with related work, our method is
adaptable to the many languages for which large Wiktionaries exist.
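The abstract describes the approach only at a high level. As a hedged illustration of the general recipe, the Python sketch below implements a supervised-contrastive loss in which occurrences sharing a Wiktionary sense act as positive pairs; the function name sense_contrastive_loss, the batch construction, and the temperature value are assumptions, not the authors' code.

import torch
import torch.nn.functional as F

def sense_contrastive_loss(embeddings, sense_ids, temperature=0.07):
    # embeddings: (batch, dim) contextual vectors of target-word occurrences
    # sense_ids:  (batch,) integer Wiktionary sense labels for those occurrences
    z = F.normalize(embeddings, dim=-1)                # work in cosine geometry
    sim = (z @ z.t()) / temperature                    # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (sense_ids.unsqueeze(0) == sense_ids.unsqueeze(1)) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))    # never contrast with self
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0                           # anchors with at least one positive
    sum_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(sum_pos[has_pos] / pos_counts[has_pos]).mean()

# Toy batch: 6 occurrences of one target word, labelled with 2 Wiktionary senses.
emb = torch.randn(6, 768, requires_grad=True)
senses = torch.tensor([0, 0, 1, 1, 0, 1])
loss = sense_contrastive_loss(emb, senses)
loss.backward()

In a real run the embeddings would come from a PLM being fine-tuned. For the unsupervised WiC evaluation mentioned above, the two contextual vectors of the target word would then be compared by cosine similarity against a threshold, and a method such as PCA could play the role of the dimensionality reduction step (the abstract does not specify which technique is used).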
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
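A minimal sketch of the two perturbations named in the entry above, typo injection and word order shuffling; the function names and the typo rate are illustrative assumptions, and the paper's image-rendering pipeline is not reproduced here.

import random

def inject_typos(text, rate=0.1, seed=0):
    # Swap adjacent alphabetic characters with probability `rate`.
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def shuffle_word_order(text, seed=0):
    # Randomly permute the words of a sentence.
    rng = random.Random(seed)
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)

sentence = "contrastive sentence representation learning"
print(inject_typos(sentence, rate=0.3))
print(shuffle_word_order(sentence))

In the pixel-based setting, the original and perturbed sentences would presumably be rendered as images and treated as positive pairs for contrastive training; only the text-side perturbation is sketched here.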
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective yields substantial performance improvements and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z)
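The key step summarized in the entry above is corrupting an automatically detected coreference mention in the contextual sentence, so that the corrupted pair can serve as a contrastive negative. A hypothetical sketch of that step (the mention span, the replacement strategy, and the loss form are assumptions, not taken from the paper):

import random
from dataclasses import dataclass

@dataclass
class Example:
    context: str   # preceding (contextual) sentence
    source: str    # source sentence to translate

def corrupt_mention(context, mention, candidates, seed=0):
    # Replace one detected coreference mention with a random distractor phrase.
    rng = random.Random(seed)
    replacement = rng.choice([c for c in candidates if c != mention])
    return context.replace(mention, replacement, 1)

ex = Example(context="The cat sat on the mat.", source="It looked very comfortable.")
# "The cat" in the context corefers with "It" in the source sentence.
negative_context = corrupt_mention(ex.context, "The cat", ["The dog", "A chair", "The book"])
print(negative_context)

# A margin-based contrastive term could then push the model to score the original
# (context, source) pair above the corrupted one, e.g.
#   loss = max(0, margin - score(ex.context, ex.source) + score(negative_context, ex.source))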
- Aligning Cross-lingual Sentence Representations with Dual Momentum Contrast [12.691501386854094]
We propose to align sentence representations from different languages into a unified embedding space, where semantic similarities can be computed with a simple dot product.
As the experimental results show, the sentence representations produced by our model achieve new state-of-the-art results on several tasks.
arXiv Detail & Related papers (2021-09-01T08:48:34Z)
- Denoising Word Embeddings by Averaging in a Shared Space [34.175826109538676]
We introduce a new approach for smoothing and improving the quality of word embeddings.
We project all the models to a shared vector space using an efficient implementation of the Generalized Procrustes Analysis (GPA) procedure.
As the new representations are more stable and reliable, there is a noticeable improvement in rare word evaluations.
arXiv Detail & Related papers (2021-06-05T19:49:02Z)
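The entry above projects several embedding models into a shared space with Generalized Procrustes Analysis (GPA) and averages them. A rotation-only numpy sketch of that idea, offered as an illustration under simplifying assumptions rather than the paper's implementation:

import numpy as np

def orthogonal_procrustes(source, target):
    # Rotation Q minimizing ||source @ Q - target||_F (both n_words x dim).
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

def gpa_average(models, n_iter=10):
    # Iteratively rotate every embedding matrix toward the running mean and
    # re-average; the result is a single denoised shared-space embedding model.
    mean = models[0].copy()
    for _ in range(n_iter):
        aligned = [m @ orthogonal_procrustes(m, mean) for m in models]
        mean = np.mean(aligned, axis=0)
    return mean

def random_rotation(dim, rng):
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q

# Toy usage: three noisy, arbitrarily rotated copies of the same embedding space
# over a shared 1000-word vocabulary.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 50))
models = [base @ random_rotation(50, rng) + 0.05 * rng.normal(size=(1000, 50))
          for _ in range(3)]
print(gpa_average(models).shape)   # (1000, 50)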
- Word2rate: training and evaluating multiple word embeddings as statistical transitions [4.350783459690612]
We introduce a novel left-right context split objective that improves performance for tasks sensitive to word order.
Our Word2rate model is grounded in a statistical foundation using rate matrices while being competitive in a variety of language tasks.
arXiv Detail & Related papers (2021-04-16T15:31:29Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- Beyond Offline Mapping: Learning Cross Lingual Word Embeddings through Context Anchoring [41.77270308094212]
We propose an alternative mapping approach for word embeddings in languages other than English.
Rather than aligning two fixed embedding spaces, our method works by fixing the target language embeddings, and learning a new set of embeddings for the source language that are aligned with them.
Our approach outperforms conventional mapping methods on bilingual lexicon induction, and obtains competitive results in the downstream XNLI task.
arXiv Detail & Related papers (2020-12-31T17:10:14Z)
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)