Unsupervised Attention-based Sentence-Level Meta-Embeddings from
Contextualised Language Models
- URL: http://arxiv.org/abs/2204.07746v1
- Date: Sat, 16 Apr 2022 08:20:24 GMT
- Title: Unsupervised Attention-based Sentence-Level Meta-Embeddings from
Contextualised Language Models
- Authors: Keigo Takahashi and Danushka Bollegala
- Abstract summary: We propose a sentence-level meta-embedding learning method that takes independently trained contextualised word embedding models as input.
Our proposed method is unsupervised and is not tied to a particular downstream task.
Experimental results show that our proposed unsupervised sentence-level meta-embedding method outperforms previously proposed sentence-level meta-embedding methods.
- Score: 15.900069711477542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A variety of contextualised language models, trained on diverse corpora, have been proposed in the NLP community, resulting in numerous Neural Language Models (NLMs). However, different NLMs achieve different levels of performance in downstream NLP applications when used as text
representations. We propose a sentence-level meta-embedding learning method
that takes independently trained contextualised word embedding models and
learns a sentence embedding that preserves the complementary strengths of the
input source NLMs. Our proposed method is unsupervised and is not tied to a
particular downstream task, which makes the learnt meta-embeddings in principle
applicable to different tasks that require sentence representations.
Specifically, we first project the token-level embeddings obtained by the
individual NLMs and learn attention weights that indicate the contributions of
source embeddings towards their token-level meta-embeddings. Next, we apply
mean and max pooling to produce sentence-level meta-embeddings from token-level
meta-embeddings. Experimental results on semantic textual similarity benchmarks
show that our proposed unsupervised sentence-level meta-embedding method
outperforms previously proposed sentence-level meta-embedding methods as well
as a supervised baseline.
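The abstract describes the method only at a high level; the following is a minimal PyTorch sketch of that idea, assuming two source NLMs that tokenise the input into the same token sequence. All class, parameter, and dimension names (e.g. SentenceMetaEmbedder, meta_dim) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of attention-based token-level meta-embedding with
# mean+max pooling, following the abstract. Names and dimensions are
# illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class SentenceMetaEmbedder(nn.Module):
    def __init__(self, source_dims, meta_dim=768):
        super().__init__()
        # One linear projection per source NLM into a common meta space.
        self.projections = nn.ModuleList(
            [nn.Linear(d, meta_dim) for d in source_dims]
        )
        # Scores the contribution of each source at every token position.
        self.attention = nn.Linear(meta_dim, 1)

    def forward(self, source_token_embeddings):
        # source_token_embeddings: list of tensors, one per source NLM,
        # each of shape (batch, seq_len, source_dim) over the same tokens.
        projected = torch.stack(
            [proj(e) for proj, e in zip(self.projections, source_token_embeddings)],
            dim=2,
        )  # (batch, seq_len, n_sources, meta_dim)

        # Attention weights over the sources for each token.
        scores = self.attention(projected).squeeze(-1)       # (batch, seq_len, n_sources)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)

        # Token-level meta-embedding: attention-weighted sum of the sources.
        token_meta = (weights * projected).sum(dim=2)         # (batch, seq_len, meta_dim)

        # Sentence-level meta-embedding: concatenate mean and max pooling.
        mean_pooled = token_meta.mean(dim=1)
        max_pooled, _ = token_meta.max(dim=1)
        return torch.cat([mean_pooled, max_pooled], dim=-1)   # (batch, 2 * meta_dim)


# Example usage with two hypothetical source encoders of different widths.
model = SentenceMetaEmbedder(source_dims=[768, 1024], meta_dim=768)
bert_out = torch.randn(4, 12, 768)      # e.g. BERT-base token embeddings
roberta_out = torch.randn(4, 12, 1024)  # e.g. RoBERTa-large token embeddings
sentence_meta = model([bert_out, roberta_out])
print(sentence_meta.shape)              # torch.Size([4, 1536])
```

Note that concatenating the mean- and max-pooled vectors doubles the dimensionality of the sentence-level meta-embedding relative to the token-level one.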
Related papers
- Meta-Task Prompting Elicits Embeddings from Large Language Models [54.757445048329735]
We introduce a new unsupervised text embedding method, Meta-Task Prompting with Explicit One-Word Limitation.
We generate high-quality sentence embeddings from Large Language Models without the need for model fine-tuning.
Our findings suggest a new scaling law, offering a versatile and resource-efficient approach for embedding generation across diverse scenarios.
arXiv Detail & Related papers (2024-02-28T16:35:52Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive corpora and report superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly essential task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z)
- A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference [28.949004915740776]
We present a multilingual approach for evaluating attribution methods for the Natural Language Inference (NLI) task in terms of faithfulness and plausibility.
First, we introduce a novel cross-lingual strategy to measure faithfulness based on word alignments, which eliminates the drawbacks of erasure-based evaluations.
We then perform a comprehensive evaluation of attribution methods, considering different output mechanisms and aggregation methods.
arXiv Detail & Related papers (2022-04-11T22:11:05Z)
- Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z)
- Meta-Embeddings for Natural Language Inference and Semantic Similarity tasks [0.0]
Word Representations form the core component for almost all advanced Natural Language Processing (NLP) applications.
In this paper, we propose to use Meta-Embeddings derived from a few State-of-the-Art (SOTA) models to efficiently tackle mainstream NLP tasks.
arXiv Detail & Related papers (2020-12-01T16:58:01Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)