LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
- URL: http://arxiv.org/abs/2105.12449v1
- Date: Wed, 26 May 2021 10:14:22 GMT
- Title: LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
- Authors: Daniel Loureiro, Alípio Mário Jorge, Jose Camacho-Collados
- Abstract summary: Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information.
We introduce a more principled approach to leverage information from all layers of NLMs, informed by a probing analysis on 14 NLM variants.
We also emphasize the versatility of these sense embeddings in contrast to task-specific models, applying them on several sense-related tasks, besides WSD.
- Score: 2.9005223064604078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributional semantics based on neural approaches is a cornerstone of
Natural Language Processing, with surprising connections to human meaning
representation as well. Recent Transformer-based Language Models have proven
capable of producing contextual word representations that reliably convey
sense-specific information, simply as a product of self-supervision. Prior work
has shown that these contextual representations can be used to accurately
represent large sense inventories as sense embeddings, to the extent that a
distance-based solution to Word Sense Disambiguation (WSD) tasks outperforms
models trained specifically for the task. Still, there remains much to
understand on how to use these Neural Language Models (NLMs) to produce sense
embeddings that can better harness each NLM's meaning representation abilities.
In this work we introduce a more principled approach to leverage information
from all layers of NLMs, informed by a probing analysis on 14 NLM variants. We
also emphasize the versatility of these sense embeddings in contrast to
task-specific models, applying them on several sense-related tasks, besides
WSD, while demonstrating improved performance using our proposed approach over
prior work focused on sense embeddings. Finally, we discuss unexpected findings
regarding layer and model performance variations, and potential applications
for downstream tasks.
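The two ideas the abstract names, pooling information from all NLM layers and resolving senses by distance to sense embeddings, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the layer weights, the toy 3-dimensional vectors, the sense keys, and the helper names `pool_layers` and `disambiguate` are hypothetical, and the paper's actual weighting scheme is not specified in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pool_layers(layer_vectors, layer_weights):
    """Weighted sum of one token's per-layer vectors.

    The weights are normalised to sum to 1. How such weights would be
    chosen is an assumption here, not the paper's procedure.
    """
    total = sum(layer_weights)
    dim = len(layer_vectors[0])
    return [
        sum((w / total) * vec[i] for w, vec in zip(layer_weights, layer_vectors))
        for i in range(dim)
    ]


def disambiguate(context_vec, sense_embeddings, candidates):
    """Return the candidate sense whose embedding is closest (by cosine)."""
    return max(candidates, key=lambda s: cosine(context_vec, sense_embeddings[s]))


# Toy sense inventory (hypothetical keys and values).
sense_embeddings = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.0, 0.2, 0.9],
}

# Hypothetical hidden states for one token from three layers.
layers = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.1], [0.6, 0.1, 0.3]]
weights = [0.2, 0.3, 0.5]

context_vec = pool_layers(layers, weights)
predicted = disambiguate(context_vec, sense_embeddings, list(sense_embeddings))
print(predicted)  # bank%finance
```

In practice the per-layer vectors would come from an NLM's hidden states and the candidate set would be restricted to the senses listed for the target lemma; both are omitted here to keep the sketch self-contained.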
Related papers
- Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL).
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z)
- Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception [63.03288425612792]
We propose AnyRef, a general MLLM model that can generate pixel-wise object perceptions and natural language descriptions from multi-modality references.
Our model achieves state-of-the-art results across multiple benchmarks, including diverse modality referring segmentation and region-level referring expression generation.
arXiv Detail & Related papers (2024-03-05T13:45:46Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Language Models as Knowledge Bases for Visual Word Sense Disambiguation [1.8591405259852054]
We propose some knowledge-enhancement techniques towards improving the retrieval performance of visiolinguistic (VL) transformers.
More specifically, knowledge stored in Large Language Models (LLMs) is retrieved with the help of appropriate prompts in a zero-shot manner.
Our presented approach is the first one to analyze the merits of exploiting knowledge stored in LLMs in different ways to solve Visual Word Sense Disambiguation.
arXiv Detail & Related papers (2023-10-03T11:11:55Z)
- Dynamic Prompting: A Unified Framework for Prompt Tuning [33.175097465669374]
We present a unified dynamic prompt (DP) tuning strategy that dynamically determines different factors of prompts based on specific tasks and instances.
Experimental results underscore the significant performance improvement achieved by dynamic prompt tuning across a wide range of tasks.
We establish the universal applicability of our approach under full-data, few-shot, and multitask scenarios.
arXiv Detail & Related papers (2023-03-06T06:04:46Z)
- Sense representations for Portuguese: experiments with sense embeddings and deep neural language models [0.0]
Unsupervised sense representations can induce different senses of a word by analyzing its contextual semantics in a text.
We present the first experiments carried out for generating sense embeddings for Portuguese.
arXiv Detail & Related papers (2021-08-31T18:07:01Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- Training Bi-Encoders for Word Sense Disambiguation [4.149972584899897]
State-of-the-art approaches in Word Sense Disambiguation leverage lexical information along with pre-trained embeddings from Transformer-based models to achieve results comparable to human inter-annotator agreement on standard evaluation benchmarks.
We further the state of the art in Word Sense Disambiguation through our multi-stage pre-training and fine-tuning pipeline.
arXiv Detail & Related papers (2021-05-21T06:06:03Z)
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM)-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs [19.398202091883366]
We present an approach based on INNs that (i) recovers the task-specific, learned invariances by disentangling the remaining factor of variation in the data and that (ii) invertibly transforms these invariances combined with the model representation into an equally expressive one with accessible semantic concepts.
Our invertible approach significantly extends the abilities to understand black box models by enabling post-hoc interpretations of state-of-the-art networks without compromising their performance.
arXiv Detail & Related papers (2020-08-04T19:27:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.