Related papers: Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy

Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy

URL: http://arxiv.org/abs/2106.13553v2
Date: Tue, 29 Jun 2021 07:33:40 GMT
Title: Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy
Authors: Marcos Garcia
Abstract summary: We assess the ability of both static and contextualized models to adequately represent different lexical-semantic relations. Experiments are performed in Galician, Portuguese, English, and Spanish.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper presents a multilingual study of word meaning representations in context. We assess the ability of both static and contextualized models to adequately represent different lexical-semantic relations, such as homonymy and synonymy. To do so, we created a new multilingual dataset that allows us to perform a controlled evaluation of several factors such as the impact of the surrounding context or the overlap between words, conveying the same or different senses. A systematic assessment on four scenarios shows that the best monolingual models based on Transformers can adequately disambiguate homonyms in context. However, as they rely heavily on context, these models fail at representing words with different senses when occurring in similar sentences. Experiments are performed in Galician, Portuguese, English, and Spanish, and both the dataset (with more than 3,000 evaluation items) and new models are freely released with this study.

Related papers

CALE : Concept-Aligned Embeddings for Both Within-Lemma and Inter-Lemma Sense Differentiation [0.0]
Lexical semantics is concerned with both the multiple senses a word can adopt in different contexts, and the semantic relations that exist between meanings of different words.<n>To investigate them, Contextualized Language Models are a valuable tool that provides context-sensitive representations.<n>We propose an extension, Concept Differentiation, to include inter-words scenarios.
arXiv Detail & Related papers (2025-08-06T14:43:22Z)
Investigating Idiomaticity in Word Representations [9.208145117062339]
We focus on noun compounds of varying levels of idiomaticity in two languages (English and Portuguese) We present a dataset of minimal pairs containing human idiomaticity judgments for each noun compound at both type and token levels. We define a set of fine-grained metrics of Affinity and Scaled Similarity to determine how sensitive the models are to perturbations that may lead to changes in idiomaticity.
arXiv Detail & Related papers (2024-11-04T21:05:01Z)
Conjuring Semantic Similarity [59.18714889874088]
The semantic similarity between two textual expressions measures the distance between their latent'meaning' We propose a novel approach whereby the semantic similarity among textual expressions is based not on other expressions they can be rephrased as, but rather based on the imagery they evoke. Our method contributes a novel perspective on semantic similarity that not only aligns with human-annotated scores, but also opens up new avenues for the evaluation of text-conditioned generative models.
arXiv Detail & Related papers (2024-10-21T18:51:34Z)
Evaluating Contextualized Representations of (Spanish) Ambiguous Words: A New Lexical Resource and Empirical Analysis [2.2530496464901106]
We evaluate semantic representations of Spanish ambiguous nouns in context in a suite of Spanish-language monolingual and multilingual BERT-based models. We find that various BERT-based LMs' contextualized semantic representations capture some variance in human judgments but fall short of the human benchmark.
arXiv Detail & Related papers (2024-06-20T18:58:11Z)
Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions. This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models [6.760960482418417]
Lexical ambiguity presents a profound and enduring challenge to the language sciences. Our work offers new insight into psychological understanding of lexical ambiguity through a series of simulations.
arXiv Detail & Related papers (2023-04-26T14:47:38Z)
Lost in Context? On the Sense-wise Variance of Contextualized Word Embeddings [11.475144702935568]
We quantify how much the contextualized embeddings of each word sense vary across contexts in typical pre-trained models. We find that word representations are position-biased, where the first words in different contexts tend to be more similar.
arXiv Detail & Related papers (2022-08-20T12:27:25Z)
Visual Comparison of Language Model Adaptation [55.92129223662381]
adapters are lightweight alternatives for model adaptation. In this paper, we discuss several design and alternatives for interactive, comparative visual explanation methods. We show that, for instance, an adapter trained on the language debiasing task according to context-0 embeddings introduces a new type of bias.
arXiv Detail & Related papers (2022-08-17T09:25:28Z)
Patterns of Lexical Ambiguity in Contextualised Language Models [9.747449805791092]
We introduce an extended, human-annotated dataset of graded word sense similarity and co-predication. Both types of human judgements indicate that the similarity of polysemic interpretations falls in a continuum between identity of meaning and homonymy. Our dataset appears to capture a substantial part of the complexity of lexical ambiguity, and can provide a realistic test bed for contextualised embeddings.
arXiv Detail & Related papers (2021-09-27T13:11:44Z)
Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels. We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word. We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages. Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
Probing Contextual Language Models for Common Ground with Visual Representations [76.05769268286038]
We design a probing model that evaluates how effective are text-only representations in distinguishing between matching and non-matching visual representations. Our findings show that language representations alone provide a strong signal for retrieving image patches from the correct object categories. Visually grounded language models slightly outperform text-only language models in instance retrieval, but greatly under-perform humans.
arXiv Detail & Related papers (2020-05-01T21:28:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.