Exploring the Representation of Word Meanings in Context: A Case Study
on Homonymy and Synonymy
- URL: http://arxiv.org/abs/2106.13553v2
- Date: Tue, 29 Jun 2021 07:33:40 GMT
- Authors: Marcos Garcia
- Abstract summary: We assess the ability of both static and contextualized models to adequately represent different lexical-semantic relations.
Experiments are performed in Galician, Portuguese, English, and Spanish.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents a multilingual study of word meaning representations in
context. We assess the ability of both static and contextualized models to
adequately represent different lexical-semantic relations, such as homonymy and
synonymy. To do so, we created a new multilingual dataset that allows us to
perform a controlled evaluation of several factors such as the impact of the
surrounding context or the overlap between words, conveying the same or
different senses. A systematic assessment on four scenarios shows that the best
monolingual models based on Transformers can adequately disambiguate homonyms
in context. However, as they rely heavily on context, these models fail at
representing words with different senses when occurring in similar sentences.
Experiments are performed in Galician, Portuguese, English, and Spanish, and
both the dataset (with more than 3,000 evaluation items) and new models are
freely released with this study.
Related papers
- Bidirectional Transformer Representations of (Spanish) Ambiguous Words in Context: A New Lexical Resource and Empirical Analysis [2.2530496464901106]
Few studies have compared large language models' contextualized word embeddings for languages beyond English.
We evaluate multiple bidirectional transformers' (BERTs') semantic representations of Spanish ambiguous nouns in context.
We find that various BERT-based LLMs' contextualized semantic representations capture some variance in human judgments but fall short of the human benchmark.
arXiv Detail & Related papers (2024-06-20T18:58:11Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models [6.760960482418417]
Lexical ambiguity presents a profound and enduring challenge to the language sciences.
Our work offers new insight into psychological understanding of lexical ambiguity through a series of simulations.
arXiv Detail & Related papers (2023-04-26T14:47:38Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Lost in Context? On the Sense-wise Variance of Contextualized Word Embeddings [11.475144702935568]
We quantify how much the contextualized embeddings of each word sense vary across contexts in typical pre-trained models.
We find that word representations are position-biased, where the first words in different contexts tend to be more similar.
arXiv Detail & Related papers (2022-08-20T12:27:25Z)
- Visual Comparison of Language Model Adaptation [55.92129223662381]
Adapters are lightweight alternatives for model adaptation.
In this paper, we discuss several designs and alternatives for interactive, comparative visual explanation methods.
We show that, for instance, an adapter trained on the language debiasing task according to context-0 embeddings introduces a new type of bias.
arXiv Detail & Related papers (2022-08-17T09:25:28Z)
- Patterns of Lexical Ambiguity in Contextualised Language Models [9.747449805791092]
We introduce an extended, human-annotated dataset of graded word sense similarity and co-predication.
Both types of human judgements indicate that the similarity of polysemic interpretations falls in a continuum between identity of meaning and homonymy.
Our dataset appears to capture a substantial part of the complexity of lexical ambiguity, and can provide a realistic test bed for contextualised embeddings.
arXiv Detail & Related papers (2021-09-27T13:11:44Z)
- Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
- Probing Contextual Language Models for Common Ground with Visual Representations [76.05769268286038]
We design a probing model that evaluates how effective text-only representations are at distinguishing between matching and non-matching visual representations.
Our findings show that language representations alone provide a strong signal for retrieving image patches from the correct object categories.
Visually grounded language models slightly outperform text-only language models in instance retrieval, but greatly under-perform humans.
arXiv Detail & Related papers (2020-05-01T21:28:28Z)
- Sentiment Analysis with Contextual Embeddings and Self-Attention [3.0079490585515343]
In natural language the intended meaning of a word or phrase is often implicit and depends on the context.
We propose a simple yet effective method for sentiment analysis using contextual embeddings and a self-attention mechanism.
The experimental results for three languages, including morphologically rich Polish and German, show that our model is comparable to or even outperforms state-of-the-art models.
arXiv Detail & Related papers (2020-03-12T02:19:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.