Semantics of Multiword Expressions in Transformer-Based Models: A Survey
- URL: http://arxiv.org/abs/2401.15393v1
- Date: Sat, 27 Jan 2024 11:51:11 GMT
- Title: Semantics of Multiword Expressions in Transformer-Based Models: A Survey
- Authors: Filip Miletić, Sabine Schulte im Walde
- Abstract summary: Multiword expressions (MWEs) are composed of multiple words and exhibit variable degrees of compositionality.
We provide the first in-depth survey of MWE processing with transformer models.
We find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information.
- Score: 8.372465442144048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multiword expressions (MWEs) are composed of multiple words and exhibit
variable degrees of compositionality. As such, their meanings are notoriously
difficult to model, and it is unclear to what extent this issue affects
transformer architectures. Addressing this gap, we provide the first in-depth
survey of MWE processing with transformer models. We overall find that they
capture MWE semantics inconsistently, as shown by reliance on surface patterns
and memorized information. MWE meaning is also strongly localized,
predominantly in early layers of the architecture. Representations benefit from
specific linguistic properties, such as lower semantic idiosyncrasy and
ambiguity of target expressions. Our findings overall question the ability of
transformer models to robustly capture fine-grained semantics. Furthermore, we
highlight the need for more directly comparable evaluation setups.
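To make this style of analysis concrete, the following is a minimal, hypothetical sketch of a layer-wise probe: it extracts an MWE's representation from every layer of a BERT-style encoder (via the Hugging Face transformers library) and compares it to the average of its constituent word representations. The model name, example sentence, and target expression are illustrative assumptions, not materials from the survey.

```python
# Minimal layer-wise probing sketch (illustrative, not the survey's exact setup):
# compare each layer's representation of a multiword expression to the average
# of its constituent word representations.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style encoder would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()


def span_vectors_per_layer(sentence: str, span: str):
    """Return one mean-pooled vector per layer for the subtokens covering `span`."""
    enc = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()
    start = sentence.index(span)
    end = start + len(span)
    # Keep subtokens whose character offsets overlap the span (skips special tokens).
    idx = [i for i, (s, e) in enumerate(offsets) if s < end and e > start and e > s]
    with torch.no_grad():
        hidden = model(**enc).hidden_states  # embedding layer + one tensor per layer
    return [layer[0, idx].mean(dim=0) for layer in hidden]


# How close is the phrase vector to the mean of its word vectors, layer by layer?
# A persistent gap can hint at non-compositional (idiomatic) treatment.
sentence = "He finally kicked the bucket after a long illness."
phrase = "kicked the bucket"
phrase_layers = span_vectors_per_layer(sentence, phrase)
word_layers = [span_vectors_per_layer(sentence, w) for w in phrase.split()]
for layer_id, phrase_vec in enumerate(phrase_layers):
    word_mean = torch.stack([w[layer_id] for w in word_layers]).mean(dim=0)
    sim = torch.cosine_similarity(phrase_vec, word_mean, dim=0).item()
    print(f"layer {layer_id:2d}: cos(phrase, mean of words) = {sim:.3f}")
```

Running this over a set of expressions with human compositionality ratings, and correlating the per-layer similarities with those ratings, approximates the kind of layer-wise evidence the survey aggregates.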
Related papers
- Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective [50.261681681643076]
We propose a novel metric called SemVarEffect and a benchmark named SemVarBench to evaluate the causality between semantic variations in inputs and outputs in text-to-image synthesis.
Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.
arXiv Detail & Related papers (2024-10-14T08:45:35Z)
- Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically [74.96551626420188]
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures.
We investigate sources of inductive bias in transformer models and their training that could cause such generalization behavior to emerge.
arXiv Detail & Related papers (2024-04-25T07:10:29Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Transformer-based Detection of Multiword Expressions in Flower and Plant Names [9.281156301926769]
A multiword expression (MWE) is a sequence of words that collectively presents a meaning not derived from its individual words.
In this paper, we explore state-of-the-art neural transformers in the task of detecting MWEs in flower and plant names.
arXiv Detail & Related papers (2022-09-16T15:59:55Z)
- BERT(s) to Detect Multiword Expressions [9.710464466895521]
Multiword expressions (MWEs) are groups of words in which the meaning of the whole is not derived from the meaning of its parts.
In this paper, we explore state-of-the-art neural transformers in the task of detecting MWEs; a token-classification sketch of a typical setup follows this entry.
arXiv Detail & Related papers (2022-08-16T16:32:23Z)
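The two detection papers above apply pretrained transformers to MWE identification. A common way to frame that task, assumed here for illustration rather than taken from either paper, is token classification with BIO-style tags; the label set, model name, and example sentence below are hypothetical.

```python
# Hypothetical sketch: MWE detection framed as token classification with
# BIO-style labels over a pretrained transformer encoder.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

LABELS = ["O", "B-MWE", "I-MWE"]  # hypothetical tag set
MODEL_NAME = "bert-base-uncased"  # assumption, not either paper's exact model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))
model.eval()  # the classification head is untrained here, so outputs are illustrative


def tag_mwes(sentence: str):
    """Predict one BIO tag per word (first subtoken of each word decides)."""
    words = sentence.split()
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits[0]  # [num_subtokens, num_labels]
    pred = logits.argmax(dim=-1).tolist()
    word_ids = enc.word_ids(0)           # map each subtoken back to its word
    tags = {}
    for subtoken_idx, word_idx in enumerate(word_ids):
        if word_idx is not None and word_idx not in tags:
            tags[word_idx] = LABELS[pred[subtoken_idx]]
    return list(zip(words, [tags[i] for i in range(len(words))]))


print(tag_mwes("The committee kicked the can down the road again"))
```

In practice the classification head would first be fine-tuned on annotated MWE data (for example, labels derived from a resource such as the PARSEME corpora) before the predictions become meaningful.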
- Learning Multiscale Transformer Models for Sequence Generation [33.73729074207944]
We build a multiscale Transformer model by establishing relationships among scales based on word-boundary information and phrase-level prior knowledge.
Notably, it yielded consistent performance gains over the strong baseline on several test sets without sacrificing efficiency.
arXiv Detail & Related papers (2022-06-19T07:28:54Z)
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars -- a class of Transformer language models that combine the expressive power, scalability, and strong performance of Transformers with syntactic inductive biases.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z)
- Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performances that are comparable to those achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals from distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature; a generic voting sketch follows this entry.
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
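The feature-voting mechanism described above can be pictured with a small, generic soft-voting sketch. The two features, their calibration, and the decision threshold are illustrative assumptions, not SChME's actual configuration.

```python
# Generic soft-voting sketch in the spirit of the ensemble described above:
# each feature model maps a target word to a rough change score ("vote"),
# and the votes are averaged before thresholding.
import numpy as np


def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Distance between period-specific embeddings (assumed already aligned)."""
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def frequency_shift(count_old: int, count_new: int) -> float:
    """Absolute log-ratio of occurrence counts, squashed into [0, 1]."""
    return float(np.tanh(abs(np.log((count_new + 1) / (count_old + 1)))))


def soft_vote(votes: list[float], threshold: float = 0.5) -> tuple[float, bool]:
    """Average the per-feature votes and apply a decision threshold."""
    probability = float(np.mean(votes))
    return probability, probability >= threshold


# Hypothetical inputs for one target word: embeddings trained separately on an
# older and a newer corpus, plus raw occurrence counts in each corpus.
rng = np.random.default_rng(0)
vec_old, vec_new = rng.normal(size=300), rng.normal(size=300)
votes = [cosine_distance(vec_old, vec_new), frequency_shift(1200, 4800)]
prob, changed = soft_vote(votes)
print(f"P(semantic change) = {prob:.2f}, flagged = {changed}")
```

Real systems calibrate each signal (and may use hard voting or learned weights) rather than treating raw distances as probabilities; the point here is only the vote-then-combine structure.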
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Assessing Phrasal Representation and Composition in Transformers [13.460125148455143]
Deep transformer models have pushed performance on NLP tasks to new limits.
We present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers.
We find that phrase representation in these models relies heavily on word content, with little evidence of nuanced composition.
arXiv Detail & Related papers (2020-10-08T04:59:39Z)