Metaphors in Pre-Trained Language Models: Probing and Generalization
Across Datasets and Languages
- URL: http://arxiv.org/abs/2203.14139v1
- Date: Sat, 26 Mar 2022 19:05:24 GMT
- Title: Metaphors in Pre-Trained Language Models: Probing and Generalization
Across Datasets and Languages
- Authors: Ehsan Aghazadeh, Mohsen Fayyaz, Yadollah Yaghoobzadeh
- Abstract summary: Large pre-trained language models (PLMs) are assumed to encode metaphorical knowledge useful for NLP systems.
We present studies in multiple metaphor detection datasets and in four languages.
Our experiments suggest that contextual representations in PLMs do encode metaphorical knowledge, and mostly in their middle layers.
- Score: 6.7126373378083715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human languages are full of metaphorical expressions. Metaphors help people
understand the world by connecting new concepts and domains to more familiar
ones. Large pre-trained language models (PLMs) are therefore assumed to encode
metaphorical knowledge useful for NLP systems. In this paper, we investigate
this hypothesis for PLMs, by probing metaphoricity information in their
encodings, and by measuring the cross-lingual and cross-dataset generalization
of this information. We present studies in multiple metaphor detection datasets
and in four languages (i.e., English, Spanish, Russian, and Farsi). Our
extensive experiments suggest that contextual representations in PLMs do encode
metaphorical knowledge, and mostly in their middle layers. The knowledge is
transferable between languages and datasets, especially when the annotation is
consistent across training and testing sets. Our findings give helpful insights
for both cognitive and NLP scientists.
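To make the probing setup concrete, the sketch below shows one way layer-wise metaphoricity probing could be implemented with a multilingual PLM. It assumes the HuggingFace transformers and scikit-learn libraries, an xlm-roberta-base checkpoint, and word-level examples of the form (sentence, target word index, metaphor label); all names and details are illustrative, not the authors' released code.
```python
# Minimal layer-wise probing sketch (illustrative; not the paper's implementation).
# Each example is (sentence, index of the target word, 0/1 metaphor label).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL = "xlm-roberta-base"  # any multilingual PLM checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def layer_features(sentence, word_idx):
    """Return one vector per layer (embeddings + each transformer layer) for the target word."""
    enc = tokenizer(sentence.split(), is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states          # tuple of (1, seq_len, dim) tensors
    sub_idx = enc.word_ids(0).index(word_idx)        # first subtoken of the target word
    return [h[0, sub_idx].numpy() for h in hidden]

def probe_per_layer(train, test):
    """Fit a linear probe on each layer's representations; return per-layer test accuracy."""
    feats_tr = [layer_features(s, i) for s, i, _ in train]
    feats_te = [layer_features(s, i) for s, i, _ in test]
    y_tr = [y for _, _, y in train]
    y_te = [y for _, _, y in test]
    accs = []
    for layer in range(len(feats_tr[0])):
        clf = LogisticRegression(max_iter=1000)
        clf.fit([f[layer] for f in feats_tr], y_tr)
        accs.append(clf.score([f[layer] for f in feats_te], y_te))
    return accs
```
Plotting the returned accuracies against layer index gives the kind of layer-wise picture the abstract refers to; the paper's finding would predict a peak around the middle layers.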
Related papers
- A framework for annotating and modelling intentions behind metaphor use [12.40493670580608]
We propose a novel taxonomy of intentions commonly attributed to metaphor, which comprises 9 categories.
We also release the first dataset annotated for intentions behind metaphor use.
We use this dataset to test the capability of large language models (LLMs) in inferring the intentions behind metaphor use, in zero-shot and in-context few-shot settings.
arXiv Detail & Related papers (2024-07-04T14:13:57Z)
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation [6.0158981171030685]
We present a novel parallel dataset for the tasks of metaphor detection and interpretation that contains metaphor annotations in both Spanish and English.
We investigate language models' metaphor identification and understanding abilities through a series of monolingual and cross-lingual experiments.
arXiv Detail & Related papers (2024-04-10T14:44:48Z)
- Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP.
We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba.
Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region.
Model performance on all of these languages lags significantly behind English, with variation reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z)
- Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT).
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z)
- SocioProbe: What, When, and Where Language Models Learn about Sociodemographics [31.040600510190732]
We investigate the sociodemographic knowledge of pre-trained language models (PLMs) on multiple English data sets.
Our results show that PLMs do encode these sociodemographics, and that this knowledge is sometimes spread across the layers of some of the tested PLMs.
Our overall results indicate that sociodemographic knowledge is still a major challenge for NLP.
arXiv Detail & Related papers (2022-11-08T14:37:45Z)
- Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection [5.9647924003148365]
This work presents the first Spanish corpus annotated with naturally occurring metaphors that is large enough to develop metaphor detection systems.
The presented dataset, CoMeta, includes texts from various domains, namely, news, political discourse, Wikipedia and reviews.
arXiv Detail & Related papers (2022-10-19T07:55:36Z)
- Locating Language-Specific Information in Contextualized Embeddings [2.836066255205732]
Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages.
This raises the question of whether MPLM representations are language-agnostic or whether they simply interleave well with learned task prediction heads.
We locate language-specific information in MPLMs and identify its dimensionality and the layers where this information occurs.
arXiv Detail & Related papers (2021-09-16T15:11:55Z)
- Probing Pretrained Language Models for Lexical Semantics [76.73599166020307]
We present a systematic empirical analysis across six typologically diverse languages and five different lexical tasks.
Our results indicate patterns and best practices that hold universally, but also point to prominent variations across languages and tasks.
arXiv Detail & Related papers (2020-10-12T14:24:01Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
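For reference, the mutual-information view of probing mentioned in the last item can be summarized with one inequality; the notation below is a generic sketch (T for the linguistic property, e.g. metaphorical vs. literal; R for the contextual representation; q_theta for a trained probe), not that paper's exact formulation.
```latex
% Probing as mutual-information estimation (generic sketch).
% T: linguistic property, R: contextual representation, q_\theta: trained probe.
\[
I(T;R) \;=\; H(T) - H(T \mid R)
       \;\ge\; H(T) - \underbrace{\mathbb{E}\!\left[-\log q_\theta(t \mid r)\right]}_{\text{probe cross-entropy}}
\]
```
Because the probe's cross-entropy upper-bounds H(T | R), a better probe tightens the lower bound on the mutual information between representations and the property being probed.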