Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
- URL: http://arxiv.org/abs/2404.07053v1
- Date: Wed, 10 Apr 2024 14:44:48 GMT
- Title: Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
- Authors: Elisa Sanchez-Bayona, Rodrigo Agerri
- Abstract summary: We present a novel parallel dataset for the tasks of metaphor detection and interpretation that contains metaphor annotations in both Spanish and English.
We investigate language models' metaphor identification and understanding abilities through a series of monolingual and cross-lingual experiments.
- Score: 6.0158981171030685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Metaphors, although occasionally unperceived, are ubiquitous in our everyday language. Thus, it is crucial for Language Models to be able to grasp the underlying meaning of this kind of figurative language. In this work, we present Meta4XNLI, a novel parallel dataset for the tasks of metaphor detection and interpretation that contains metaphor annotations in both Spanish and English. We investigate language models' metaphor identification and understanding abilities through a series of monolingual and cross-lingual experiments by leveraging our proposed corpus. In order to comprehend how these non-literal expressions affect models' performance, we look over the results and perform an error analysis. Additionally, parallel data offers many potential opportunities to investigate metaphor transferability between these languages and the impact of translation on the development of multilingual annotated resources.
Related papers
- Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP.
We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba.
Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region.
All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z)
- LMs stand their Ground: Investigating the Effect of Embodiment in Figurative Language Interpretation by Language Models [0.0]
Figurative language is a challenge for language models since its interpretation relies on words used in ways that deviate from their conventional order and meaning.
Yet, humans can easily understand and interpret metaphors as they can be derived from embodied metaphors.
This study shows how larger language models perform better at interpreting metaphoric sentences when the action of the metaphorical sentence is more embodied.
arXiv Detail & Related papers (2023-05-05T11:44:12Z)
- Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection [5.9647924003148365]
This work presents the first corpus annotated with naturally occurring metaphors in Spanish large enough to develop systems to perform metaphor detection.
The presented dataset, CoMeta, includes texts from various domains, namely, news, political discourse, Wikipedia and reviews.
arXiv Detail & Related papers (2022-10-19T07:55:36Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- On the Impact of Temporal Representations on Metaphor Detection [1.6959319157216468]
State-of-the-art approaches for metaphor detection compare a word's literal (or core) meaning with its contextual meaning using sequential metaphor classifiers based on neural networks.
This study examines the metaphor detection task with a detailed exploratory analysis where different temporal and static word embeddings are used to account for different representations of literal meanings.
Results suggest that the choice of word embeddings does affect the metaphor detection task, and that some temporal word embeddings slightly outperform static methods on some performance measures.
arXiv Detail & Related papers (2021-11-05T08:43:21Z)
- It's not Rocket Science: Interpreting Figurative Language in Narratives [48.84507467131819]
We study the interpretation of two types of non-compositional figurative language (idioms and similes).
Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks.
We additionally propose knowledge-enhanced models, adopting human strategies for interpreting figurative language.
arXiv Detail & Related papers (2021-08-31T21:46:35Z)
- Interpreting Verbal Metaphors by Paraphrasing [12.750941606061877]
We show that our paraphrasing method significantly outperforms the state-of-the-art baseline.
We also demonstrate that our method can help a machine translation system improve its accuracy in translating English metaphors to 8 target languages.
arXiv Detail & Related papers (2021-04-07T21:00:23Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.