CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for
Boosting Metaphor Generation
- URL: http://arxiv.org/abs/2402.13145v2
- Date: Wed, 21 Feb 2024 03:18:04 GMT
- Title: CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for
Boosting Metaphor Generation
- Authors: Yujie Shao, Xinrong Yao, Xingwei Qu, Chenghua Lin, Shi Wang, Stephen
W. Huang, Ge Zhang, Jie Fu
- Abstract summary: This paper introduces a large-scale, high-quality annotated Chinese Metaphor Corpus, which comprises around 28K sentences.
To ensure the accuracy and consistency of our annotations, we introduce a comprehensive set of guidelines.
Breaking tradition, our approach to metaphor generation emphasizes grounds and their distinct features rather than the conventional combination of tenors and vehicles.
- Score: 35.14142183519002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Metaphors are a prominent linguistic device in human language and literature,
as they add color, imagery, and emphasis to enhance effective communication.
This paper introduces a large-scale, high-quality annotated Chinese Metaphor
Corpus, which comprises around 28K sentences drawn from a diverse range of
Chinese literary sources, such as poems, prose, song lyrics, etc. To ensure the
accuracy and consistency of our annotations, we introduce a comprehensive set
of guidelines. These guidelines address the facets of metaphor annotation,
from identifying tenors, vehicles, and grounds to handling the
complexities of similes, personifications, juxtapositions, and hyperboles.
Breaking tradition, our approach to metaphor generation emphasizes grounds and
their distinct features rather than the conventional combination of tenors and
vehicles. By integrating "ground" as a CoT (Chain of Thoughts) input, we are
able to generate metaphors that resonate more with real-world intuition. We
test generative models such as Belle, Baichuan, and Chinese-alpaca-33B using
our annotated corpus. When given selected samples from our dataset, these
models generate creative and fluent metaphor sentences more frequently,
demonstrating the value of our corpus for Chinese metaphor research.
The code is available at
https://github.com/JasonShao55/Chinese_Metaphor_Explanation.
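
The idea of supplying the ground as an intermediate reasoning step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `MetaphorAnnotation` field names, the `build_ground_prompt` helper, the prompt layout, and the sample sentence are hypothetical and are not taken from the corpus or the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaphorAnnotation:
    """One annotated sentence. Field names are hypothetical, not the corpus schema."""
    sentence: str
    tenor: str    # the thing being described
    vehicle: str  # the thing it is compared to
    ground: str   # the shared property that justifies the comparison


def build_ground_prompt(tenor: str, ground: str,
                        examples: list[MetaphorAnnotation]) -> str:
    """Build a few-shot prompt that states the ground before the vehicle.

    Each example lists tenor, ground, vehicle, and sentence, so the model
    sees the ground as an explicit step between tenor and vehicle.
    """
    lines: list[str] = []
    for ex in examples:
        lines.append(f"Tenor: {ex.tenor}")
        lines.append(f"Ground: {ex.ground}")
        lines.append(f"Vehicle: {ex.vehicle}")
        lines.append(f"Metaphor: {ex.sentence}")
        lines.append("")  # blank line between examples
    # The query repeats the same layout and stops where the model should continue.
    lines.append(f"Tenor: {tenor}")
    lines.append(f"Ground: {ground}")
    lines.append("Vehicle:")
    return "\n".join(lines)


# Made-up sample for illustration only; it is not a sentence from the dataset.
example = MetaphorAnnotation(
    sentence="Time is like flowing water.",
    tenor="time",
    vehicle="flowing water",
    ground="passes and never returns",
)
prompt = build_ground_prompt("love", "warms those it reaches", [example])
print(prompt)
```

The resulting string could be passed to any text-generation model; how the authors actually format their inputs is documented in their repository, not here.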
Related papers
- Compositional Entailment Learning for Hyperbolic Vision-Language Models [54.41927525264365]
We show how to fully leverage the innate hierarchical nature of hyperbolic embeddings by looking beyond individual image-text pairs.
We propose Compositional Entailment Learning for hyperbolic vision-language models.
Empirical evaluation on a hyperbolic vision-language model trained with millions of image-text pairs shows that the proposed compositional learning approach outperforms conventional Euclidean CLIP learning.
arXiv Detail & Related papers (2024-10-09T14:12:50Z)
- A Perspective on Literary Metaphor in the Context of Generative AI [0.6445605125467572]
This study explores the role of literary metaphor and its capacity to generate a range of meanings.
To investigate whether the inclusion of original figurative language improves textual quality, we trained an LSTM-based language model in Afrikaans.
The paper raises thought-provoking questions on aesthetic value, interpretation and evaluation.
arXiv Detail & Related papers (2024-09-02T08:27:29Z)
- A framework for annotating and modelling intentions behind metaphor use [12.40493670580608]
We propose a novel taxonomy of intentions commonly attributed to metaphor, which comprises 9 categories.
We also release the first dataset annotated for intentions behind metaphor use.
We use this dataset to test the capability of large language models (LLMs) in inferring the intentions behind metaphor use, in zero- and in-context few-shot settings.
arXiv Detail & Related papers (2024-07-04T14:13:57Z)
- Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation [6.0158981171030685]
We present a novel parallel dataset for the tasks of metaphor detection and interpretation that contains metaphor annotations in both Spanish and English.
We investigate language models' metaphor identification and understanding abilities through a series of monolingual and cross-lingual experiments.
arXiv Detail & Related papers (2024-04-10T14:44:48Z)
- Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection [5.9647924003148365]
This work presents the first corpus annotated with naturally occurring metaphors in Spanish large enough to develop systems to perform metaphor detection.
The presented dataset, CoMeta, includes texts from various domains, namely, news, political discourse, Wikipedia and reviews.
arXiv Detail & Related papers (2022-10-19T07:55:36Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Metaphor Generation with Conceptual Mappings [58.61307123799594]
We aim to generate a metaphoric sentence given a literal expression by replacing relevant verbs.
We propose to control the generation process by encoding conceptual mappings between cognitive domains.
We show that the unsupervised CM-Lex model is competitive with recent deep learning metaphor generation systems.
arXiv Detail & Related papers (2021-06-02T15:27:05Z)
- MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding [22.756157298168127]
Based on a theoretically-grounded connection between metaphors and symbols, we propose a method to automatically construct a parallel corpus.
For the generation task, we incorporate a metaphor discriminator to guide the decoding of a sequence to sequence model fine-tuned on our parallel data.
A task-based evaluation shows that human-written poems enhanced with metaphors are preferred 68% of the time compared to poems without metaphors.
arXiv Detail & Related papers (2021-03-11T16:39:19Z)
- Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation [65.22565071742528]
Figurative language such as a simile goes beyond plain expressions to give readers new insights and inspirations.
Generating a simile requires proper understanding for effective mapping of properties between two concepts.
We show how replacing literal sentences with similes from our best model in machine generated stories improves evocativeness and leads to better acceptance by human judges.
arXiv Detail & Related papers (2020-09-18T17:37:13Z)
- Metaphoric Paraphrase Generation [58.592750281138265]
We use crowdsourcing to evaluate our results and develop an automatic metric for evaluating metaphoric paraphrases.
We show that while the lexical replacement baseline is capable of producing accurate paraphrases, they often lack metaphoricity.
Our metaphor masking model excels in generating metaphoric sentences while performing nearly as well with regard to fluency and paraphrase quality.
arXiv Detail & Related papers (2020-02-28T16:30:33Z)