Exploring Chinese Humor Generation: A Study on Two-Part Allegorical Sayings
- URL: http://arxiv.org/abs/2403.10781v1
- Date: Sat, 16 Mar 2024 02:58:57 GMT
- Title: Exploring Chinese Humor Generation: A Study on Two-Part Allegorical Sayings
- Authors: Rongwu Xu,
- Abstract summary: This paper investigates the capability of state-of-the-art language models to comprehend and generate Chinese humor.
We employ two prominent training methods: fine-tuning a medium-sized language model and prompting a large one.
Human-annotated results show that these models can generate humorous allegorical sayings, with prompting proving to be a practical and effective method.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Humor, a culturally nuanced aspect of human language, poses challenges for computational understanding and generation, especially in Chinese humor, which remains relatively unexplored in the NLP community. This paper investigates the capability of state-of-the-art language models to comprehend and generate Chinese humor, specifically focusing on training them to create allegorical sayings. We employ two prominent training methods: fine-tuning a medium-sized language model and prompting a large one. Our novel fine-tuning approach incorporates fused Pinyin embeddings to consider homophones and employs contrastive learning with synthetic hard negatives to distinguish humor elements. Human-annotated results show that these models can generate humorous allegorical sayings, with prompting proving to be a practical and effective method. However, there is still room for improvement in generating allegorical sayings that match human creativity.
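The abstract gives only a high-level picture of the fine-tuning recipe. Below is a minimal sketch of how fused Pinyin embeddings and a contrastive objective with synthetic hard negatives could fit together; every name, vocabulary size, and the mean-pooled toy encoder are illustrative assumptions, not the authors' released code (the paper fine-tunes a medium-sized pretrained language model instead).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PinyinFusedEncoder(nn.Module):
    """Toy encoder that fuses character and Pinyin embeddings.

    A plain embedding layer with mean pooling stands in for the pretrained
    LM so that the sketch stays self-contained.
    """

    def __init__(self, char_vocab=6000, pinyin_vocab=1500, dim=256):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, dim)
        self.pinyin_emb = nn.Embedding(pinyin_vocab, dim)  # carries homophone information
        self.fuse = nn.Linear(2 * dim, dim)                # fuse the two views per token

    def forward(self, char_ids, pinyin_ids):
        x = torch.cat([self.char_emb(char_ids), self.pinyin_emb(pinyin_ids)], dim=-1)
        h = torch.tanh(self.fuse(x)).mean(dim=1)           # [batch, dim] sentence vector
        return F.normalize(h, dim=-1)


def contrastive_loss(anchor, positive, hard_negative, temperature=0.1):
    """InfoNCE-style loss with one synthetic hard negative per example:
    pull the humorous second part toward its lead-in, push a literal,
    non-humorous rewrite away."""
    pos = (anchor * positive).sum(-1) / temperature
    neg = (anchor * hard_negative).sum(-1) / temperature
    logits = torch.stack([pos, neg], dim=-1)                 # [batch, 2]
    labels = torch.zeros(anchor.size(0), dtype=torch.long)   # index 0 is the positive
    return F.cross_entropy(logits, labels)


# Toy batch: in practice the ids would come from a tokenizer plus a Pinyin lookup.
encoder = PinyinFusedEncoder()
first_part = encoder(torch.randint(0, 6000, (4, 12)), torch.randint(0, 1500, (4, 12)))
gold_second = encoder(torch.randint(0, 6000, (4, 12)), torch.randint(0, 1500, (4, 12)))
hard_negative = encoder(torch.randint(0, 6000, (4, 12)), torch.randint(0, 1500, (4, 12)))
contrastive_loss(first_part, gold_second, hard_negative).backward()
```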
Related papers
- Exploring Automated Keyword Mnemonics Generation with Large Language Models via Overgenerate-and-Rank [4.383205675898942]
Keyword mnemonics are a technique for memorizing vocabulary through memorable associations with a target word via a verbal cue.
We propose a novel overgenerate-and-rank method via prompting large language models to generate verbal cues (a minimal sketch of this pattern appears after this list).
Results show that LLM-generated mnemonics are comparable to human-generated ones in terms of imageability, coherence, and perceived usefulness.
arXiv Detail & Related papers (2024-09-21T00:00:18Z)
- Can Pre-trained Language Models Understand Chinese Humor? [74.96509580592004]
This paper is the first work that systematically investigates the humor understanding ability of pre-trained language models (PLMs).
We construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework.
Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.
arXiv Detail & Related papers (2024-07-04T18:13:38Z)
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions [16.23585043442914]
This paper focuses on comics with contradictory narratives, where each comic consists of two panels that create a humorous contradiction.
We introduce the YesBut benchmark, which comprises tasks of varying difficulty aimed at assessing AI's capabilities in recognizing and interpreting these comics.
Our results show that even state-of-the-art models still lag behind human performance on this task.
arXiv Detail & Related papers (2024-05-29T13:51:43Z)
- BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- A fine-grained comparison of pragmatic language understanding in humans and language models [2.231167375820083]
We compare language models and humans on seven pragmatic phenomena.
We find that the largest models achieve high accuracy and match human error patterns.
We also find preliminary evidence that models and humans are sensitive to similar linguistic cues.
arXiv Detail & Related papers (2022-12-13T18:34:59Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- Linking Emergent and Natural Languages via Corpus Transfer [98.98724497178247]
We propose a novel way to link emergent languages and natural languages via corpus transfer.
Our approach showcases non-trivial transfer benefits for two different tasks -- language modeling and image captioning.
We also introduce a novel metric to predict the transferability of an emergent language by translating emergent messages to natural language captions grounded on the same images.
arXiv Detail & Related papers (2022-03-24T21:24:54Z)
- It's not Rocket Science: Interpreting Figurative Language in Narratives [48.84507467131819]
We study the interpretation of two non-compositional figurative languages (idioms and similes).
Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks.
We additionally propose knowledge-enhanced models, adopting human strategies for interpreting figurative language.
arXiv Detail & Related papers (2021-08-31T21:46:35Z)
- Estimating Subjective Crowd-Evaluations as an Additional Objective to Improve Natural Language Generation [0.0]
We use a crowd-authored dialogue corpus to fine-tune six different language generation models.
Two of these models incorporate multi-task learning and use subjective ratings of lines as part of an explicit learning goal.
A human evaluation of the generated dialogue lines reveals that utterances generated by the multi-tasking models were subjectively rated as the most typical, most moving the conversation forward, and least offensive.
arXiv Detail & Related papers (2021-04-12T06:33:16Z)
- Advancing Humor-Focused Sentiment Analysis through Improved Contextualized Embeddings and Model Architecture [0.0]
Humor allows us to express thoughts and feelings conveniently and effectively.
As language models become ubiquitous through virtual assistants and IoT devices, the need for humor-aware models grows rapidly.
arXiv Detail & Related papers (2020-11-23T22:30:32Z)
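Several of the prompting-based entries above, including the keyword-mnemonics paper referenced earlier, follow the same overgenerate-and-rank pattern: sample many candidates from a large model, then keep the best-scoring one. The sketch below illustrates only that control flow; `call_llm`, the canned outputs, and the scoring heuristic are stand-ins, not any paper's actual API or ranking model.

```python
import random


def call_llm(prompt: str, n: int) -> list[str]:
    """Stand-in for a chat-completion call that samples n candidate outputs."""
    canned = [
        "Picture a stone gargoyle gargling rainwater off the roof.",
        "A gargoyle is a goalie made of granite who gargles.",
        "Gar-goyle: a 'gar' fish carved into the gutter spout.",
    ]
    random.seed(0)  # deterministic toy output
    return random.sample(canned, k=min(n, len(canned)))


def score(candidate: str, target: str) -> float:
    """Illustrative ranking heuristic: prefer short cues that echo the target word."""
    echoes_target = 1.0 if target[:3].lower() in candidate.lower() else 0.0
    return echoes_target - 0.01 * len(candidate)


def overgenerate_and_rank(target: str, n: int = 3) -> str:
    prompt = f"Write a short, memorable verbal cue for the word '{target}'."
    candidates = call_llm(prompt, n)
    return max(candidates, key=lambda c: score(c, target))


print(overgenerate_and_rank("gargoyle"))
```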
This list is automatically generated from the titles and abstracts of the papers on this site.