Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models
- URL: http://arxiv.org/abs/2403.00794v2
- Date: Fri, 21 Jun 2024 17:12:35 GMT
- Title: Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models
- Authors: Zachary Horvitz, Jingru Chen, Rahul Aditya, Harshvardhan Srivastava, Robert West, Zhou Yu, Kathleen McKeown
- Abstract summary: Large language models (LLMs) can generate synthetic data for humor detection via editing texts.
We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes.
We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators.
- Score: 27.936545041302377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humor is a fundamental facet of human cognition and interaction. Yet, despite recent advances in natural language processing, humor detection remains a challenging task that is complicated by the scarcity of datasets that pair humorous texts with similar non-humorous counterparts. In our work, we investigate whether large language models (LLMs) can generate synthetic data for humor detection via editing texts. We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes, as judged by humans and as measured on the downstream task of humor detection. We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators and provides challenging adversarial examples for humor classifiers.
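The pairing idea in the abstract (each humorous text matched with an edited, non-humorous counterpart) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `build_paired_dataset` and `placeholder_unfun` are hypothetical helpers, and the placeholder editor is a hard-coded string swap standing in for an LLM call, not the authors' prompts or pipeline.

```python
from typing import Callable, List, Tuple


def build_paired_dataset(
    jokes: List[str], unfun: Callable[[str], str]
) -> List[Tuple[str, int]]:
    """Pair each joke (label 1) with its edited counterpart (label 0)."""
    examples: List[Tuple[str, int]] = []
    for joke in jokes:
        edited = unfun(joke)
        # Skip edits that left the text unchanged: the same text would
        # otherwise appear with both labels.
        if edited.strip() == joke.strip():
            continue
        examples.append((joke, 1))
        examples.append((edited, 0))
    return examples


def placeholder_unfun(joke: str) -> str:
    # Hypothetical stand-in for an LLM edit; it only swaps one fixed phrase.
    return joke.replace("to get to the other side", "because it was walking home")


if __name__ == "__main__":
    jokes = ["Why did the chicken cross the road? to get to the other side"]
    for text, label in build_paired_dataset(jokes, placeholder_unfun):
        print(label, text)
```

Passing the editor in as a callable keeps the pairing logic independent of whichever model produces the edits, so the placeholder could be swapped for a real model call without touching the dataset code.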
Related papers
- Can Pre-trained Language Models Understand Chinese Humor? [74.96509580592004]
This paper is the first work that systematically investigates the humor understanding ability of pre-trained language models (PLMs).
We construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework.
Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.
arXiv Detail & Related papers (2024-07-04T18:13:38Z)
- Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor [8.75275650545552]
HumorDB is an image-only dataset specifically designed to advance visual humor understanding.
The dataset enables evaluation through binary classification, range regression, and pairwise comparison tasks.
HumorDB shows potential as a valuable benchmark for powerful large multimodal models.
arXiv Detail & Related papers (2024-06-19T13:51:40Z)
- Chumor 1.0: A Truly Funny and Challenging Chinese Humor Understanding Dataset from Ruo Zhi Ba [7.878358092927338]
We construct Chumor, a dataset sourced from Ruo Zhi Ba (RZB), a Chinese Reddit-like platform dedicated to sharing intellectually challenging and culturally specific jokes.
We annotate explanations for each joke and evaluate human explanations against two state-of-the-art LLMs, GPT-4o and ERNIE Bot.
Our evaluation shows that Chumor is challenging even for SOTA LLMs, and the human explanations for Chumor jokes are significantly better than explanations generated by the LLMs.
arXiv Detail & Related papers (2024-06-18T16:22:05Z)
- Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like [49.2096391012794]
ELaTE is a zero-shot TTS that can generate natural laughing speech of any speaker based on a short audio prompt.
We develop our model based on the foundation of conditional flow-matching-based zero-shot TTS.
We show that ELaTE can generate laughing speech with significantly higher quality and controllability compared to conventional models.
arXiv Detail & Related papers (2024-02-12T02:58:10Z)
- OxfordTVG-HIC: Can Machine Make Humorous Captions from Images? [27.899718595182172]
We present OxfordTVG-HIC (Humorous Image Captions), a large-scale dataset for humour generation and understanding.
OxfordTVG-HIC features a wide range of emotional and semantic diversity resulting in out-of-context examples.
We show how OxfordTVG-HIC can be leveraged for evaluating the humour of a generated text.
arXiv Detail & Related papers (2023-07-21T14:58:44Z)
- The Naughtyformer: A Transformer Understands Offensive Humor [63.05016513788047]
We introduce a novel jokes dataset filtered from Reddit and solve the subtype classification task using a finetuned Transformer dubbed the Naughtyformer.
We show that our model is significantly better at detecting offensiveness in jokes compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-25T20:37:58Z)
- ExPUNations: Augmenting Puns with Keywords and Explanations [88.58174386894913]
We augment an existing dataset of puns with detailed crowdsourced annotations of keywords.
This is the first humor dataset with such extensive and fine-grained annotations specifically for puns.
We propose two tasks: explanation generation to aid with pun classification and keyword-conditioned pun generation.
arXiv Detail & Related papers (2022-10-24T18:12:02Z)
- Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results [84.37263300062597]
Humor is a substantial element of human social behavior, affect, and cognition.
Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications.
We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor dataset, comprising about 11 hours of recordings.
arXiv Detail & Related papers (2022-09-28T17:36:47Z)
- M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations [72.81164101048181]
We propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations containing 6,191 utterances from 13 episodes of a very popular TV series "Shrimaan Shrimati Phir Se".
Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities.
The empirical results on the M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition.
arXiv Detail & Related papers (2021-08-03T02:54:09Z)
- Dutch Humor Detection by Generating Negative Examples [5.888646114353371]
Humor detection is usually modeled as a binary classification task, trained to predict if the given text is a joke or another type of text.
We propose using text generation algorithms for imitating the original joke dataset to increase the difficulty for the learning algorithm.
We compare the humor detection capabilities of classic neural network approaches with the state-of-the-art Dutch language model RobBERT.
arXiv Detail & Related papers (2020-10-26T15:15:10Z)