OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
- URL: http://arxiv.org/abs/2307.11636v1
- Date: Fri, 21 Jul 2023 14:58:44 GMT
- Title: OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
- Authors: Runjia Li, Shuyang Sun, Mohamed Elhoseiny, Philip Torr
- Abstract summary: We present OxfordTVG-HIC (Humorous Image Captions), a large-scale dataset for humour generation and understanding.
OxfordTVG-HIC features wide emotional and semantic diversity, resulting in out-of-context examples.
We show how OxfordTVG-HIC can be leveraged for evaluating the humour of a generated text.
- Score: 27.899718595182172
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents OxfordTVG-HIC (Humorous Image Captions), a large-scale
dataset for humour generation and understanding. Humour is an abstract,
subjective, and context-dependent cognitive construct involving several
cognitive factors, which makes it challenging to generate and interpret.
Humour generation and understanding can therefore serve as a new task for
evaluating the ability of deep-learning methods to process abstract and
subjective information. Because of the scarcity of data, humour-related
generation tasks such as captioning remain under-explored. To address this gap,
OxfordTVG-HIC offers approximately 2.9M image-text pairs with humour scores for
training a generalizable humour captioning model. In contrast to existing
captioning datasets, OxfordTVG-HIC features wide emotional and semantic
diversity, resulting in out-of-context examples that are particularly conducive
to generating humour. Moreover, OxfordTVG-HIC is curated to be devoid of offensive
content. We also show how OxfordTVG-HIC can be leveraged to evaluate the
humour of generated text. Through explainability analysis of the trained
models, we identify the visual and linguistic cues that are influential for
humour prediction (and generation), and we observe qualitatively that these
cues align with the benign violation theory of humour from cognitive psychology.
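For intuition, the sketch below shows in broad strokes how caption-score pairs of this kind could train a text-only humour scorer that then rates generated captions. It is not the authors' released code: the file name hic_train.tsv, its caption/score layout, and the BERT backbone are all illustrative assumptions, and the paper's actual model also conditions on the image.

```python
# Hypothetical sketch: fine-tune a text encoder to regress humour scores,
# then reuse it to rate generated captions. Not the authors' released code;
# "hic_train.tsv" and its caption<TAB>score layout are assumed for illustration.
import csv

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=1 with float labels turns the classification head into a
# regression head trained with MSE loss.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

with open("hic_train.tsv") as f:  # assumed: caption<TAB>humour_score per row
    pairs = [(caption, float(score)) for caption, score in csv.reader(f, delimiter="\t")]

model.train()
for caption, score in pairs:  # batch size 1, for brevity only
    enc = tokenizer(caption, return_tensors="pt", truncation=True)
    loss = model(**enc, labels=torch.tensor([score])).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Once trained, the same model scores the humour of any generated caption.
model.eval()
with torch.no_grad():
    enc = tokenizer("A cat auditioning for a shampoo commercial", return_tensors="pt")
    print(model(**enc).logits.item())
```

A pairwise ranking loss over the humour scores would be a natural alternative to plain regression.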
Related papers
- Can Pre-trained Language Models Understand Chinese Humor? [74.96509580592004]
This paper is the first work that systematically investigates the humor understanding ability of pre-trained language models (PLMs).
We construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework.
Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.
arXiv Detail & Related papers (2024-07-04T18:13:38Z)
- Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor [8.75275650545552]
HumorDB is an image-only dataset specifically designed to advance visual humor understanding.
The dataset enables evaluation through binary classification, range regression, and pairwise comparison tasks.
HumorDB shows potential as a valuable benchmark for powerful large multimodal models.
arXiv Detail & Related papers (2024-06-19T13:51:40Z)
- Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models [27.936545041302377]
Large language models (LLMs) can generate synthetic data for humor detection by editing existing texts (a minimal sketch follows this entry).
We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes.
We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators.
arXiv Detail & Related papers (2024-02-23T02:58:12Z)
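To make the editing idea concrete, here is a hedged sketch of "unfunning" with an instruction-tuned LLM: minimally rewrite a joke so it is no longer funny, yielding matched funny/unfunny pairs as synthetic humor-detection data. The OpenAI client usage is real, but the model name and prompt are illustrative assumptions rather than the paper's exact setup.

```python
# Illustrative sketch of LLM-based "unfunning" (assumed model name and prompt;
# the paper benchmarks several LLMs and uses its own instructions).
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def unfun(joke: str) -> str:
    """Ask the model for a minimal edit that removes the humor."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed for illustration
        messages=[
            {"role": "system",
             "content": "Edit the given joke as little as possible so that "
                        "it is no longer funny. Return only the edited text."},
            {"role": "user", "content": joke},
        ],
    )
    return resp.choices[0].message.content.strip()

joke = "I told my wife she was drawing her eyebrows too high. She looked surprised."
pair = {"funny": joke, "unfunny": unfun(joke)}  # one synthetic training pair
print(pair)
```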
- Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [67.09698638709065]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE.
In particular, we first propose a lexicon-guided utterance sentiment inference module, in which an utterance sentiment refinement strategy is devised.
We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip (a generic cross-attention sketch follows this entry).
arXiv Detail & Related papers (2024-02-06T03:14:46Z)
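The JCA-SI module itself is specific to the EDGE framework, but the underlying mechanism, cross attention between two modalities followed by a sentiment classifier, is easy to illustrate. In the generic PyTorch sketch below, video tokens attend to audio tokens; every dimension, the pooling step, and the three-way label space are assumptions, not the paper's architecture.

```python
# Generic cross-modal attention sketch (illustration only, not JCA-SI itself).
import torch
import torch.nn as nn

class CrossModalSentiment(nn.Module):
    def __init__(self, dim=256, heads=4, n_labels=3):
        super().__init__()
        # Video tokens act as queries; audio tokens supply keys and values.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, n_labels)  # e.g. negative/neutral/positive

    def forward(self, video_feats, audio_feats):
        fused, _ = self.cross_attn(video_feats, audio_feats, audio_feats)
        return self.classifier(fused.mean(dim=1))  # pool over time, then classify

model = CrossModalSentiment()
video = torch.randn(2, 10, 256)   # batch of 2 clips, 10 video tokens each
audio = torch.randn(2, 20, 256)   # 20 audio tokens per clip
print(model(video, audio).shape)  # -> torch.Size([2, 3])
```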
- This joke is [MASK]: Recognizing Humor and Offense with Prompting [9.745213455946324]
Humor is a magnetic component in everyday human interactions and communications.
We investigate the effectiveness of prompting, a new transfer learning paradigm for NLP, for humor recognition (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-10-25T13:02:45Z)
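The title describes the mechanism: a masked language model fills the [MASK] slot of a template such as "This joke is [MASK].", and the probabilities it assigns to a few label words act as the classifier. The sketch below follows that recipe; the template and verbalizer words are illustrative assumptions, not the paper's exact prompt set.

```python
# Prompt-based humor recognition with a masked LM (illustrative template and
# verbalizers; the paper's exact prompts and label words may differ).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Assumed verbalizers mapping [MASK] fillers to labels.
VERBALIZERS = {"funny": "humorous", "boring": "not humorous"}

def classify(joke: str) -> str:
    prompt = f"{joke} This joke is [MASK]."
    # Restrict predictions to the verbalizer words and compare their scores.
    preds = fill(prompt, targets=list(VERBALIZERS))
    best = max(preds, key=lambda p: p["score"])
    return VERBALIZERS[best["token_str"]]

print(classify("Why did the scarecrow win an award? He was outstanding in his field."))
```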
- ExPUNations: Augmenting Puns with Keywords and Explanations [88.58174386894913]
We augment an existing dataset of puns with detailed crowdsourced annotations of keywords.
This is the first humor dataset with such extensive and fine-grained annotations specifically for puns.
We propose two tasks: explanation generation to aid with pun classification and keyword-conditioned pun generation.
arXiv Detail & Related papers (2022-10-24T18:12:02Z)
- Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results [84.37263300062597]
Humor is a substantial element of human social behavior, affect, and cognition.
Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications.
We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor dataset, comprising about 11 hours of recordings.
arXiv Detail & Related papers (2022-09-28T17:36:47Z)
- Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest [70.40189243067857]
Large neural networks can now generate jokes, but do they really "understand" humor?
We challenge AI models with three tasks derived from the New Yorker Cartoon Caption Contest.
We find that both multimodal and language-only models struggle at all three tasks.
arXiv Detail & Related papers (2022-09-13T20:54:00Z)
- M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations [72.81164101048181]
We propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations, containing 6,191 utterances from 13 episodes of the popular TV series "Shrimaan Shrimati Phir Se".
Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities.
The empirical results on M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition.
arXiv Detail & Related papers (2021-08-03T02:54:09Z)
- DeHumor: Visual Analytics for Decomposing Humor [36.300283476950796]
We develop DeHumor, a visual system for analyzing humorous behaviors in public speaking.
To intuitively reveal the building blocks of each concrete example, DeHumor decomposes each humorous video into multimodal features.
We show that DeHumor is able to highlight various building blocks of humor examples.
arXiv Detail & Related papers (2021-07-18T04:01:07Z)
- Federated Learning with Diversified Preference for Humor Recognition [40.89453484353102]
We propose the FedHumor approach to recognize humorous text content in a personalized manner through federated learning (FL).
Experiments demonstrate significant advantages of FedHumor over 9 state-of-the-art humor recognition approaches in accurately recognizing humorous content for people with diverse humor preferences.
arXiv Detail & Related papers (2020-12-03T03:24:24Z)