Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results
- URL: http://arxiv.org/abs/2209.14272v3
- Date: Mon, 8 Jul 2024 10:50:56 GMT
- Title: Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results
- Authors: Lukas Christ, Shahin Amiriparian, Alexander Kathan, Niklas Müller, Andreas König, Björn W. Schuller
- Abstract summary: Humor is a substantial element of human social behavior, affect, and cognition.
Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications.
We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor dataset, comprising about 11 hours of recordings.
- Score: 84.37263300062597
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Humor is a substantial element of human social behavior, affect, and cognition. Its automatic understanding can facilitate a more naturalistic human-AI interaction. Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications. We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor (Passau-SFCH) dataset, comprising about 11 hours of recordings. The Passau-SFCH dataset is annotated for the presence of humor and its dimensions (sentiment and direction) as proposed in Martin's Humor Style Questionnaire. We conduct a series of experiments employing pretrained Transformers, convolutional neural networks, and expert-designed features. The performance of each modality (text, audio, video) for spontaneous humor recognition is analyzed and their complementarity is investigated. Our findings suggest that for the automatic analysis of humor and its sentiment, facial expressions are most promising, while humor direction can be best modeled via text-based features. Further, we experiment with different multimodal approaches to humor recognition, including decision-level fusion and MulT, a multimodal Transformer approach. In this context, we propose a novel multimodal architecture that yields the best overall results. Finally, we make our code publicly available at https://www.github.com/lc0197/passau-sfch. The Passau-SFCH dataset is available upon request.
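The abstract mentions decision-level fusion of the text, audio, and video modalities. As a minimal illustrative sketch of that idea (not the paper's actual architecture; the modality names, weights, and threshold below are placeholders), late fusion can be as simple as a weighted average of per-modality humor probabilities:

```python
import numpy as np

def decision_level_fusion(modality_probs, weights=None):
    """Combine per-modality humor probabilities by weighted averaging.

    modality_probs: dict mapping modality name -> sequence of length n_samples
                    with each unimodal model's predicted probability of humor.
    weights: optional dict of per-modality weights (defaults to uniform).
    Returns an array of fused probabilities, one per sample.
    """
    names = sorted(modality_probs)
    if weights is None:
        weights = {m: 1.0 for m in names}
    w = np.array([weights[m] for m in names], dtype=float)
    w /= w.sum()  # normalize so the fused value stays a probability
    stacked = np.stack([np.asarray(modality_probs[m], dtype=float) for m in names])
    return w @ stacked

# Toy predictions from three unimodal classifiers for two clips.
fused = decision_level_fusion({
    "text":  [0.2, 0.9],
    "audio": [0.4, 0.7],
    "video": [0.9, 0.8],
})
labels = (fused >= 0.5).astype(int)  # binary humor decision per clip
```

In practice the weights would be tuned on a validation split, e.g. proportional to each unimodal model's AUC; more elaborate fusion (such as the MulT cross-modal Transformer the abstract refers to) operates on features rather than decisions.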
Related papers
- Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a multimodal transcript.
arXiv Detail & Related papers (2024-09-13T18:28:12Z) - Can Pre-trained Language Models Understand Chinese Humor? [74.96509580592004]
This paper is the first work that systematically investigates the humor understanding ability of pre-trained language models (PLMs).
We construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework.
Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.
arXiv Detail & Related papers (2024-07-04T18:13:38Z) - The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition [64.5207572897806]
The Multimodal Sentiment Analysis Challenge (MuSe) 2024 addresses two contemporary multimodal affect and sentiment analysis problems.
In the Social Perception Sub-Challenge (MuSe-Perception), participants will predict 16 different social attributes of individuals.
The Cross-Cultural Humor Detection Sub-Challenge (MuSe-Humor) dataset expands upon the Passau Spontaneous Football Coach Humor dataset.
arXiv Detail & Related papers (2024-06-11T22:26:20Z) - Humor Mechanics: Advancing Humor Generation with Multistep Reasoning [11.525355831490828]
We develop a working prototype for humor generation using multi-step reasoning.
We compare our approach with human-created jokes, zero-shot GPT-4 generated humor, and other baselines.
Our findings demonstrate that the multi-step reasoning approach consistently improves the quality of generated humor.
arXiv Detail & Related papers (2024-05-12T13:00:14Z) - Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models [27.936545041302377]
Large language models (LLMs) can generate synthetic data for humor detection via editing texts.
We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes.
We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators.
arXiv Detail & Related papers (2024-02-23T02:58:12Z) - ChatAnything: Facetime Chat with LLM-Enhanced Personas [87.76804680223003]
We propose the mixture of voices (MoV) and the mixture of diffusers (MoD) for diverse voice and appearance generation.
For MoV, we utilize the text-to-speech (TTS) algorithms with a variety of pre-defined tones.
For MoD, we combine recent popular text-to-image generation techniques and talking head algorithms to streamline the process of generating talking objects.
arXiv Detail & Related papers (2023-11-12T08:29:41Z) - MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings [9.185580170954802]
This paper proposes a multiview attention fusion method named MAGIC-TBR that combines features extracted from videos and their corresponding Discrete Cosine Transform coefficients via a transformer-based approach.
Experiments are conducted on the BBSI dataset and the results demonstrate the effectiveness of the proposed feature fusion with multiview attention.
arXiv Detail & Related papers (2023-09-19T17:04:36Z) - Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
arXiv Detail & Related papers (2022-04-18T17:58:04Z) - Laughing Heads: Can Transformers Detect What Makes a Sentence Funny? [18.67834526946997]
We train and analyze transformer-based humor recognition models on a dataset consisting of minimal pairs of aligned sentences.
We find that, although our aligned dataset is much harder than previous datasets, transformer-based models recognize the humorous sentence in an aligned pair with high accuracy (78%).
Most remarkably, we find clear evidence that one single attention head learns to recognize the words that make a test sentence humorous, even without access to this information at training time.
arXiv Detail & Related papers (2021-05-19T14:02:25Z) - ColBERT: Using BERT Sentence Embedding in Parallel Neural Networks for Computational Humor [0.0]
We propose a novel approach for detecting and rating humor in short texts based on a popular linguistic theory of humor.
The proposed method first separates the sentences of the given text and uses the BERT model to generate an embedding for each one.
We accompany the paper with a novel dataset for humor detection consisting of 200,000 formal short texts.
The proposed model obtained F1 scores of 0.982 and 0.869 in the humor detection experiments, outperforming both general and state-of-the-art models.
arXiv Detail & Related papers (2020-04-27T13:10:11Z)
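The ColBERT entry above describes embedding each sentence of a text separately and feeding the embeddings through parallel network branches before a final humor score. The following is a toy sketch of that parallel-branch structure, not the authors' implementation: `toy_embed` is a deterministic stand-in for a real BERT sentence embedding, and the branch weights are random placeholders rather than trained parameters.

```python
import zlib
import numpy as np

DIM = 8  # toy embedding dimension (real BERT embeddings are 768-d)

def toy_embed(sentence):
    """Stand-in for a BERT sentence embedding: a deterministic vector
    seeded from a stable hash of the sentence text."""
    seed = zlib.crc32(sentence.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(DIM)

def parallel_branch_score(text, n_sentences=3):
    """Split text into sentences, embed each, pass each embedding through
    its own (untrained, random) linear branch, and combine the branch
    outputs into a single humor probability."""
    sentences = [s.strip() for s in text.split(".") if s.strip()][:n_sentences]
    # Pad with empty slots so the model always sees a fixed number of branches.
    while len(sentences) < n_sentences:
        sentences.append("")
    branch_w = [np.random.default_rng(i).standard_normal(DIM)
                for i in range(n_sentences)]
    branch_out = np.array([w @ toy_embed(s) for w, s in zip(branch_w, sentences)])
    logit = branch_out.mean()            # final head over the branch outputs
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid -> humor probability

p = parallel_branch_score(
    "Why did the chicken cross the road. To get to the other side.")
```

In a real system, each branch and the final head would be trained end-to-end, and the per-sentence embeddings would come from a pretrained encoder such as BERT.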
This list is automatically generated from the titles and abstracts of the papers in this site.