Multi-modal Sarcasm Detection and Humor Classification in Code-mixed
Conversations
- URL: http://arxiv.org/abs/2105.09984v1
- Date: Thu, 20 May 2021 18:33:55 GMT
- Title: Multi-modal Sarcasm Detection and Humor Classification in Code-mixed
Conversations
- Authors: Manjot Bedi, Shivani Kumar, Md Shad Akhtar, and Tanmoy Chakraborty
- Abstract summary: We develop a Hindi-English code-mixed dataset, MaSaC, for multi-modal sarcasm detection and humor classification in conversational dialog.
We propose MSH-COMICS, a novel attention-rich neural architecture for utterance classification.
- Score: 14.852199996061287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sarcasm detection and humor classification are inherently subtle problems,
primarily because they depend on contextual and non-verbal information.
Furthermore, existing studies on these two topics are usually limited for
non-English languages such as Hindi, due to the unavailability of high-quality
annotated datasets. In this work, we make two major contributions considering
the above limitations: (1) we develop a Hindi-English code-mixed dataset,
MaSaC, for the multi-modal sarcasm detection and humor classification in
conversational dialog, which to our knowledge is the first dataset of its kind;
(2) we propose MSH-COMICS, a novel attention-rich neural architecture for the
utterance classification. We learn an efficient utterance representation using
a hierarchical attention mechanism that attends to a small portion of the input
sentence at a time. Further, we incorporate a dialog-level contextual attention
mechanism to leverage the dialog history for the multi-modal classification. We
perform extensive experiments for both tasks by varying the multi-modal inputs
and the submodules of MSH-COMICS. We also conduct a comparative analysis
against existing approaches. We observe that MSH-COMICS attains superior
performance over the existing models by more than 1 F1-score point in sarcasm
detection and 10 F1-score points in humor classification. We diagnose our model
and perform a thorough analysis of the results to understand its strengths and
pitfalls.
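The two attention stages described in the abstract can be illustrated with a toy sketch. This is not the MSH-COMICS architecture: the fixed `query` vector, the window size, the plain dot-product scoring, and all function names are assumptions made for illustration, and nothing here is learned. It only shows the idea of attending within small token windows, pooling those window summaries into an utterance vector, and then letting the current utterance attend over the dialog history.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(vectors, query):
    """Dot-product attention: weighted average of `vectors` scored against `query`."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

def encode_utterance(tokens, query, window=3):
    """Hierarchical step: attend inside small token windows, then over the window summaries.

    Assumes `tokens` is a non-empty list of equal-length vectors.
    """
    windows = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    summaries = [attend(w, query)[0] for w in windows]
    return attend(summaries, query)[0]

def contextual_representation(dialog, query, window=3):
    """Dialog-level step: the last utterance attends over the earlier ones.

    Returns the current utterance vector concatenated with its history context,
    or the utterance vector alone when there is no history.
    """
    encoded = [encode_utterance(u, query, window) for u in dialog]
    current, history = encoded[-1], encoded[:-1]
    if not history:
        return current
    context, _ = attend(history, current)
    return current + context  # list concatenation

# Hypothetical 2-dimensional token vectors for a two-utterance dialog.
dialog = [
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]],
    [[0.2, 0.8], [0.9, 0.1]],
]
query = [1.0, 0.0]
rep = contextual_representation(dialog, query)
print(len(rep))
```

In a trained model the query and the scoring function would be learned parameters, and the concatenated vector would feed a classifier for the sarcasm or humor label.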
Related papers
- CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays [30.728539221991188]
Existing rhetorical datasets or corpora primarily focus on single coarse-grained categories or fine-grained categories.
We propose the Chinese Essay Rhetoric dataset (CERD), consisting of 4 commonly used coarse-grained categories.
CERD is a manually annotated and comprehensive Chinese rhetoric dataset with five interrelated sub-tasks.
arXiv Detail & Related papers (2024-09-29T12:47:25Z)
- VyAnG-Net: A Novel Multi-Modal Sarcasm Recognition Model by Uncovering Visual, Acoustic and Glossary Features [13.922091192207718]
Sarcasm recognition aims to identify hidden sarcastic, criticizing, and metaphorical information embedded in everyday dialogue.
We propose a novel approach that combines a lightweight depth attention module with a self-regulated ConvNet to concentrate on the most crucial features of visual data.
We have also conducted a cross-dataset analysis to test the adaptability of VyAnG-Net with unseen samples of another dataset MUStARD++.
arXiv Detail & Related papers (2024-08-05T15:36:52Z)
- MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
This dataset includes figures such as schematic diagrams, simulated images, macroscopic/microscopic photos, and experimental visualizations.
We developed benchmarks for scientific figure captioning and multiple-choice questions, evaluating six proprietary and over ten open-source models.
The dataset and benchmarks will be released to support further research.
arXiv Detail & Related papers (2024-07-06T00:40:53Z)
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues [40.80696210030204]
We propose MOSES, a deep neural network, which takes a multimodal (sarcastic) dialogue instance as an input and generates a natural language sentence as its explanation.
We leverage the generated explanation for various natural language understanding tasks in a conversational dialogue setup, such as sarcasm detection, humour identification, and emotion recognition.
Our evaluation shows that MOSES outperforms the state-of-the-art system for Sarcasm Explanation in Dialogue (SED) by an average of 2% on different evaluation metrics.
arXiv Detail & Related papers (2022-11-20T18:05:43Z)
- When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues [27.884015521888458]
We study the discourse structure of sarcastic conversations and propose a novel task, Sarcasm Explanation in Dialogue (SED).
SED aims to generate natural language explanations of satirical conversations.
We propose MAF, a multimodal context-aware attention and global information fusion module to capture multimodality and use it to benchmark WITS.
arXiv Detail & Related papers (2022-03-12T12:16:07Z)
- M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations [72.81164101048181]
We propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations, containing 6,191 utterances from 13 episodes of a very popular TV series, "Shrimaan Shrimati Phir Se".
Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities.
The empirical results on M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition.
arXiv Detail & Related papers (2021-08-03T02:54:09Z)
- Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media [0.0]
Inherent ambiguity in sarcastic expressions makes sarcasm detection very difficult.
We develop an interpretable deep learning model using multi-head self-attention and gated recurrent units.
We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets.
arXiv Detail & Related papers (2021-01-14T21:39:35Z)
- Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents.
With the increased generative capacity of corpus-based conversational agents comes the need to classify and filter out malevolent responses.
Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
arXiv Detail & Related papers (2020-08-21T22:43:27Z)
- Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning [56.18809963342249]
We present a probabilistic approach that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations.
We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF 2019 data and the pairwise judgment annotations required for our method.
arXiv Detail & Related papers (2020-08-03T13:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.