Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement
- URL: http://arxiv.org/abs/2210.03501v1
- Date: Fri, 7 Oct 2022 12:44:33 GMT
- Title: Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement
- Authors: Hui Liu, Wenya Wang, Haoliang Li
- Abstract summary: Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and implied intentions.
Most existing techniques only modeled the atomic-level inconsistencies between the text input and its accompanying image.
We propose a novel hierarchical framework for sarcasm detection that explores both the atomic-level congruity, based on a multi-head cross-attention mechanism, and the composition-level congruity, based on graph neural networks.
- Score: 31.97249246223621
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sarcasm is a linguistic phenomenon indicating a discrepancy between literal
meanings and implied intentions. Due to its sophisticated nature, it is usually
challenging to detect from the text alone. As a result, multi-modal
sarcasm detection has received more attention in both academia and industry.
However, most existing techniques only modeled the atomic-level inconsistencies
between the text input and its accompanying image, ignoring more complex
compositions for both modalities. Moreover, they neglected the rich information
contained in external knowledge, e.g., image captions. In this paper, we
propose a novel hierarchical framework for sarcasm detection that explores both
the atomic-level congruity, based on a multi-head cross-attention mechanism, and
the composition-level congruity, based on graph neural networks, where a post
with low congruity can be identified as sarcastic. In addition, we exploit the
effect of various knowledge resources for sarcasm detection. Evaluation results
on a public Twitter-based multi-modal sarcasm detection dataset demonstrate
the superiority of our proposed model.
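The atomic-level idea in the abstract (text tokens attending over image patches, with low congruity read as a sign of sarcasm) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the absence of learned projection matrices, the cosine-based congruity score, and the `threshold` value are all assumptions made for illustration, and the composition-level graph component is not covered.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def cross_attention(queries, keys, num_heads):
    """Multi-head cross attention without learned projections (assumption).

    Each query (text token) attends over all keys (image patches); the
    keys also serve as values. Heads are slices of the embedding.
    """
    dim = len(queries[0])
    assert dim % num_heads == 0, "embedding size must divide evenly into heads"
    head_dim = dim // num_heads
    outputs = []
    for q in queries:
        attended = []
        for h in range(num_heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            scores = [dot(q[lo:hi], k[lo:hi]) / math.sqrt(head_dim) for k in keys]
            weights = softmax(scores)
            # Weighted sum of the key slices for this head.
            attended.extend(
                sum(w * k[i] for w, k in zip(weights, keys)) for i in range(lo, hi)
            )
        outputs.append(attended)
    return outputs


def atomic_congruity(text_tokens, image_patches, num_heads=2):
    """Hypothetical congruity score in [-1, 1]: mean cosine similarity
    between each text token and its attended image representation."""
    attended = cross_attention(text_tokens, image_patches, num_heads)
    sims = [cosine(t, a) for t, a in zip(text_tokens, attended)]
    return sum(sims) / len(sims)


def looks_sarcastic(text_tokens, image_patches, threshold=0.0):
    """Flag a post as sarcastic when congruity falls below a hypothetical threshold."""
    return atomic_congruity(text_tokens, image_patches) < threshold


# Toy embeddings: the "matching" image agrees with the text, the
# "clashing" image points in the opposite direction.
text = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
matching_image = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
clashing_image = [[-1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0]]

print(looks_sarcastic(text, matching_image))
print(looks_sarcastic(text, clashing_image))
```

In a trained model the queries, keys, and values would come from learned projections of text and image encoder outputs, and the decision would come from a classifier rather than a fixed threshold; the sketch only shows how attention can pair each token with the image evidence it is compared against.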
Related papers
- A Survey of Multimodal Sarcasm Detection [32.659528422756416]
Sarcasm is a rhetorical device that is used to convey the opposite of the literal meaning of an utterance.
We present the first comprehensive survey on multimodal sarcasm detection to date.
arXiv Detail & Related papers (2024-10-24T16:17:47Z)
- Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection [12.744170917349287]
This study presents a novel framework for multimodal sarcasm detection that can process input triplets.
The proposed model achieves the best accuracy of 92.89% and 64.48%, respectively, on the Twitter multimodal sarcasm and MultiBully datasets.
arXiv Detail & Related papers (2024-08-05T16:07:31Z)
- Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [67.09698638709065]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE.
In particular, we first propose a lexicon-guided utterance sentiment inference module, where an utterance sentiment refinement strategy is devised.
We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
arXiv Detail & Related papers (2024-02-06T03:14:46Z)
- Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation [53.97962603641629]
We propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM.
TEAM extracts the object-level semantic meta-data instead of the traditional global visual features from the input image.
TEAM introduces a multi-source semantic graph that comprehensively characterizes the multi-source semantic relations.
arXiv Detail & Related papers (2023-06-29T03:26:10Z)
- Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm.
Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
- How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation [62.89586083449108]
We study a new problem of cross-modal sarcasm generation (CMSG), i.e., generating a sarcastic description for a given image.
CMSG is challenging as models need to satisfy the characteristics of sarcasm, as well as the correlation between different modalities.
We propose an Extraction-Generation-Ranking based Modular method (EGRM) for cross-modal sarcasm generation.
arXiv Detail & Related papers (2022-11-20T14:38:24Z)
- Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism [7.194040730138362]
We construct a Contrastive-Attention-based Sarcasm Detection (ConAttSD) model, which uses an inter-modality contrastive attention mechanism to extract contrastive features for an utterance.
Our experiments on MUStARD, a benchmark multi-modal sarcasm dataset, demonstrate the effectiveness of the proposed ConAttSD model.
arXiv Detail & Related papers (2021-09-30T14:17:51Z)
- Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media [0.0]
Inherent ambiguity in sarcastic expressions makes sarcasm detection very difficult.
We develop an interpretable deep learning model using multi-head self-attention and gated recurrent units.
We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets.
arXiv Detail & Related papers (2021-01-14T21:39:35Z)
- Sarcasm Detection using Context Separators in Online Discourse [3.655021726150369]
Sarcasm is an intricate form of speech, where meaning is conveyed implicitly.
In this work, we use RoBERTa_large to detect sarcasm in two datasets.
We also assert the importance of context in improving the performance of contextual word embedding models.
arXiv Detail & Related papers (2020-06-01T10:52:35Z)
- $R^3$: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [51.70688120849654]
We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence.
Our method employs a retrieve-and-edit framework to instantiate two major characteristics of sarcasm.
arXiv Detail & Related papers (2020-04-28T02:30:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.