Related papers: Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement

Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement

URL: http://arxiv.org/abs/2210.03501v1
Date: Fri, 7 Oct 2022 12:44:33 GMT
Title: Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement
Authors: Hui Liu, Wenya Wang, Haoliang Li
Abstract summary: Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and implied intentions. Most existing techniques only modeled the atomic-level inconsistencies between the text input and its accompanying image. We propose a novel hierarchical framework for sarcasm detection by exploring both the atomic-level congruity based on multi-head cross attention mechanism and the composition-level congruity based on graph neural networks.
Score: 31.97249246223621
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and implied intentions. Due to its sophisticated nature, it is usually challenging to be detected from the text itself. As a result, multi-modal sarcasm detection has received more attention in both academia and industries. However, most existing techniques only modeled the atomic-level inconsistencies between the text input and its accompanying image, ignoring more complex compositions for both modalities. Moreover, they neglected the rich information contained in external knowledge, e.g., image captions. In this paper, we propose a novel hierarchical framework for sarcasm detection by exploring both the atomic-level congruity based on multi-head cross attention mechanism and the composition-level congruity based on graph neural networks, where a post with low congruity can be identified as sarcasm. In addition, we exploit the effect of various knowledge resources for sarcasm detection. Evaluation results on a public multi-modal sarcasm detection dataset based on Twitter demonstrate the superiority of our proposed model.

Related papers

RCLMuFN: Relational Context Learning and Multiplex Fusion Network for Multimodal Sarcasm Detection [1.023096557577223]
We propose a relational context learning and multiplex fusion network (RCLMuFN) for multimodal sarcasm detection. Firstly, we employ four feature extractors to comprehensively extract features from raw text and images. Secondly, we utilize the relational context learning module to learn the contextual information of text and images.
arXiv Detail & Related papers (2024-12-17T15:29:31Z)
Detecting Emotional Incongruity of Sarcasm by Commonsense Reasoning [32.5690489394632]
This paper focuses on sarcasm detection, which aims to identify whether given statements convey criticism, mockery, or other negative sentiment opposite to the literal meaning. Existing methods lack commonsense inferential ability when they face complex real-world scenarios, leading to unsatisfactory performance. We propose a novel framework for sarcasm detection, which conducts incongruity reasoning based on commonsense augmentation, called EICR.
arXiv Detail & Related papers (2024-12-17T11:25:55Z)
A Survey of Multimodal Sarcasm Detection [32.659528422756416]
Sarcasm is a rhetorical device that is used to convey the opposite of the literal meaning of an utterance. We present the first comprehensive survey on multimodal sarcasm detection to date.
arXiv Detail & Related papers (2024-10-24T16:17:47Z)
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection [12.744170917349287]
This study presents a novel framework for multimodal sarcasm detection that can process input triplets. The proposed model achieves the best accuracy of 92.89% and 64.48%, respectively, on the Twitter multimodal sarcasm and MultiBully datasets.
arXiv Detail & Related papers (2024-08-05T16:07:31Z)
Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [67.09698638709065]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE. In particular, we first propose a lexicon-guided utterance sentiment inference module, where a utterance sentiment refinement strategy is devised. We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
arXiv Detail & Related papers (2024-02-06T03:14:46Z)
Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation [53.97962603641629]
We propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM. TEAM extracts the object-level semantic meta-data instead of the traditional global visual features from the input image. TEAM introduces a multi-source semantic graph that comprehensively characterize the multi-source semantic relations.
arXiv Detail & Related papers (2023-06-29T03:26:10Z)
Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm. Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation [62.89586083449108]
We study a new problem of cross-modal sarcasm generation (CMSG), i.e., generating a sarcastic description for a given image. CMSG is challenging as models need to satisfy the characteristics of sarcasm, as well as the correlation between different modalities. We propose an Extraction-Generation-Ranking based Modular method (EGRM) for cross-model sarcasm generation.
arXiv Detail & Related papers (2022-11-20T14:38:24Z)
Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism [7.194040730138362]
We construct a Contras-tive-Attention-based Sarcasm Detection (ConAttSD) model, which uses an inter-modality contrastive attention mechanism to extract contrastive features for an utterance. Our experiments on MUStARD, a benchmark multi-modal sarcasm dataset, demonstrate the effectiveness of the proposed ConAttSD model.
arXiv Detail & Related papers (2021-09-30T14:17:51Z)
Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media [0.0]
Inherent ambiguity in sarcastic expressions, make sarcasm detection very difficult. We develop an interpretable deep learning model using multi-head self-attention and gated recurrent units. We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets.
arXiv Detail & Related papers (2021-01-14T21:39:35Z)
Sarcasm Detection using Context Separators in Online Discourse [3.655021726150369]
Sarcasm is an intricate form of speech, where meaning is conveyed implicitly. In this work, we use RoBERTa_large to detect sarcasm in two datasets. We also assert the importance of context in improving the performance of contextual word embedding models.
arXiv Detail & Related papers (2020-06-01T10:52:35Z)
$R^3$: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [51.70688120849654]
We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence. Our method employs a retrieve-and-edit framework to instantiate two major characteristics of sarcasm.
arXiv Detail & Related papers (2020-04-28T02:30:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.