Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism
- URL: http://arxiv.org/abs/2109.15153v1
- Date: Thu, 30 Sep 2021 14:17:51 GMT
- Title: Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism
- Authors: Xiaoqiang Zhang, Ying Chen, Guangyuan Li
- Abstract summary: We construct a Contrastive-Attention-based Sarcasm Detection (ConAttSD) model, which uses an inter-modality contrastive attention mechanism to extract contrastive features for an utterance.
Our experiments on MUStARD, a benchmark multi-modal sarcasm dataset, demonstrate the effectiveness of the proposed ConAttSD model.
- Score: 7.194040730138362
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past decade, sarcasm detection has been studied intensively in
textual scenarios. With the popularization of video communication, analysis
in multi-modal scenarios has received much attention in recent years.
Consequently, multi-modal sarcasm detection, which aims at detecting sarcasm in
video conversations, has become an increasingly active topic in both the natural language
processing community and the multi-modal analysis community. In this paper,
considering that sarcasm is often conveyed through incongruity between
modalities (e.g., text expressing a compliment while the acoustic tone indicates a
grumble), we construct a Contrastive-Attention-based Sarcasm Detection
(ConAttSD) model, which uses an inter-modality contrastive attention mechanism
to extract several contrastive features for an utterance. A contrastive feature
represents the incongruity of information between two modalities. Our
experiments on MUStARD, a benchmark multi-modal sarcasm dataset, demonstrate
the effectiveness of the proposed ConAttSD model.
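The abstract does not spell out how the contrastive attention is computed, so the following is only a minimal sketch of one plausible reading, not the ConAttSD implementation. It assumes scaled dot-product scores between two modalities and takes a softmax over the negated scores, so that weight concentrates on the least similar positions of the other modality. The function name `contrastive_attention` and the toy `text` and `audio` feature matrices are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def contrastive_attention(query, context):
    """Attend from one modality (query) to another (context).

    query:   (n_q, d) features of modality A, e.g. text tokens
    context: (n_k, d) features of modality B, e.g. audio frames

    Returns the conventional weights, the contrastive weights, and the
    contrastive feature (context averaged under the contrastive weights).
    ASSUMPTION: negating the scores is one simple way to favour
    incongruent positions; the paper may define this differently.
    """
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)      # (n_q, n_k) similarities
    conventional = softmax(scores, axis=-1)      # favours similar positions
    contrastive = softmax(-scores, axis=-1)      # favours dissimilar positions
    feature = contrastive @ context              # (n_q, d)
    return conventional, contrastive, feature


# Toy example: two text vectors attending over three audio vectors.
text = np.array([[1.0, 0.0], [0.0, 1.0]])
audio = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
conv, cont, feat = contrastive_attention(text, audio)
```

In this toy case the first text vector puts its largest conventional weight on the matching audio vector and its largest contrastive weight on the opposing one, which is the kind of inter-modality incongruity the abstract describes.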
Related papers
- A Survey of Multimodal Sarcasm Detection [32.659528422756416]
Sarcasm is a rhetorical device that is used to convey the opposite of the literal meaning of an utterance.
We present the first comprehensive survey on multimodal sarcasm detection to date.
arXiv Detail & Related papers (2024-10-24T16:17:47Z)
- CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models [14.453131020178564]
This paper proposes a versatile MSTI framework with a coarse-to-fine paradigm, by augmenting sarcasm explainability with reasoning and pre-training knowledge.
Inspired by the powerful capacity of Large Multimodal Models (LMMs) on multimodal reasoning, we first engage LMMs to generate competing rationales for coarser-grained pre-training of a small language model on multimodal sarcasm detection.
We then propose fine-tuning the model for finer-grained sarcasm target identification. Our framework is thus empowered to adeptly unveil the intricate targets within multimodal sarcasm and mitigate the negative impact of potential noise inherent in LMMs.
arXiv Detail & Related papers (2024-05-01T08:44:44Z)
- Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [67.09698638709065]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE.
In particular, we first propose a lexicon-guided utterance sentiment inference module, where an utterance sentiment refinement strategy is devised.
We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
arXiv Detail & Related papers (2024-02-06T03:14:46Z)
- MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts [92.76662894585809]
We introduce an approach to enhance multimodal models, which we call Multimodal Mixtures of Experts (MMoE).
MMoE can be applied to various types of models to improve their performance.
arXiv Detail & Related papers (2023-11-16T05:31:21Z)
- MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System [57.650338588086186]
We introduce MMSD2.0, a correction dataset that fixes the shortcomings of MMSD.
We present a novel framework called multi-view CLIP that is capable of leveraging multi-grained cues from multiple perspectives.
arXiv Detail & Related papers (2023-07-14T03:22:51Z)
- Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm.
Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
- Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement [31.97249246223621]
Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and implied intentions.
Most existing techniques only modeled the atomic-level inconsistencies between the text input and its accompanying image.
We propose a novel hierarchical framework for sarcasm detection by exploring both the atomic-level congruity based on multi-head cross attention mechanism and the composition-level congruity based on graph neural networks.
arXiv Detail & Related papers (2022-10-07T12:44:33Z)
- Multimodal Learning using Optimal Transport for Sarcasm and Humor Detection [76.62550719834722]
We deal with multimodal sarcasm and humor detection from conversational videos and image-text pairs.
We propose a novel multimodal learning system, MuLOT, which utilizes self-attention to exploit intra-modal correspondence.
We test our approach for multimodal sarcasm and humor detection on three benchmark datasets.
arXiv Detail & Related papers (2021-10-21T07:51:56Z)
- Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media [0.0]
The inherent ambiguity of sarcastic expressions makes sarcasm detection very difficult.
We develop an interpretable deep learning model using multi-head self-attention and gated recurrent units.
We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets.
arXiv Detail & Related papers (2021-01-14T21:39:35Z)
- Bi-ISCA: Bidirectional Inter-Sentence Contextual Attention Mechanism for Detecting Sarcasm in User Generated Noisy Short Text [8.36639545285691]
This paper proposes a new state-of-the-art deep learning architecture that uses a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA).
Bi-ISCA captures inter-sentence dependencies for detecting sarcasm in the user-generated short text using only the conversational context.
The proposed deep learning model demonstrates the capability to capture explicit, implicit, and contextual incongruous words & phrases responsible for invoking sarcasm.
arXiv Detail & Related papers (2020-11-23T15:24:27Z)
- $R^3$: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [51.70688120849654]
We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence.
Our method employs a retrieve-and-edit framework to instantiate two major characteristics of sarcasm.
arXiv Detail & Related papers (2020-04-28T02:30:09Z)