Emotion-Aware Multimodal Fusion for Meme Emotion Detection
- URL: http://arxiv.org/abs/2403.10279v1
- Date: Fri, 15 Mar 2024 13:20:38 GMT
- Title: Emotion-Aware Multimodal Fusion for Meme Emotion Detection
- Authors: Shivam Sharma, Ramaneswaran S, Md. Shad Akhtar, Tanmoy Chakraborty
- Abstract summary: MOOD (Meme emOtiOns Dataset) embodies six basic emotions.
ALFRED (emotion-Aware muLtimodal Fusion foR Emotion Detection) explicitly models emotion-enriched visual cues.
- Score: 37.86468816979694
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The ever-evolving social media discourse has witnessed an overwhelming use of memes to express opinions or dissent. Besides being misused to spread malicious content, they are mined by corporations and political parties to glean the public's opinion. Memes therefore offer affect-enriched insights into the societal psyche. However, current approaches have yet to model the affective dimensions expressed in memes effectively. They rely extensively on large multimodal datasets for pre-training and do not generalize well due to constrained visual-linguistic grounding. In this paper, we introduce MOOD (Meme emOtiOns Dataset), which embodies six basic emotions. We then present ALFRED (emotion-Aware muLtimodal Fusion foR Emotion Detection), a novel multimodal neural framework that (i) explicitly models emotion-enriched visual cues, and (ii) employs an efficient cross-modal fusion via a gating mechanism. Our investigation establishes ALFRED's superiority over existing baselines, with a 4.94% F1 improvement. Additionally, ALFRED competes strongly with previous best approaches on the challenging Memotion task. We then demonstrate ALFRED's domain-agnostic generalizability by showing that it outperforms other baselines on two recently released datasets, HarMeme and Dank Memes. Further, we analyze ALFRED's interpretability using attention maps. Finally, we highlight the inherent challenges posed by the complex interplay of disparate modality-specific cues in meme analysis.
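The abstract's second design point, cross-modal fusion via a gating mechanism, follows a well-known pattern: a learned sigmoid gate decides, per feature dimension, how much each modality contributes to the fused representation. The sketch below illustrates that pattern under assumed feature dimensions; it is not ALFRED's published architecture.

```python
# A minimal sketch of cross-modal fusion via a gating mechanism
# (a gated multimodal unit), NOT ALFRED's published architecture.
# Dimensions and the 6-way emotion head are illustrative assumptions.
import torch
import torch.nn as nn


class GatedMultimodalFusion(nn.Module):
    """Fuses a visual and a textual embedding with a learned sigmoid gate."""

    def __init__(self, visual_dim: int, text_dim: int, hidden_dim: int):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # The gate sees both modalities and decides, per feature
        # dimension, how much of each projection to let through.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        v = torch.tanh(self.visual_proj(visual))  # (batch, hidden_dim)
        t = torch.tanh(self.text_proj(text))      # (batch, hidden_dim)
        z = torch.sigmoid(self.gate(torch.cat([v, t], dim=-1)))
        return z * v + (1.0 - z) * t              # gated convex combination


# Usage: fuse assumed 512-d image features with assumed 768-d text
# features, then classify into the six basic emotions of MOOD.
fusion = GatedMultimodalFusion(visual_dim=512, text_dim=768, hidden_dim=256)
classifier = nn.Linear(256, 6)
image_feats, text_feats = torch.randn(4, 512), torch.randn(4, 768)
logits = classifier(fusion(image_feats, text_feats))  # shape: (4, 6)
```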
Related papers
- Enhancing Meme Emotion Understanding with Multi-Level Modality Enhancement and Dual-Stage Modal Fusion [18.557896531533043]
We propose MemoDetector, a novel framework for advancing Meme Emotion Understanding (MEU). Our method consistently outperforms state-of-the-art baselines. Specifically, MemoDetector improves F1 scores by 4.3% on MET-MEME and 3.4% on MOOD.
arXiv Detail & Related papers (2025-11-14T09:59:08Z) - MEGC2025: Micro-Expression Grand Challenge on Spot Then Recognize and Visual Question Answering [55.30507585676142]
Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion. In recent years, substantial advancements have been made in the areas of ME recognition, spotting, and generation. The ME grand challenge (MEGC) 2025 introduces two tasks that reflect these evolving research directions.
arXiv Detail & Related papers (2025-06-18T09:29:51Z) - Meme Similarity and Emotion Detection using Multimodal Analysis [0.0]
This study employs a multimodal methodological approach, analyzing both the visual and textual elements of memes.
We extract low-level visual features and high-level semantic features to identify similar meme pairs.
Results indicate that anger and joy are the dominant emotions in memes, with motivational memes eliciting stronger emotional responses.
arXiv Detail & Related papers (2025-03-21T19:07:16Z) - Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes
Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation of code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach is proposed for the visual and textual explanation of a meme (a generic CLIP sketch appears at the end of this list).
arXiv Detail & Related papers (2024-01-18T11:24:30Z) - DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally
Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes.
The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z) - DISARM: Detecting the Victims Targeted by Harmful Memes [49.12165815990115]
DISARM is a framework that uses named entity recognition and person identification to detect harmful memes.
We show that DISARM significantly outperforms ten unimodal and multimodal systems.
It reduces the error rate for harmful target identification by up to 9 absolute points over several strong multimodal rivals.
arXiv Detail & Related papers (2022-05-11T19:14:26Z) - Detecting and Understanding Harmful Memes: A Survey [48.135415967633676]
We offer a comprehensive survey with a focus on harmful memes.
One interesting finding is that many types of harmful memes remain largely unstudied, e.g., those featuring self-harm and extremism.
Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual.
arXiv Detail & Related papers (2022-05-09T13:43:27Z) - Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the
Lens of Facebook's Challenge [10.775419935941008]
We assess the efficacy of current state-of-the-art multimodal machine learning models for hateful meme detection.
We use two benchmark datasets comprising 12,140 and 10,567 images from 4chan's "Politically Incorrect" board (/pol/) and Facebook's Hateful Memes Challenge dataset, respectively.
We conduct three experiments to determine the importance of multimodality for classification performance and the influence of fringe Web communities on mainstream social platforms (and vice versa).
arXiv Detail & Related papers (2022-02-17T07:52:22Z) - MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their
Targets [28.877314859737197]
We aim to solve two novel tasks: detecting harmful memes and identifying the social entities they target.
We propose MOMENTA, a novel multimodal (text + image) deep neural model, which uses global and local perspectives to detect harmful memes.
arXiv Detail & Related papers (2021-09-11T04:29:32Z) - Emotion Recognition from Multiple Modalities: Fundamentals and
Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER).
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z) - SemEval-2020 Task 8: Memotion Analysis -- The Visuo-Lingual Metaphor! [20.55903557920223]
The objective of this proposal is to draw the research community's attention to the automatic processing of Internet memes.
The Memotion analysis task released approximately 10K annotated memes with human-annotated labels: sentiment (positive, negative, neutral), type of emotion (sarcastic, funny, offensive, motivational), and the corresponding intensity.
The challenge consisted of three subtasks: sentiment (positive, negative, and neutral) analysis of memes, overall emotion (humour, sarcasm, offensive, and motivational) classification of memes, and classifying the intensity of meme emotion.
arXiv Detail & Related papers (2020-08-09T18:17:33Z)
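For the CLIP-based MultiBully-Ex entry above, the following hedged sketch shows the generic zero-shot CLIP recipe: embed the meme image and a set of candidate emotion prompts, then rank prompts by image-text similarity. It is not the authors' published method; the checkpoint name, file path, and prompt wording are assumptions for illustration.

```python
# A generic illustration of scoring a meme image against candidate
# emotion prompts with an off-the-shelf CLIP model. This is NOT the
# MultiBully-Ex authors' method; the checkpoint name, file path, and
# prompt wording are assumptions for the sketch.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("meme.png")  # hypothetical local meme image
emotions = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
prompts = [f"a meme expressing {e}" for e in emotions]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
# logits_per_image holds scaled image-text similarities; softmax turns
# them into a distribution over the candidate emotion prompts.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(emotions, probs[0].tolist())))
```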