Related papers: Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework

URL: http://arxiv.org/abs/2602.22391v1
Date: Wed, 25 Feb 2026 20:40:25 GMT
Title: Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework
Authors: Rakib Ullah, Mominul Islam, Md Sanjid Hossain, Md Ismail Hossain
Abstract summary: We introduce Bn-HIB (Bangla Hate Inflammatory Benign), a novel dataset containing 3,247 manually annotated Bengali memes. Bn-HIB is the first dataset to distinguish inflammatory content from direct hate speech in Bengali memes. We propose the MCFM (Multi-Modal Co-Attention Fusion Model), a simple yet effective architecture that mutually analyzes both the visual and textual elements of a meme.
Score: 0.1499944454332829
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Internet memes have become a dominant form of expression on social media, including within the Bengali-speaking community. While often humorous, memes can also be exploited to spread offensive, harmful, and inflammatory content targeting individuals and groups. Detecting this type of content is exceptionally challenging due to its satirical, subtle, and culturally specific nature. This problem is magnified for low-resource languages like Bengali, as existing research predominantly focuses on high-resource languages. To address this critical research gap, we introduce Bn-HIB (Bangla Hate Inflammatory Benign), a novel dataset containing 3,247 manually annotated Bengali memes categorized as Benign, Hate, or Inflammatory. Significantly, Bn-HIB is the first dataset to distinguish inflammatory content from direct hate speech in Bengali memes. Furthermore, we propose the MCFM (Multi-Modal Co-Attention Fusion Model), a simple yet effective architecture that mutually analyzes both the visual and textual elements of a meme. MCFM employs a co-attention mechanism to identify and fuse the most critical features from each modality, leading to a more accurate classification. Our experiments show that MCFM significantly outperforms several state-of-the-art models on the Bn-HIB dataset, demonstrating its effectiveness in this nuanced task. Warning: This work contains material that may be disturbing to some audience members. Viewer discretion is advised.
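
The abstract does not spell out how MCFM's co-attention is built, so the following is only a generic, parameter-free sketch of co-attention fusion, not the authors' architecture. It assumes text and image features already share one dimension, and all names (`co_attention_fuse`, `text_feats`, `image_feats`) are hypothetical. Each modality attends over the other through a shared affinity matrix, and the two attended summaries are mean-pooled and concatenated.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mean_pool(rows):
    """Average a list of vectors into a single vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def co_attention_fuse(text_feats, image_feats):
    """Fuse text (n x d) and image (m x d) features with co-attention.

    Assumption: no learned projections; a real model would add trainable
    weights and a classifier head on top of the fused vector.
    """
    dim = len(text_feats[0])
    scale = math.sqrt(dim)
    # Scaled affinity between every text token and every image region (n x m).
    affinity = [[v / scale for v in row]
                for row in matmul(text_feats, transpose(image_feats))]
    # Each text token attends over image regions, and vice versa.
    text_to_image = [softmax(row) for row in affinity]
    image_to_text = [softmax(row) for row in transpose(affinity)]
    text_context = matmul(text_to_image, image_feats)   # n x d
    image_context = matmul(image_to_text, text_feats)   # m x d
    # Pool each attended sequence and concatenate into one 2d-long vector.
    return mean_pool(text_context) + mean_pool(image_context)
```

Under these assumptions, the fused vector would feed a three-way classifier over the Benign, Hate, and Inflammatory labels.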

Related papers

MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models [50.2355423914562]
We introduce MemeReaCon, a novel benchmark designed to evaluate how Large Vision Language Models (LVLMs) understand memes in their original context. We collected memes from five different Reddit communities, keeping each meme's image, the post text, and user comments together. Our tests with leading LVLMs show a clear weakness: models either fail to interpret critical information in the contexts, or overly focus on visual details while overlooking communicative purpose.
arXiv Detail & Related papers (2025-05-23T03:27:23Z)
MemeBLIP2: A novel lightweight multimodal system to detect harmful memes [10.174106475035689]
We introduce MemeBLIP2, a lightweight multimodal system that detects harmful memes by combining image and text features effectively. We build on previous studies by adding modules that align image and text representations into a shared space and fuse them for better classification. The results show that MemeBLIP2 can capture subtle cues in both modalities, even in cases with ironic or culturally specific content.
arXiv Detail & Related papers (2025-04-29T23:41:06Z)
MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing [53.30190591805432]
We introduce MemeMQA, a multimodal question-answering framework to solicit accurate responses to structured questions. We also propose ARSENAL, a novel two-stage multimodal framework to address MemeMQA.
arXiv Detail & Related papers (2024-05-18T07:44:41Z)
Deciphering Hate: Identifying Hateful Memes and Their Targets [4.574830585715128]
We introduce a novel dataset for detecting hateful memes in Bengali, BHM (Bengali Hateful Memes). The dataset consists of 7,148 memes with Bengali as well as code-mixed captions, tailored for two tasks: (i) detecting hateful memes, and (ii) detecting the social entities they target. To solve these tasks, we propose DORA, a multimodal deep neural network that systematically extracts the significant modality features from the memes.
arXiv Detail & Related papers (2024-03-16T06:39:41Z)
Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation from code-mixed cyberbullying memes. A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z)
Explainable Multimodal Sentiment Analysis on Bengali Memes [0.0]
Understanding and interpreting the sentiment underlying memes has become crucial in the age of information. This study employed a multimodal approach using ResNet50 and BanglishBERT and achieved a satisfactory result of 0.71 weighted F1-score.
arXiv Detail & Related papers (2023-12-20T17:15:10Z)
BanglaAbuseMeme: A Dataset for Bengali Abusive Meme Classification [11.04522597948877]
A simple yet effective way of abusing individuals or communities is by creating memes. Such harmful elements are in rampant use and are a threat to online safety. It is necessary to develop efficient models to detect and flag abusive memes.
arXiv Detail & Related papers (2023-10-18T07:10:47Z)
DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes. The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z)
DISARM: Detecting the Victims Targeted by Harmful Memes [49.12165815990115]
DISARM is a framework that uses named entity recognition and person identification to detect harmful memes. We show that DISARM significantly outperforms ten unimodal and multimodal systems. It can reduce the relative error rate for harmful target identification by up to 9 points absolute over several strong multimodal rivals.
arXiv Detail & Related papers (2022-05-11T19:14:26Z)
Detecting and Understanding Harmful Memes: A Survey [48.135415967633676]
We offer a comprehensive survey with a focus on harmful memes. One interesting finding is that many types of harmful memes are not really studied, e.g., those featuring self-harm and extremism. Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual.
arXiv Detail & Related papers (2022-05-09T13:43:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.

This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.