HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes
- URL: http://arxiv.org/abs/2408.05794v1
- Date: Sun, 11 Aug 2024 14:56:06 GMT
- Title: HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes
- Authors: Xuanyu Su, Yansong Li, Diana Inkpen, Nathalie Japkowicz
- Abstract summary: HateSieve is a framework designed to enhance the detection and segmentation of hateful elements in memes.
HateSieve features a novel Contrastive Meme Generator that creates semantically paired memes.
Empirical experiments on the Hateful Meme Dataset show that HateSieve not only surpasses existing LMMs in performance with fewer trainable parameters but also offers a robust mechanism for precisely identifying and isolating hateful content.
- Score: 8.97062933976566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Amidst the rise of Large Multimodal Models (LMMs) and their widespread application in generating and interpreting complex content, the risk of propagating biased and harmful memes remains significant. Current safety measures often fail to detect subtly integrated hateful content within "Confounder Memes". To address this, we introduce HateSieve, a new framework designed to enhance the detection and segmentation of hateful elements in memes. HateSieve features a novel Contrastive Meme Generator that creates semantically paired memes, a customized triplet dataset for contrastive learning, and an Image-Text Alignment module that produces context-aware embeddings for accurate meme segmentation. Empirical experiments on the Hateful Meme Dataset show that HateSieve not only surpasses existing LMMs in performance with fewer trainable parameters but also offers a robust mechanism for precisely identifying and isolating hateful content. Caution: Contains academic discussions of hate speech; viewer discretion advised.
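The abstract describes the approach only at a high level. As a rough, hedged illustration of the contrastive-learning ingredient (a standard triplet margin loss over paired meme embeddings, not the authors' actual implementation), the sketch below assumes embeddings already produced by some image-text alignment encoder; the function name, margin, and embedding size are placeholders.

```python
import torch
import torch.nn.functional as F

def triplet_contrastive_loss(anchor: torch.Tensor,
                             positive: torch.Tensor,
                             negative: torch.Tensor,
                             margin: float = 0.2) -> torch.Tensor:
    """Generic triplet margin loss on L2-normalized meme embeddings.

    anchor, positive, negative: (batch, dim) tensors, e.g. fused image-text
    embeddings of an original meme, its semantically paired (benign) version,
    and a hateful counterpart. Shapes and margin are illustrative assumptions.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negative = F.normalize(negative, dim=-1)

    # Pull the paired meme toward the anchor, push the contrasting meme
    # at least `margin` farther away in embedding space.
    d_pos = (anchor - positive).pow(2).sum(dim=-1)
    d_neg = (anchor - negative).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

if __name__ == "__main__":
    # Random stand-in embeddings; a real pipeline would obtain these from
    # an alignment module run over generated meme triplets.
    a, p, n = (torch.randn(8, 512) for _ in range(3))
    print(float(triplet_contrastive_loss(a, p, n)))
```

Minimizing such a loss is one common way to make semantically paired memes cluster with their anchors while contrasting (hateful) variants are pushed apart, which is the general behavior a triplet dataset of this kind is meant to encourage.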
Related papers
- MemeSense: An Adaptive In-Context Framework for Social Commonsense Driven Meme Moderation [3.763944391065958]
We introduce MemeSense, an adaptive in-context learning framework that fuses social commonsense reasoning with visual and semantically related reference examples.
MemeSense effectively balances lexical, visual, and ethical considerations, enabling precise yet context-aware meme intervention.
arXiv Detail & Related papers (2025-02-16T19:46:24Z) - Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions [4.649093665157263]
In this paper, we introduce IntMeme, a novel framework that leverages Large Multimodal Models (LMMs) for hateful meme classification with explainable decisions.
IntMeme addresses the dual challenges of improving both accuracy and explainability in meme moderation.
Our approach addresses the opacity and misclassification issues associated with PT-VLMs, optimizing the use of LMMs for hateful meme detection.
arXiv Detail & Related papers (2025-02-16T10:45:40Z) - Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search [64.15205542003056]
We introduce the Attention-Guided Alignment (AGA) framework, featuring two innovative components: Attention-Guided Mask (AGM) Modeling and a Text Enrichment Module (TEM).
AGA achieves new state-of-the-art results with Rank-1 accuracy reaching 78.36%, 67.31%, and 67.4% on CUHK-PEDES, ICFG-PEDES, and RSTP, respectively.
arXiv Detail & Related papers (2024-12-19T17:51:49Z) - XMeCap: Meme Caption Generation with Sub-Image Adaptability [53.2509590113364]
Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines.
We introduce the XMeCap framework, which adopts supervised fine-tuning and reinforcement learning.
XMeCap achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71% and 4.82%, respectively.
arXiv Detail & Related papers (2024-07-24T10:51:46Z) - MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention [43.849634264271565]
We present MemeGuard, a comprehensive framework leveraging Large Language Models (LLMs) and Visual Language Models (VLMs) for meme intervention.
MemeGuard harnesses a specially fine-tuned VLM, VLMeme, for meme interpretation, and a multimodal knowledge selection and ranking mechanism.
We leverage ICMM to test MemeGuard, demonstrating its proficiency in generating relevant and effective responses to toxic memes.
arXiv Detail & Related papers (2024-06-08T04:09:20Z) - MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing [53.30190591805432]
We introduce MemeMQA, a multimodal question-answering framework to solicit accurate responses to structured questions.
We also propose ARSENAL, a novel two-stage multimodal framework to address MemeMQA.
arXiv Detail & Related papers (2024-05-18T07:44:41Z) - Text or Image? What is More Important in Cross-Domain Generalization Capabilities of Hate Meme Detection Models? [2.4899077941924967]
This paper delves into the formidable challenge of cross-domain generalization in multimodal hate meme detection.
We provide evidence supporting the hypothesis that only the textual component of hateful memes enables existing multimodal classifiers to generalize across different domains.
Our evaluation on a newly created confounder dataset reveals higher performance on text confounders as compared to image confounders, with an average ΔF1 of 0.18.
arXiv Detail & Related papers (2024-02-07T15:44:55Z) - Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation from code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme (a minimal CLIP scoring sketch follows this list).
arXiv Detail & Related papers (2024-01-18T11:24:30Z) - SemiMemes: A Semi-supervised Learning Approach for Multimodal Memes Analysis [0.0]
SemiMemes is a novel training method that combines an auto-encoder with a classification task to make use of abundant unlabeled data.
This research proposes a multimodal semi-supervised learning approach that outperforms other state-of-the-art multimodal semi-supervised and supervised learning models.
arXiv Detail & Related papers (2023-03-31T11:22:03Z) - Detecting and Understanding Harmful Memes: A Survey [48.135415967633676]
We offer a comprehensive survey with a focus on harmful memes.
One interesting finding is that many types of harmful memes are not really studied, e.g., those featuring self-harm and extremism.
Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual.
arXiv Detail & Related papers (2022-05-09T13:43:27Z)
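The MultiBully-Ex entry above mentions a CLIP-based approach to visual and textual meme explanation. As a minimal, hedged sketch of that general idea (not that paper's actual method), the snippet below scores a meme image against candidate textual descriptions with an off-the-shelf CLIP checkpoint via Hugging Face transformers; the file name and candidate texts are made up for illustration.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Publicly available CLIP checkpoint; any compatible checkpoint works.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("meme.png")  # hypothetical local meme image
candidates = [
    "the meme mocks a protected group",
    "the meme is a harmless joke about everyday life",
]

inputs = processor(text=candidates, images=image,
                   return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image has shape (num_images, num_texts); softmax gives the
# relative affinity of the image to each candidate textual description.
scores = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(candidates, scores[0].tolist())))
```

Such image-text similarity scores are one simple building block for selecting or grounding textual explanations; the actual MultiBully-Ex pipeline is described in the paper cited above.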