Related papers: Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning

URL: http://arxiv.org/abs/2311.08110v3
Date: Wed, 30 Oct 2024 12:34:16 GMT
Title: Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning
Authors: Jingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin,
Abstract summary: We propose constructing a hatefulness-aware embedding space through retrieval-guided contrastive training. Our approach achieves state-of-the-art performance on the HatefulMemes dataset with an AUROC of 87.0, outperforming much larger fine-tuned large multimodal models.
Score: 13.690436954062015
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hateful memes have emerged as a significant concern on the Internet. Detecting hateful memes requires the system to jointly understand the visual and textual modalities. Our investigation reveals that the embedding space of existing CLIP-based systems lacks sensitivity to subtle differences in memes that are vital for correct hatefulness classification. We propose constructing a hatefulness-aware embedding space through retrieval-guided contrastive training. Our approach achieves state-of-the-art performance on the HatefulMemes dataset with an AUROC of 87.0, outperforming much larger fine-tuned large multimodal models. We demonstrate a retrieval-based hateful memes detection system, which is capable of identifying hatefulness based on data unseen in training. This allows developers to update the hateful memes detection system by simply adding new examples without retraining, a desirable feature for real services in the constantly evolving landscape of hateful memes on the Internet.
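
The "update without retraining" property described in the abstract can be illustrated with a small nearest-neighbour sketch. This is not the authors' implementation: the class name `RetrievalClassifier`, the cosine-similarity ranking, the majority vote over `k` neighbours, and the toy 2-D vectors are all assumptions made for illustration. In a real system the embeddings would come from the trained hatefulness-aware encoder.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class RetrievalClassifier:
    """Hypothetical k-nearest-neighbour classifier over stored (embedding, label) pairs."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self.database: List[Tuple[Vector, int]] = []

    def add_example(self, embedding: Vector, label: int) -> None:
        # Adding an example only appends to the database; no model weights change.
        self.database.append((embedding, label))

    def predict(self, query: Vector) -> int:
        if not self.database:
            raise ValueError("database is empty")
        # Rank stored examples by similarity to the query, most similar first.
        ranked = sorted(
            self.database,
            key=lambda item: cosine_similarity(query, item[0]),
            reverse=True,
        )
        top = ranked[: self.k]
        # Label 1 = hateful, 0 = benign; predict hateful on a strict majority.
        hateful_votes = sum(label for _, label in top)
        return 1 if hateful_votes * 2 > len(top) else 0


# Toy usage: the database initially holds only benign examples.
clf = RetrievalClassifier(k=3)
for vec in ([1.0, 0.0], [0.9, 0.1], [0.8, 0.2]):
    clf.add_example(vec, 0)

query = [0.0, 1.0]
before = clf.predict(query)

# New hateful examples are added to the database; nothing is retrained.
clf.add_example([0.0, 1.0], 1)
clf.add_example([0.1, 0.9], 1)
after = clf.predict(query)
```

In this toy setup the prediction for `query` changes once similar labelled examples are in the database, which is the behaviour the abstract describes as desirable for deployed services. A linear scan is used for clarity; a production system would more likely rely on an approximate nearest-neighbour index.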

Related papers

MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models [50.2355423914562]
We introduce MemeReaCon, a novel benchmark designed to evaluate how Large Vision Language Models (LVLMs) understand memes in their original context. We collected memes from five different Reddit communities, keeping each meme's image, the post text, and user comments together. Our tests with leading LVLMs show a clear weakness: models either fail to interpret critical information in the contexts, or overly focus on visual details while overlooking communicative purpose.
arXiv Detail & Related papers (2025-05-23T03:27:23Z)
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models [12.929357709840975]
Multimodal memes are sometimes misused to disseminate hate speech against individuals or groups. We propose a definition-guided prompting technique for detecting hateful memes, and a unified framework for mitigating hateful content in memes, named UnHateMeme. Our framework, integrated with Vision-Language Models, demonstrates a strong capability to convert hateful memes into non-hateful forms.
arXiv Detail & Related papers (2025-04-30T19:48:12Z)
Improving Multimodal Hateful Meme Detection Exploiting LMM-Generated Knowledge [11.801596051153725]
Detecting hateful content in memes has emerged as a task of critical importance. We propose to address the task by leveraging knowledge encoded in powerful Large Multimodal Models (LMMs). Specifically, we propose to exploit LMMs in a two-fold manner. First, by extracting knowledge oriented to the hateful meme detection task in order to build strong meme representations.
arXiv Detail & Related papers (2025-04-14T06:23:44Z)
Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection [49.122777764853055]
We explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. We propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting. Evolver simulates the evolving and expressing process of memes and reasons through LMMs in a step-by-step manner.
arXiv Detail & Related papers (2024-07-30T17:51:44Z)
Zero shot VLMs for hate meme detection: Are we there yet? [9.970031080934003]
This study investigates the efficacy of visual language models in handling intricate tasks such as hate meme detection. We observe that large VLMs are still vulnerable for zero-shot hate meme detection.
arXiv Detail & Related papers (2024-02-19T15:03:04Z)
Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation from code-mixed cyberbullying memes. A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z)
On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning [18.794226796466962]
We study how hateful memes are created by combining visual elements from multiple images or fusing textual information with a hateful image. Using our framework on a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant meme. We envision that our framework can be used to aid human moderators by flagging new variants of hateful memes.
arXiv Detail & Related papers (2022-12-13T13:38:04Z)
DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes. The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z)
DISARM: Detecting the Victims Targeted by Harmful Memes [49.12165815990115]
DISARM is a framework that uses named entity recognition and person identification to detect harmful memes. We show that DISARM significantly outperforms ten unimodal and multimodal systems. It can reduce the relative error rate for harmful target identification by up to 9 points absolute over several strong multimodal rivals.
arXiv Detail & Related papers (2022-05-11T19:14:26Z)
Detecting and Understanding Harmful Memes: A Survey [48.135415967633676]
We offer a comprehensive survey with a focus on harmful memes. One interesting finding is that many types of harmful memes are not really studied, e.g., those featuring self-harm and extremism. Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual.
arXiv Detail & Related papers (2022-05-09T13:43:27Z)
Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge [10.775419935941008]
We assess the efficacy of current state-of-the-art multimodal machine learning models toward hateful meme detection. We use two benchmark datasets comprising 12,140 and 10,567 images from 4chan's "Politically Incorrect" board (/pol/) and Facebook's Hateful Memes Challenge dataset. We conduct three experiments to determine the importance of multimodality on classification performance, the influential capacity of fringe Web communities on mainstream social platforms and vice versa.
arXiv Detail & Related papers (2022-02-17T07:52:22Z)
An Interpretable Approach to Hateful Meme Detection [0.0]
Hateful memes are an emerging method of spreading hate on the internet. We take an interpretable approach to meme detection using machine learning. We build a gradient-boosted decision tree and an LSTM-based model that achieve comparable performance.
arXiv Detail & Related papers (2021-08-09T18:28:56Z)
Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset [47.65948529524281]
We collect hateful and non-hateful memes from Pinterest to evaluate out-of-sample performance on models pre-trained on the Facebook dataset. We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR, and 2) memes are more diverse than 'traditional memes', including screenshots of conversations or text on a plain background.
arXiv Detail & Related papers (2021-07-09T09:04:05Z)
Multimodal Learning for Hateful Memes Detection [6.6881085567421605]
We propose a novel method that incorporates the image captioning process into the memes detection process. Our model achieves promising results on the Hateful Memes Detection Challenge.
arXiv Detail & Related papers (2020-11-25T16:49:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.