Memes in the Wild: Assessing the Generalizability of the Hateful Memes
Challenge Dataset
- URL: http://arxiv.org/abs/2107.04313v1
- Date: Fri, 9 Jul 2021 09:04:05 GMT
- Title: Memes in the Wild: Assessing the Generalizability of the Hateful Memes
Challenge Dataset
- Authors: Hannah Rose Kirk, Yennie Jun, Paulius Rauba, Gal Wachtel, Ruining Li,
Xingjian Bai, Noah Broestl, Martin Doff-Sotta, Aleksandar Shtedritski, Yuki
M. Asano
- Abstract summary: We collect hateful and non-hateful memes from Pinterest to evaluate out-of-sample performance on models pre-trained on the Facebook dataset.
We find that memes in the wild differ in two key aspects: 1) Captions must be extracted via OCR, and 2) Memes are more diverse than traditional memes', including screenshots of conversations or text on a plain background.
- Score: 47.65948529524281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hateful memes pose a unique challenge for current machine learning systems
because their message is derived from both text- and visual-modalities. To this
effect, Facebook released the Hateful Memes Challenge, a dataset of memes with
pre-extracted text captions, but it is unclear whether these synthetic examples
generalize to `memes in the wild'. In this paper, we collect hateful and
non-hateful memes from Pinterest to evaluate out-of-sample performance on
models pre-trained on the Facebook dataset. We find that memes in the wild
differ in two key aspects: 1) Captions must be extracted via OCR, injecting
noise and diminishing performance of multimodal models, and 2) Memes are more
diverse than `traditional memes', including screenshots of conversations or
text on a plain background. This paper thus serves as a reality check for the
current benchmark of hateful meme detection and its applicability for detecting
real world hate.
Related papers
- XMeCap: Meme Caption Generation with Sub-Image Adaptability [53.2509590113364]
Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines.
We introduce the textscXMeCap framework, which adopts supervised fine-tuning and reinforcement learning.
textscXMeCap achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71% and 4.82%, respectively.
arXiv Detail & Related papers (2024-07-24T10:51:46Z) - What Makes a Meme a Meme? Identifying Memes for Memetics-Aware Dataset Creation [0.9217021281095907]
Multimodal Internet Memes are now a ubiquitous fixture in online discourse.
Memetics are the process by which memes are imitated and transformed into symbols.
We develop a meme identification protocol which distinguishes meme from non-memetic content by recognising the memetics within it.
arXiv Detail & Related papers (2024-07-16T15:48:36Z) - Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes
Through Multimodal Explanations [48.82168723932981]
We introduce em MultiBully-Ex, the first benchmark dataset for multimodal explanation from code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z) - A Template Is All You Meme [83.05919383106715]
We release a knowledge base of memes and information found on www.knowyourmeme.com, composed of more than 54,000 images.
We hypothesize that meme templates can be used to inject models with the context missing from previous approaches.
arXiv Detail & Related papers (2023-11-11T19:38:14Z) - Mapping Memes to Words for Multimodal Hateful Meme Classification [26.101116761577796]
Some memes take a malicious turn, promoting hateful content and perpetuating discrimination.
We propose a novel approach named ISSUES for multimodal hateful meme classification.
Our method achieves state-of-the-art results on the Hateful Memes Challenge and HarMeme datasets.
arXiv Detail & Related papers (2023-10-12T14:38:52Z) - DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally
Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes.
The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z) - Detecting and Understanding Harmful Memes: A Survey [48.135415967633676]
We offer a comprehensive survey with a focus on harmful memes.
One interesting finding is that many types of harmful memes are not really studied, e.g., such featuring self-harm and extremism.
Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual.
arXiv Detail & Related papers (2022-05-09T13:43:27Z) - Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the
Lens of Facebook's Challenge [10.775419935941008]
We assess the efficacy of current state-of-the-art multimodal machine learning models toward hateful meme detection.
We use two benchmark datasets comprising 12,140 and 10,567 images from 4chan's "Politically Incorrect" board (/pol/) and Facebook's Hateful Memes Challenge dataset.
We conduct three experiments to determine the importance of multimodality on classification performance, the influential capacity of fringe Web communities on mainstream social platforms and vice versa.
arXiv Detail & Related papers (2022-02-17T07:52:22Z) - Detecting Harmful Memes and Their Targets [27.25262711136056]
We present HarMeme, the first benchmark dataset, containing 3,544 memes related to COVID-19.
In the first stage, we labeled a meme as very harmful, partially harmful, or harmless; in the second stage, we further annotated the type of target(s) that each harmful meme points to.
The evaluation results using ten unimodal and multimodal models highlight the importance of using multimodal signals for both tasks.
arXiv Detail & Related papers (2021-09-24T17:11:42Z) - Multimodal Learning for Hateful Memes Detection [6.6881085567421605]
We propose a novel method that incorporates the image captioning process into the memes detection process.
Our model achieves promising results on the Hateful Memes Detection Challenge.
arXiv Detail & Related papers (2020-11-25T16:49:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.