Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
- URL: http://arxiv.org/abs/2407.21004v1
- Date: Tue, 30 Jul 2024 17:51:44 GMT
- Title: Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
- Authors: Jinfa Huang, Jinsheng Pan, Zhongwei Wan, Hanjia Lyu, Jiebo Luo
- Abstract summary: We explore the potential of Large Multimodal Models (LMMs) for hateful meme detection.
We propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting.
Evolver simulates the evolving and expressing process of memes and reasons through LMMs in a step-by-step manner.
- Score: 49.122777764853055
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances show that two-stream approaches have achieved outstanding performance in hateful meme detection. However, hateful memes constantly evolve as new memes emerge by fusing progressive cultural ideas, making existing methods obsolete or ineffective. In this work, we explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. To this end, we propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting, by integrating the evolution attribute and in-context information of memes. Specifically, Evolver simulates the evolving and expressing process of memes and reasons through LMMs in a step-by-step manner. First, an evolutionary pair mining module retrieves the top-k memes most similar to the input meme from an external curated meme set. Second, an evolutionary information extractor summarizes the semantic regularities between the paired memes for prompting. Finally, a contextual relevance amplifier enhances the in-context hatefulness information to boost the search for evolutionary processes. Extensive experiments on the public FHM, MAMI, and HarM datasets show that CoE prompting can be incorporated into existing LMMs to improve their performance. More encouragingly, it can serve as an interpretive tool to promote the understanding of the evolution of social memes.
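The abstract describes a three-stage pipeline: evolutionary pair mining, evolution-information extraction, and contextual relevance amplification, followed by step-by-step reasoning in the LMM. Below is a minimal sketch of how such a CoE prompt could be assembled, assuming a CLIP-style encoder for retrieval and a generic chat-completion client for the LMM; the object interfaces, prompt wordings, and helper names (`mine_evolution_pairs`, `extract_evolution_info`, `amplify_context`) are hypothetical illustrations, not the paper's released code.

```python
# Minimal sketch of Chain-of-Evolution (CoE) prompting, assuming a CLIP-style
# encoder for retrieval and a generic chat-completion LMM client. All names
# and prompt texts are illustrative, not the paper's actual implementation.
import numpy as np

def embed(meme_image, meme_text, encoder):
    """Fuse image and caption embeddings into one unit-norm retrieval vector."""
    v = np.concatenate([encoder.encode_image(meme_image),
                        encoder.encode_text(meme_text)])
    return v / np.linalg.norm(v)

def mine_evolution_pairs(query_vec, meme_bank, k=5):
    """Step 1: retrieve the top-k most similar memes from the curated set."""
    sims = meme_bank.vectors @ query_vec          # cosine similarity (unit vectors)
    top = np.argsort(-sims)[:k]
    return [meme_bank.entries[i] for i in top]

def extract_evolution_info(lmm, input_meme, pairs):
    """Step 2: ask the LMM to summarize semantic regularities across the pairs."""
    listing = "\n".join(f"- {p.description} (label: {p.label})" for p in pairs)
    prompt = (f"Input meme: {input_meme.description}\n"
              f"Similar earlier memes:\n{listing}\n"
              "Summarize what stays constant and what mutates across these memes.")
    return lmm.complete(prompt)

def amplify_context(pairs):
    """Step 3: surface in-context hatefulness signals from the retrieved memes."""
    hateful = [p for p in pairs if p.label == "hateful"]
    return f"{len(hateful)}/{len(pairs)} retrieved neighbors were labeled hateful."

def coe_classify(lmm, encoder, input_meme, meme_bank):
    """Chain the three stages, then elicit a step-by-step hatefulness verdict."""
    pairs = mine_evolution_pairs(
        embed(input_meme.image, input_meme.text, encoder), meme_bank)
    evolution = extract_evolution_info(lmm, input_meme, pairs)
    context = amplify_context(pairs)
    verdict_prompt = (f"Meme: {input_meme.description}\n"
                      f"Evolution summary: {evolution}\n"
                      f"Context signal: {context}\n"
                      "Reason step by step, then answer: is this meme hateful, yes or no?")
    return lmm.complete(verdict_prompt)
```

In this sketch the retrieval step carries the evolution signal: the LMM sees both what the input meme descends from and how often that lineage was labeled hateful before it renders a verdict.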
Related papers
- MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing [53.30190591805432]
We introduce MemeMQA, a multimodal question-answering framework to solicit accurate responses to structured questions.
We also propose ARSENAL, a novel two-stage multimodal framework to address MemeMQA.
arXiv Detail & Related papers (2024-05-18T07:44:41Z)
- Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation of code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z)
- GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse [15.632755242069729]
We introduce GOAT-Bench, a comprehensive meme benchmark comprising over 6K varied memes spanning themes such as implicit hate speech, cyberbullying, and sexism.
We delve into the ability of LMMs to accurately assess hatefulness, misogyny, offensiveness, sarcasm, and harmful content.
Our extensive experiments across a range of LMMs reveal that current models still exhibit a deficiency in safety awareness, showing insensitivity to various forms of implicit abuse.
arXiv Detail & Related papers (2024-01-03T03:28:55Z)
- Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning [13.690436954062015]
We propose constructing a hatefulness-aware embedding space through retrieval-guided contrastive training (see the sketch after this list).
Our approach achieves state-of-the-art performance on the HatefulMemes dataset with an AUROC of 87.0, outperforming much larger fine-tuned large multimodal models.
arXiv Detail & Related papers (2023-11-14T12:14:54Z)
- MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization [31.209594252045566]
We propose a novel task, MEMEX: given a meme and a related document, the aim is to mine the context that succinctly explains the background of the meme.
To benchmark MCC, the dataset curated for this task, we propose MIME, a multimodal neural framework that uses common-sense-enriched meme representations and a layered approach to capture the cross-modal semantic dependencies between the meme and the context.
arXiv Detail & Related papers (2023-05-25T10:19:35Z)
- DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation [72.18912216025029]
We present DisinfoMeme to help detect disinformation memes.
The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism.
arXiv Detail & Related papers (2022-05-25T09:54:59Z)
- Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset [47.65948529524281]
We collect hateful and non-hateful memes from Pinterest to evaluate out-of-sample performance on models pre-trained on the Facebook dataset.
We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR, and 2) memes are more diverse than "traditional" memes, including screenshots of conversations or text on a plain background.
arXiv Detail & Related papers (2021-07-09T09:04:05Z)
- Entropy and complexity unveil the landscape of memes evolution [105.59074436693487]
We study the evolution of 2 million visual memes from Reddit over ten years, from 2011 to 2020.
We find support for the hypothesis that memes are part of an emerging form of internet metalanguage.
arXiv Detail & Related papers (2021-05-26T07:41:09Z)
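On the retrieval-guided contrastive learning entry above: below is a minimal InfoNCE-style sketch of how retrieved same-label memes can act as positives when shaping a hatefulness-aware embedding space. This is one plausible reading under stated assumptions (precomputed, L2-normalized meme embeddings; positives and negatives drawn from retrieval by hatefulness label), not the authors' implementation.

```python
# Minimal InfoNCE-style sketch of retrieval-guided contrastive training for a
# hatefulness-aware embedding space (one plausible reading, not the authors' code).
import torch
import torch.nn.functional as F

def retrieval_guided_contrastive_loss(anchor, positives, negatives, tau=0.07):
    """anchor: (d,), positives: (p, d), negatives: (n, d); all L2-normalized.
    Positives are retrieved memes sharing the anchor's hatefulness label;
    negatives are retrieved memes with the opposite label."""
    pos_sim = positives @ anchor / tau            # (p,) cosine similarities / temperature
    neg_sim = negatives @ anchor / tau            # (n,)
    logits = torch.cat([pos_sim, neg_sim])
    # Push probability mass onto the retrieved same-label memes.
    log_prob = pos_sim - torch.logsumexp(logits, dim=0)
    return -log_prob.mean()

# Usage example with random unit vectors standing in for meme embeddings:
d = 64
anchor = F.normalize(torch.randn(d), dim=0)
pos = F.normalize(torch.randn(3, d), dim=1)
neg = F.normalize(torch.randn(12, d), dim=1)
print(retrieval_guided_contrastive_loss(anchor, pos, neg))
```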