MemeCap: A Dataset for Captioning and Interpreting Memes
- URL: http://arxiv.org/abs/2305.13703v1
- Date: Tue, 23 May 2023 05:41:18 GMT
- Title: MemeCap: A Dataset for Captioning and Interpreting Memes
- Authors: EunJeong Hwang and Vered Shwartz
- Abstract summary: We present the task of meme captioning and release a new dataset, MemeCap.
Our dataset contains 6.3K memes along with the title of the post containing the meme, the meme captions, the literal image caption, and the visual metaphors.
- Score: 11.188548484391978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memes are a widely popular tool for web users to express their thoughts using
visual metaphors. Understanding memes requires recognizing and interpreting
visual metaphors with respect to the text inside or around the meme, often
while employing background knowledge and reasoning abilities. We present the
task of meme captioning and release a new dataset, MemeCap. Our dataset
contains 6.3K memes along with the title of the post containing the meme, the
meme captions, the literal image caption, and the visual metaphors. Despite the
recent success of vision and language (VL) models on tasks such as image
captioning and visual question answering, our extensive experiments using
state-of-the-art VL models show that they still struggle with visual metaphors,
and perform substantially worse than humans.
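The abstract enumerates the dataset's fields; a minimal sketch of what a single MemeCap record might look like in Python follows. The field names are assumptions inferred from the abstract, not the released dataset's actual schema.

```python
# Hypothetical shape of one MemeCap record, inferred from the abstract.
# Actual field names in the released dataset may differ.
from dataclasses import dataclass


@dataclass
class MemeCapRecord:
    meme_image_path: str         # path or URL of the meme image
    post_title: str              # title of the post containing the meme
    image_caption: str           # literal description of the image
    meme_captions: list[str]     # interpretations of what the meme conveys
    visual_metaphors: list[str]  # mappings from visual elements to meanings


example = MemeCapRecord(
    meme_image_path="memes/0001.jpg",
    post_title="Me trying to stay productive on a Friday",
    image_caption="A dog sits at a desk in a room that is on fire.",
    meme_captions=["The poster jokes about staying calm while everything goes wrong."],
    visual_metaphors=["fire = mounting problems"],
)
```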
Related papers
- Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes [5.243460995467895]
This study introduces ClassicMemes-50-templates (CM50), a large-scale dataset consisting of over 33,000 memes, centered around 50 popular meme templates.
We also present an automated knowledge-grounded annotation pipeline leveraging large vision-language models to produce high-quality image captions, meme captions, and literary device labels.
arXiv Detail & Related papers (2025-01-23T17:18:30Z)
- XMeCap: Meme Caption Generation with Sub-Image Adaptability [53.2509590113364]
Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines.
We introduce the XMeCap framework, which adopts supervised fine-tuning and reinforcement learning.
XMeCap achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71% and 4.82%, respectively.
arXiv Detail & Related papers (2024-07-24T10:51:46Z)
- What Makes a Meme a Meme? Identifying Memes for Memetics-Aware Dataset Creation [0.9217021281095907]
Multimodal Internet Memes are now a ubiquitous fixture in online discourse.
Memetics is the process by which memes are imitated and transformed into symbols.
We develop a meme identification protocol that distinguishes memes from non-memetic content by recognising the memetics within them.
arXiv Detail & Related papers (2024-07-16T15:48:36Z)
- Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation of code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z)
- A Template Is All You Meme [76.03172165923058]
We create a knowledge base composed of more than 5,200 meme templates, information about them, and 54,000 examples of template instances.
To investigate the semantic signal of meme templates, we show that we can match memes in datasets to base templates contained in our knowledge base with a distance-based lookup (see the sketch after this entry).
Our examination of meme templates results in state-of-the-art performance for every dataset we consider, paving the way for analysis grounded in templateness.
arXiv Detail & Related papers (2023-11-11T19:38:14Z)
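The summary above does not spell out the distance-based lookup; a minimal sketch of one plausible reading, using CLIP embeddings and nearest-neighbor search, follows. The model choice and the distance threshold are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical nearest-template lookup: embed a meme image and find the
# closest known template by cosine distance. CLIP via sentence-transformers
# and the 0.3 threshold are illustrative choices, not the paper's method.
import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")


def build_index(template_images: list[Image.Image]) -> np.ndarray:
    """Embed every known template once; rows are L2-normalized."""
    emb = model.encode(template_images, convert_to_numpy=True)
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def match_template(meme: Image.Image, index: np.ndarray, max_dist: float = 0.3):
    """Return (template_row, cosine_distance), or None if nothing is close."""
    q = model.encode([meme], convert_to_numpy=True)[0]
    q = q / np.linalg.norm(q)
    dists = 1.0 - index @ q  # cosine distance to every template row
    best = int(np.argmin(dists))
    return (best, float(dists[best])) if dists[best] <= max_dist else None
```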
- Multi-modal application: Image Memes Generation [13.043370069398916]
An Internet meme commonly takes the form of an image and is created by combining a meme template (image) with a caption (natural language sentence).
We propose an end-to-end encoder-decoder architecture for meme generation.
arXiv Detail & Related papers (2021-12-03T00:17:44Z)
- Caption Enriched Samples for Improving Hateful Memes Detection [78.5136090997431]
The hateful meme challenge demonstrates the difficulty of determining whether a meme is hateful or not.
Neither unimodal language models nor multimodal vision-language models reach human-level performance.
arXiv Detail & Related papers (2021-09-22T10:57:51Z)
- Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset [47.65948529524281]
We collect hateful and non-hateful memes from Pinterest to evaluate the out-of-sample performance of models pre-trained on the Facebook dataset.
We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR (see the sketch after this entry), and 2) memes in the wild are more diverse than traditional memes, including screenshots of conversations and text on a plain background.
arXiv Detail & Related papers (2021-07-09T09:04:05Z)
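OCR caption extraction, the first point above, is easy to make concrete. A minimal sketch using Tesseract via pytesseract follows; the grayscale preprocessing is a common heuristic, not the paper's prescribed pipeline.

```python
# Minimal OCR caption extraction for in-the-wild memes.
# The grayscale step is an illustrative heuristic, not the paper's pipeline.
from PIL import Image
import pytesseract


def extract_caption(image_path: str) -> str:
    """Pull overlaid caption text out of a meme image with Tesseract."""
    img = Image.open(image_path).convert("L")  # grayscale often helps OCR
    text = pytesseract.image_to_string(img)
    return " ".join(text.split())  # collapse newlines and extra spaces
```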
- Entropy and complexity unveil the landscape of memes evolution [105.59074436693487]
We study the evolution of 2 million visual memes from Reddit over ten years, from 2011 to 2020.
We find support for the hypothesis that memes are part of an emerging form of internet metalanguage.
arXiv Detail & Related papers (2021-05-26T07:41:09Z)
- Multimodal Learning for Hateful Memes Detection [6.6881085567421605]
We propose a novel method that incorporates the image captioning process into the meme detection pipeline (see the sketch after this entry).
Our model achieves promising results on the Hateful Memes Detection Challenge.
arXiv Detail & Related papers (2020-11-25T16:49:15Z)
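The summary does not detail how captioning feeds detection; one plausible reading is sketched below, assuming the generated caption is simply concatenated with the meme's overlaid text before classification. Both model checkpoints are placeholders, not the paper's models.

```python
# Hypothetical caption-enriched detection: caption the meme image,
# concatenate the caption with the meme's overlaid text, then classify.
# Both checkpoints below are illustrative placeholders.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-hate")


def detect_hateful(image_path: str, meme_text: str) -> dict:
    """Score a meme by fusing a generated image caption with its text."""
    caption = captioner(image_path)[0]["generated_text"]
    combined = f"{meme_text} [SEP] {caption}"
    return classifier(combined)[0]  # e.g. {"label": "hate", "score": 0.93}
```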