MemeCap: A Dataset for Captioning and Interpreting Memes
- URL: http://arxiv.org/abs/2305.13703v1
- Date: Tue, 23 May 2023 05:41:18 GMT
- Title: MemeCap: A Dataset for Captioning and Interpreting Memes
- Authors: EunJeong Hwang and Vered Shwartz
- Abstract summary: We present the task of meme captioning and release a new dataset, MemeCap.
Our dataset contains 6.3K memes along with the title of the post containing the meme, the meme captions, the literal image caption, and the visual metaphors.
- Score: 11.188548484391978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memes are a widely popular tool for web users to express their thoughts using
visual metaphors. Understanding memes requires recognizing and interpreting
visual metaphors with respect to the text inside or around the meme, often
while employing background knowledge and reasoning abilities. We present the
task of meme captioning and release a new dataset, MemeCap. Our dataset
contains 6.3K memes along with the title of the post containing the meme, the
meme captions, the literal image caption, and the visual metaphors. Despite the
recent success of vision and language (VL) models on tasks such as image
captioning and visual question answering, our extensive experiments using
state-of-the-art VL models show that they still struggle with visual metaphors,
and perform substantially worse than humans.
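The abstract enumerates the dataset's fields; a minimal sketch of what a single MemeCap record might look like in Python follows. The field names are assumptions inferred from the abstract, not the released dataset's actual schema.

```python
# Hypothetical shape of one MemeCap record, inferred from the abstract.
# Actual field names in the released dataset may differ.
from dataclasses import dataclass


@dataclass
class MemeCapRecord:
    meme_image_path: str         # path or URL of the meme image
    post_title: str              # title of the post containing the meme
    image_caption: str           # literal description of the image
    meme_captions: list[str]     # interpretations of what the meme conveys
    visual_metaphors: list[str]  # mappings from visual elements to meanings


example = MemeCapRecord(
    meme_image_path="memes/0001.jpg",
    post_title="Me trying to stay productive on a Friday",
    image_caption="A dog sits at a desk in a room that is on fire.",
    meme_captions=["The poster jokes about staying calm while everything goes wrong."],
    visual_metaphors=["fire = mounting problems"],
)
```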
Related papers
- Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes [5.243460995467895]
This study introduces ClassicMemes-50-templates (CM50), a large-scale dataset consisting of over 33,000 memes, centered around 50 popular meme templates.
We also present an automated knowledge-grounded annotation pipeline leveraging large vision-language models to produce high-quality image captions, meme captions, and literary device labels.
arXiv Detail & Related papers (2025-01-23T17:18:30Z)
- XMeCap: Meme Caption Generation with Sub-Image Adaptability [53.2509590113364]
Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines.
We introduce the XMeCap framework, which adopts supervised fine-tuning and reinforcement learning.
XMeCap achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71% and 4.82%, respectively.
arXiv Detail & Related papers (2024-07-24T10:51:46Z)
- What Makes a Meme a Meme? Identifying Memes for Memetics-Aware Dataset Creation [0.9217021281095907]
Multimodal Internet Memes are now a ubiquitous fixture in online discourse.
Memetics is the process by which memes are imitated and transformed into symbols.
We develop a meme identification protocol that distinguishes memes from non-memetic content by recognising the memetics within them.
arXiv Detail & Related papers (2024-07-16T15:48:36Z)
- Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations [48.82168723932981]
We introduce MultiBully-Ex, the first benchmark dataset for multimodal explanation of code-mixed cyberbullying memes.
A Contrastive Language-Image Pretraining (CLIP) approach has been proposed for visual and textual explanation of a meme.
arXiv Detail & Related papers (2024-01-18T11:24:30Z)
- A Template Is All You Meme [76.03172165923058]
We create a knowledge base composed of more than 5,200 meme templates, information about them, and 54,000 examples of template instances.
To investigate the semantic signal of meme templates, we show that we can match memes in datasets to base templates contained in our knowledge base with a distance-based lookup (see the sketch after this entry).
Our examination of meme templates results in state-of-the-art performance for every dataset we consider, paving the way for analysis grounded in templateness.
arXiv Detail & Related papers (2023-11-11T19:38:14Z)
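The summary above does not spell out the distance-based lookup; a minimal sketch of one plausible reading, using CLIP embeddings and nearest-neighbor search, follows. The model choice and the distance threshold are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical nearest-template lookup: embed a meme image and find the
# closest known template by cosine distance. CLIP via sentence-transformers
# and the 0.3 threshold are illustrative choices, not the paper's method.
import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")


def build_index(template_images: list[Image.Image]) -> np.ndarray:
    """Embed every known template once; rows are L2-normalized."""
    emb = model.encode(template_images, convert_to_numpy=True)
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def match_template(meme: Image.Image, index: np.ndarray, max_dist: float = 0.3):
    """Return (template_row, cosine_distance), or None if nothing is close."""
    q = model.encode([meme], convert_to_numpy=True)[0]
    q = q / np.linalg.norm(q)
    dists = 1.0 - index @ q  # cosine distance to every template row
    best = int(np.argmin(dists))
    return (best, float(dists[best])) if dists[best] <= max_dist else None
```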
- Multi-modal application: Image Memes Generation [13.043370069398916]
An Internet meme commonly takes the form of an image and is created by combining a meme template (image) with a caption (natural language sentence).
We propose an end-to-end encoder-decoder architecture for meme generation.
arXiv Detail & Related papers (2021-12-03T00:17:44Z)
- Caption Enriched Samples for Improving Hateful Memes Detection [78.5136090997431]
The hateful meme challenge demonstrates the difficulty of determining whether a meme is hateful or not.
Neither unimodal language models nor multimodal vision-language models reach human-level performance.
arXiv Detail & Related papers (2021-09-22T10:57:51Z)
- Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset [47.65948529524281]
We collect hateful and non-hateful memes from Pinterest to evaluate the out-of-sample performance of models pre-trained on the Facebook dataset.
We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR (see the sketch after this entry), and 2) memes in the wild are more diverse than traditional memes, including screenshots of conversations and text on a plain background.
arXiv Detail & Related papers (2021-07-09T09:04:05Z)
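OCR caption extraction, the first point above, is easy to make concrete. A minimal sketch using Tesseract via pytesseract follows; the grayscale preprocessing is a common heuristic, not the paper's prescribed pipeline.

```python
# Minimal OCR caption extraction for in-the-wild memes.
# The grayscale step is an illustrative heuristic, not the paper's pipeline.
from PIL import Image
import pytesseract


def extract_caption(image_path: str) -> str:
    """Pull overlaid caption text out of a meme image with Tesseract."""
    img = Image.open(image_path).convert("L")  # grayscale often helps OCR
    text = pytesseract.image_to_string(img)
    return " ".join(text.split())  # collapse newlines and extra spaces
```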
- Entropy and complexity unveil the landscape of memes evolution [105.59074436693487]
We study the evolution of 2 million visual memes from Reddit over ten years, from 2011 to 2020.
We find support for the hypothesis that memes are part of an emerging form of internet metalanguage.
arXiv Detail & Related papers (2021-05-26T07:41:09Z)
- Multimodal Learning for Hateful Memes Detection [6.6881085567421605]
We propose a novel method that incorporates the image captioning process into the meme detection pipeline (see the sketch after this entry).
Our model achieves promising results on the Hateful Memes Detection Challenge.
arXiv Detail & Related papers (2020-11-25T16:49:15Z)
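The summary does not detail how captioning feeds detection; one plausible reading is sketched below, assuming the generated caption is simply concatenated with the meme's overlaid text before classification. Both model checkpoints are placeholders, not the paper's models.

```python
# Hypothetical caption-enriched detection: caption the meme image,
# concatenate the caption with the meme's overlaid text, then classify.
# Both checkpoints below are illustrative placeholders.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-hate")


def detect_hateful(image_path: str, meme_text: str) -> dict:
    """Score a meme by fusing a generated image caption with its text."""
    caption = captioner(image_path)[0]["generated_text"]
    combined = f"{meme_text} [SEP] {caption}"
    return classifier(combined)[0]  # e.g. {"label": "hate", "score": 0.93}
```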