The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
- URL: http://arxiv.org/abs/2005.04790v3
- Date: Wed, 7 Apr 2021 18:43:54 GMT
- Title: The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
- Authors: Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet
Singh, Pratik Ringshia, Davide Testuggine
- Abstract summary: This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes.
It is constructed such that unimodal models struggle and only multimodal models can succeed.
We find that state-of-the-art methods perform poorly compared to humans.
- Score: 43.778346545763654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work proposes a new challenge set for multimodal classification,
focusing on detecting hate speech in multimodal memes. It is constructed such
that unimodal models struggle and only multimodal models can succeed: difficult
examples ("benign confounders") are added to the dataset to make it hard to
rely on unimodal signals. The task requires subtle reasoning, yet is
straightforward to evaluate as a binary classification problem. We provide
baseline performance numbers for unimodal models, as well as for multimodal
models with various degrees of sophistication. We find that state-of-the-art
methods perform poorly compared to humans (64.73% vs. 84.7% accuracy),
illustrating the difficulty of the task and highlighting the challenge that
this important problem poses to the community.
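Since the abstract frames the task as a straightforward binary classification problem scored with accuracy (and, in follow-up work, AUROC), a minimal scoring sketch is shown below. The label arrays and the 0.5 threshold are hypothetical illustrations, not the authors' released evaluation code; only the use of standard scikit-learn metrics is assumed.

```python
# Minimal sketch: scoring the hateful/not-hateful binary task,
# assuming hypothetical gold labels and model probabilities.
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [1, 0, 0, 1, 1, 0]               # gold labels: 1 = hateful, 0 = not hateful
y_prob = [0.9, 0.2, 0.4, 0.6, 0.3, 0.1]   # model's predicted P(hateful)
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold at 0.5 for accuracy

print(f"Accuracy: {accuracy_score(y_true, y_pred):.4f}")
print(f"AUROC:    {roc_auc_score(y_true, y_prob):.4f}")
```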
Related papers
- Multi-modal Crowd Counting via a Broker Modality [64.5356816448361]
Multi-modal crowd counting involves estimating crowd density from both visual and thermal/depth images.
We propose a novel approach by introducing an auxiliary broker modality and frame the task as a triple-modal learning problem.
We devise a fusion-based method to generate this broker modality, leveraging a non-diffusion, lightweight counterpart of modern denoising diffusion-based fusion models.
arXiv Detail & Related papers (2024-07-10T10:13:11Z)
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
- Generative Multimodal Models are In-Context Learners [60.50927925426832]
We introduce Emu2, a generative multimodal model with 37 billion parameters, trained on large-scale multimodal sequences.
Emu2 exhibits strong multimodal in-context learning abilities and can even solve tasks that require on-the-fly reasoning.
arXiv Detail & Related papers (2023-12-20T18:59:58Z)
- MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts [92.76662894585809]
We introduce an approach to enhance multimodal models, which we call Multimodal Mixtures of Experts (MMoE).
MMoE can be applied to various types of models to yield improvements.
arXiv Detail & Related papers (2023-11-16T05:31:21Z)
- Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions [27.983902791798965]
We develop a model that generates dilution text that maintains relevance and topical coherence with the image and existing text.
We find that the performance of task-specific fusion-based multimodal classifiers drops by 23.3% and 22.5%, respectively, in the presence of dilutions generated by our model.
Our work aims to highlight and encourage further research on the robustness of deep multimodal models to realistic variations.
arXiv Detail & Related papers (2022-11-04T17:58:02Z)
- Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate [2.68137173219451]
This paper evaluates the role of semantic and multimodal context for detecting implicit and explicit hate.
We show that both textual and visual enrichment improve model performance.
We find that all models perform better on content with full annotator agreement and that multimodal models are best at classifying the content where annotators disagree.
arXiv Detail & Related papers (2021-06-10T16:29:42Z)
- Detecting Hate Speech in Multi-modal Memes [14.036769355498546]
We focus on hate speech detection in multi-modal memes, which pose an interesting multi-modal fusion problem.
We aim to solve the Facebook Hateful Memes Challenge (Kiela et al., 2020), a binary classification problem of predicting whether a meme is hateful or not.
arXiv Detail & Related papers (2020-12-29T18:30:00Z)
- A Multimodal Framework for the Detection of Hateful Memes [16.7604156703965]
We aim to develop a framework for the detection of hateful memes.
We show the effectiveness of upsampling contrastive examples to encourage multimodality, as well as of ensemble learning.
Our best approach comprises an ensemble of UNITER-based models and achieves an AUROC score of 80.53, placing us 4th on phase 2 of the 2020 Hateful Memes Challenge organized by Facebook.
arXiv Detail & Related papers (2020-12-23T18:37:11Z)
- Classification of Multimodal Hate Speech -- The Winning Solution of Hateful Memes Challenge [0.0]
Hateful Memes is a new challenge set for multimodal classification.
Difficult examples are added to the dataset to make it hard to rely on unimodal signals.
I propose a new model that combines multimodal features with rules and achieves the first-ranked accuracy and AUROC of 86.8% and 0.923, respectively.
arXiv Detail & Related papers (2020-12-02T07:38:26Z)
- ManyModalQA: Modality Disambiguation and QA over Diverse Inputs [73.93607719921945]
We present a new multimodal question answering challenge, ManyModalQA, in which an agent must answer a question by considering three distinct modalities.
We collect our data by scraping Wikipedia and then utilize crowdsourcing to collect question-answer pairs.
arXiv Detail & Related papers (2020-01-22T14:39:28Z)