Multimodal Hate Speech Detection from Bengali Memes and Texts
- URL: http://arxiv.org/abs/2204.10196v1
- Date: Tue, 19 Apr 2022 11:15:25 GMT
- Title: Multimodal Hate Speech Detection from Bengali Memes and Texts
- Authors: Md. Rezaul Karim and Sumon Kanti Dey and Tanhim Islam and Bharathi
Raja Chakravarthi
- Abstract summary: This paper is about hate speech detection from multimodal Bengali memes and texts.
We train several neural networks to analyze textual and visual information for hate speech detection.
Our study suggests that memes are moderately useful for hate speech detection in Bengali, but none of the multimodal models outperform unimodal models.
- Score: 0.6709991492637819
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Numerous works have employed machine learning (ML) and deep
learning (DL) techniques on textual data from social media for
anti-social behavior analysis such as cyberbullying, fake news propagation, and
hate speech mainly for highly resourced languages like English. However,
despite having a lot of diversity and millions of native speakers, some
languages such as Bengali are under-resourced, which is due to a lack of
computational resources for natural language processing (NLP). Like English,
Bengali social media content also includes images along with texts (e.g.,
multimodal content is posted by embedding short texts into images on
Facebook), so the textual data alone is not enough to judge them (e.g., to
determine whether they are hate speech). In those cases, images might give
extra context for a proper judgment. This paper is about hate speech detection
from multimodal Bengali memes and texts. We prepared the only multimodal hate
speech detection dataset for this kind of problem for Bengali. We train several
neural architectures (i.e., neural networks like Bi-LSTM/Conv-LSTM with word
embeddings, EfficientNet + transformer architectures such as monolingual Bangla
BERT, multilingual BERT-cased/uncased, and XLM-RoBERTa) to jointly analyze
textual and visual information for hate speech detection. The Conv-LSTM and XLM-RoBERTa
models performed best for texts, yielding F1 scores of 0.78 and 0.82,
respectively. As for memes, ResNet152 and DenseNet201 models yield F1 scores of
0.78 and 0.7, respectively. The multimodal fusion of mBERT-uncased +
EfficientNet-B1 performed the best, yielding an F1 score of 0.80. Our study
suggests that memes are moderately useful for hate speech detection in Bengali,
but none of the multimodal models outperform unimodal models analyzing only
textual data.
Related papers
- mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus [52.83121058429025]
We introduce mOSCAR, the first large-scale multilingual and multimodal document corpus crawled from the web.
It covers 163 languages, 315M documents, 214B tokens and 1.2B images.
It shows a strong boost in few-shot learning performance across various multilingual image-text tasks and benchmarks.
arXiv Detail & Related papers (2024-06-13T00:13:32Z)
- OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text [112.60163342249682]
We introduce OmniCorpus, a 10 billion-scale image-text interleaved dataset.
Our dataset is 15 times larger in scale while maintaining good data quality.
We hope this could provide a solid data foundation for future multimodal model research.
arXiv Detail & Related papers (2024-06-12T17:01:04Z)
- Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers [0.0]
We conduct a comparative analysis of hate speech classification across five distinct languages: Bengali, Assamese, Bodo, Sinhala, and Gujarati.
Bert Base Multilingual Cased emerges as a strong performer across languages, achieving an F1 score of 0.67027 for Bengali and 0.70525 for Assamese.
In Sinhala, XLM-R stands out with an F1 score of 0.83493, whereas for Gujarati, a custom LSTM-based model outshined with an F1 score of 0.76601.
arXiv Detail & Related papers (2023-12-09T20:24:00Z)
- Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages [76.35234803589412]
MPM is an effective training paradigm for training large multimodal models in non-English languages.
We build large multimodal models VisCPM in image-to-text and text-to-image generation, which achieve state-of-the-art (open-source) performance in Chinese.
arXiv Detail & Related papers (2023-08-23T09:55:41Z)
- BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion [0.0]
We develop a novel approach combining two models, wav2vec2.0 for audio and MarianMT for text translation, to predict speech acts.
We also show that our model BeAts (Bengali speech acts recognition using Multimodal Attention Fusion).
arXiv Detail & Related papers (2023-06-05T08:12:17Z)
- Bangla hate speech detection on social media using attention-based recurrent neural network [2.1349209400003932]
This article proposes an encoder-decoder-based machine learning model, a popular tool in NLP, to classify users' Bengali comments on Facebook pages.
A dataset of 7,425 Bengali comments, consisting of seven distinct categories of hate speeches, was used to train and evaluate our model.
Among the three encoder-decoder algorithms, the attention-based decoder obtained the best accuracy (77%).
arXiv Detail & Related papers (2022-03-31T03:31:53Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z)
- DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language [1.2246649738388389]
We propose an explainable approach for hate speech detection from the under-resourced Bengali language.
In our approach, Bengali texts are first comprehensively preprocessed, before classifying them into political, personal, geopolitical, and religious hates.
Evaluations against machine learning (linear and tree-based models) and deep neural networks (i.e., CNN, Bi-LSTM, and Conv-LSTM with word embeddings) baselines yield F1 scores of 84%, 90%, 88%, and 88%, for political, personal, geopolitical, and religious hates, respectively.
arXiv Detail & Related papers (2020-12-28T16:46:03Z)
- Classification Benchmarks for Under-resourced Bengali Language based on Multichannel Convolutional-LSTM Network [3.0168410626760034]
We build the largest Bengali word embedding models to date based on 250 million articles, which we call BengFastText.
We incorporate word embeddings into a Multichannel Convolutional-LSTM network for predicting different types of hate speech, document classification, and sentiment analysis.
arXiv Detail & Related papers (2020-04-11T22:17:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.