MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate
Speech and Target Detection Using Transformer Ensembles
- URL: http://arxiv.org/abs/2402.01967v2
- Date: Sun, 18 Feb 2024 06:39:55 GMT
- Title: MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate
Speech and Target Detection Using Transformer Ensembles
- Authors: Amrita Ganguly, Al Nahian Bin Emran, Sadiya Sayara Chowdhury Puspo, Md
Nishat Raihan, Dhiman Goswami, Marcos Zampieri
- Abstract summary: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024.
We use an XLM-roBERTa-large model for sub-task A and an ensemble approach combining XLM-roBERTa-base, BERTweet-large, and BERT-base for sub-task B.
- Score: 6.2696956160552455
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The automatic identification of offensive language such as hate speech is
important to keep discussions civil in online communities. Identifying hate
speech in multimodal content is a particularly challenging task because
offensiveness can be manifested in either words or images or a juxtaposition of
the two. This paper presents the MasonPerplexity submission for the Shared Task
on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. The task
is divided into two sub-tasks: sub-task A focuses on the identification of hate
speech and sub-task B focuses on the identification of targets in text-embedded
images during political events. We use an XLM-roBERTa-large model for sub-task
A and an ensemble approach combining XLM-roBERTa-base, BERTweet-large, and
BERT-base for sub-task B. Our approach obtained 0.8347 F1-score in sub-task A
and 0.6741 F1-score in sub-task B, ranking 3rd on both sub-tasks.
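The abstract does not say how the three sub-task B models are combined. One common option is hard (majority) voting over each model's predicted label, sketched below as an assumption rather than the authors' confirmed method. The function name `majority_vote`, the label strings, and the per-model prediction lists are all hypothetical and used only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by hard (majority) voting.

    predictions[m][i] is the label that model m assigns to example i.
    Ties are broken in favour of the earliest model in the list.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")

    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical label predictions from three fine-tuned classifiers
# (assumed, not taken from the paper).
xlmr_base = ["individual", "community", "organization", "community"]
bertweet_large = ["individual", "organization", "organization", "individual"]
bert_base = ["community", "community", "organization", "organization"]

ensemble = majority_vote([xlmr_base, bertweet_large, bert_base])
```

With an odd number of models and a small label set, most examples have a clear majority. The tie-break rule only matters when all three models disagree, in which case this sketch falls back to the first model's label.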
Related papers
- Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System [73.34663391495616]
We propose a pioneering approach to tackle joint multi-talker and target-talker speech recognition tasks.
Specifically, we freeze Whisper and plug a Sidecar separator into its encoder to separate mixed embedding for multiple talkers.
We deliver acceptable zero-shot performance on multi-talker ASR on AishellMix Mandarin dataset.
arXiv Detail & Related papers (2024-07-13T09:28:24Z)
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- Lexical Squad@Multimodal Hate Speech Event Detection 2023: Multimodal Hate Speech Detection using Fused Ensemble Approach [0.23020018305241333]
We present our novel ensemble learning approach for detecting hate speech by classifying text-embedded images into two labels, namely "Hate Speech" and "No Hate Speech".
Our proposed ensemble model yielded promising results, with 75.21 accuracy and 74.96 F1 score.
arXiv Detail & Related papers (2023-09-23T12:06:05Z)
- Learning Speech Representation From Contrastive Token-Acoustic Pretraining [57.08426714676043]
We propose "Contrastive Token-Acoustic Pretraining (CTAP)", which uses two encoders to bring phoneme and speech into a joint multimodal space.
The proposed CTAP model is trained on 210k speech and phoneme pairs, achieving minimally-supervised TTS, VC, and ASR.
arXiv Detail & Related papers (2023-09-01T12:35:43Z)
- ARC-NLP at Multimodal Hate Speech Event Detection 2023: Multimodal Methods Boosted by Ensemble Learning, Syntactical and Entity Features [1.3190581566723918]
In the Russia-Ukraine war, both opposing factions heavily relied on text-embedded images as a vehicle for spreading propaganda and hate speech.
In this paper, we outline our methodologies for two subtasks of Multimodal Hate Speech Event Detection 2023.
For the first subtask, hate speech detection, we utilize multimodal deep learning models boosted by ensemble learning and syntactical text attributes.
For the second subtask, target detection, we employ multimodal deep learning models boosted by named entity features.
arXiv Detail & Related papers (2023-07-25T21:56:14Z)
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models (PLMs) are hard to generalize to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model explores the essential matching signals by being trained on diverse matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets [36.29939722039909]
This paper describes the system proposed by team MIDAS-IIITD for HASOC 2021 subtask 2.
It is one of the first shared tasks focusing on detecting hate speech from Hindi-English code-mixed conversations on Twitter.
Our best performing system, a hard voting ensemble of Indic-BERT, XLM-RoBERTa, and Multilingual BERT, achieved a macro F1 score of 0.7253.
arXiv Detail & Related papers (2021-12-18T19:27:33Z)
- Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model [0.5801044612920815]
This paper investigates the feasibility of leveraging domain-specific word embeddings in a Bidirectional LSTM-based deep model to automatically detect and classify hate speech.
The experiments showed that domain-specific word embeddings with the Bidirectional LSTM-based deep model achieved a 93% F1-score, while BERT achieved up to 96% F1-score on a combined balanced dataset built from available hate speech datasets.
arXiv Detail & Related papers (2021-11-02T11:42:54Z)
- AngryBERT: Joint Learning Target and Emotion for Hate Speech Detection [5.649040805759824]
This paper proposes a novel multitask learning-based model, AngryBERT, which jointly learns hate speech detection with sentiment classification and target identification as secondary relevant tasks.
Experiment results show that AngryBERT outperforms state-of-the-art single-task-learning and multitask learning baselines.
arXiv Detail & Related papers (2021-03-14T16:17:26Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)