Multilingual HateCheck: Functional Tests for Multilingual Hate Speech
Detection Models
- URL: http://arxiv.org/abs/2206.09917v1
- Date: Mon, 20 Jun 2022 17:54:39 GMT
- Title: Multilingual HateCheck: Functional Tests for Multilingual Hate Speech
Detection Models
- Authors: Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen
- Abstract summary: We introduce Multilingual HateCheck (MHC), a suite of functional tests for multilingual hate speech detection models.
MHC covers 34 functionalities across ten languages, which is more languages than any other hate speech dataset.
We train and test a high-performing multilingual hate speech detection model, and reveal critical model weaknesses for monolingual and cross-lingual applications.
- Score: 14.128029444990895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hate speech detection models are typically evaluated on held-out test sets.
However, this risks painting an incomplete and potentially misleading picture
of model performance because of increasingly well-documented systematic gaps
and biases in hate speech datasets. To enable more targeted diagnostic
insights, recent research has thus introduced functional tests for hate speech
detection models. However, these tests currently only exist for
English-language content, which means that they cannot support the development
of more effective models in other languages spoken by billions across the
world. To help address this issue, we introduce Multilingual HateCheck (MHC), a
suite of functional tests for multilingual hate speech detection models. MHC
covers 34 functionalities across ten languages, which is more languages than
any other hate speech dataset. To illustrate MHC's utility, we train and test a
high-performing multilingual hate speech detection model, and reveal critical
model weaknesses for monolingual and cross-lingual applications.
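The functional-testing paradigm described above can be illustrated with a minimal sketch: each test case pairs a short text with a gold label and the functionality it probes, and models are scored per functionality rather than on an aggregate held-out set. The test cases and the keyword-based toy classifier below are illustrative inventions for this sketch, not drawn from MHC itself.

```python
# Sketch of HateCheck-style functional testing. A naive keyword
# classifier passes simple derogation cases but fails negation and
# counter-speech, the kind of targeted weakness functional tests expose.
from collections import defaultdict

# Each case: the functionality probed, the text, and the gold label.
TEST_CASES = [
    {"functionality": "derogation", "text": "I hate [GROUP]", "label": "hateful"},
    {"functionality": "negation", "text": "I don't hate [GROUP]", "label": "non-hateful"},
    {"functionality": "counter_speech", "text": "Saying 'I hate [GROUP]' is wrong", "label": "non-hateful"},
]

def toy_classifier(text: str) -> str:
    """Naive keyword model: flags any text containing 'hate'."""
    return "hateful" if "hate" in text.lower() else "non-hateful"

def evaluate(classifier, cases):
    """Return accuracy broken down by functionality."""
    correct, total = defaultdict(int), defaultdict(int)
    for case in cases:
        total[case["functionality"]] += 1
        if classifier(case["text"]) == case["label"]:
            correct[case["functionality"]] += 1
    return {f: correct[f] / total[f] for f in total}

accuracy = evaluate(toy_classifier, TEST_CASES)
print(accuracy)  # keyword matching gets derogation right, negation and counter-speech wrong
```

The per-functionality breakdown is the point: an aggregate score would hide that the model handles negation at 0% accuracy.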
Related papers
- Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z)
- GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection? [50.53312866647302]
HateCheck is a suite for testing fine-grained model functionalities on synthesized data.
We propose GPT-HateCheck, a framework to generate more diverse and realistic functional tests from scratch.
Crowd-sourced annotation demonstrates that the generated test cases are of high quality.
arXiv Detail & Related papers (2024-02-23T10:02:01Z)
- Evaluating ChatGPT's Performance for Multilingual and Emoji-based Hate Speech Detection [4.809236881780707]
Large language models like ChatGPT have recently shown great promise in performing several tasks, including hate speech detection.
This study aims to evaluate the strengths and weaknesses of the ChatGPT model in detecting hate speech at a granular level across 11 languages.
arXiv Detail & Related papers (2023-05-22T17:36:58Z)
- Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection [23.97444551607624]
Hate speech in social media is a growing phenomenon, and detecting such toxic content has gained significant traction.
HateMAML is a model-agnostic meta-learning-based framework that effectively performs hate speech detection in low-resource languages.
Extensive experiments are conducted on five datasets across eight different low-resource languages.
arXiv Detail & Related papers (2023-03-04T22:28:29Z)
- M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval [56.49878599920353]
This work investigates the use of large-scale, English-only pre-trained models (CLIP and HuBERT) for multilingual image-speech retrieval.
For non-English image-speech retrieval, we outperform the current state-of-the-art performance by a wide margin both when training separate models for each language, and with a single model which processes speech in all three languages.
arXiv Detail & Related papers (2022-11-02T14:54:45Z)
- Lifting the Curse of Multilinguality by Pre-training Modular Transformers [72.46919537293068]
Multilingual pre-trained models suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
We introduce language-specific modules, which allow us to grow the total capacity of the model while keeping the number of trainable parameters per language constant.
Our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages.
arXiv Detail & Related papers (2022-05-12T17:59:56Z)
- HateCheckHIn: Evaluating Hindi Hate Speech Detection Models [6.52974752091861]
Multilingual hate is a major emerging challenge for automated detection.
We introduce a set of functionalities for the purpose of evaluation.
Considering Hindi as a base language, we craft test cases for each functionality.
arXiv Detail & Related papers (2022-04-30T19:09:09Z)
- Highly Generalizable Models for Multilingual Hate Speech Detection [0.0]
Hate speech detection has become an important research topic within the past decade.
We compile a dataset of 11 languages and resolve their differing annotation schemes by analyzing the combined data with binary labels: hate speech or not hate speech.
We conduct three types of experiments for a binary hate speech classification task: Multilingual-Train Monolingual-Test, Monolingual-Train Monolingual-Test, and Language-Family-Train Monolingual-Test scenarios.
arXiv Detail & Related papers (2022-01-27T03:09:38Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters of other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.