Related papers: A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities

URL: http://arxiv.org/abs/2412.04942v1
Date: Fri, 06 Dec 2024 11:00:05 GMT
Title: A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities
Authors: Haotian Ye, Axel Wisiorek, Antonis Maronikolakis, Özge Alaçam, Hinrich Schütze
Abstract summary: Hate speech online remains an understudied issue for marginalized communities. In this paper, we aim to provide marginalized communities living in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from hate speech on the internet.
Score: 43.37824420609252
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hate speech online remains an understudied issue for marginalized communities, and has seen rising relevance, especially in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized communities living in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from hate speech on the internet by filtering offensive content in their native languages. Our contribution in this paper is twofold: 1) we release REACT (REsponsive hate speech datasets Across ConTexts), a collection of high-quality, culture-specific hate speech detection datasets comprising seven distinct target groups in eight low-resource languages, curated by experienced data collectors; 2) we propose a solution to few-shot hate speech detection utilizing federated learning (FL), a privacy-preserving and collaborative learning approach, to continuously improve a central model that exhibits robustness when tackling different target groups and languages. By keeping the training local to the users' devices, we ensure the privacy of the users' data while benefitting from the efficiency of federated learning. Furthermore, we personalize client models to target-specific training data and evaluate their performance. Our results indicate the effectiveness of FL across different target groups, whereas the benefits of personalization on few-shot learning are not clear.
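The federated setup described in the abstract (local training on each user's device, with only model parameters sent back to improve a central model) can be illustrated with a minimal sketch. The abstract does not name the aggregation rule, so this assumes the common federated averaging (FedAvg) scheme, and a toy logistic-regression classifier stands in for the paper's actual model. All function names and hyperparameters below are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label in {0, 1})


def local_update(weights: Sequence[float], data: Sequence[Example],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Train a copy of the global weights on one client's private data.

    Toy logistic regression with plain SGD (an assumption for illustration).
    The raw examples never leave this function; only weights are returned.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi   # gradient step on the log loss
    return w


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Average client weights, weighted by each client's number of examples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def federated_round(global_weights: Sequence[float],
                    clients: Sequence[Sequence[Example]]) -> List[float]:
    """One communication round: local training on every client, then averaging."""
    updates = [local_update(global_weights, data) for data in clients]
    sizes = [len(data) for data in clients]
    return fed_avg(updates, sizes)
```

Calling `federated_round` repeatedly, feeding each result back in as the next `global_weights`, gives the continuous improvement of the central model that the abstract refers to. Personalization, in this sketch, would correspond to running `local_update` once more on a client's target-specific data after the final round.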

Related papers

MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning [56.79292318645454]
Large Language Models (LLMs) are susceptible to adversarial attacks such as jailbreaking. This vulnerability is exacerbated in multilingual settings, where multilingual safety-aligned data are often limited. We propose an approach to build a multilingual guardrail with reasoning.
arXiv Detail & Related papers (2025-04-21T17:15:06Z)
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages [12.038482067686544]
AfriHate is a collection of hate speech and abusive language datasets in 15 African languages. Each instance in AfriHate is annotated by native speakers familiar with the local culture.
arXiv Detail & Related papers (2025-01-14T18:00:07Z)
A Survey on Automatic Online Hate Speech Detection in Low-Resource Languages [0.5825410941577593]
Social media and easy accessibility of the internet have facilitated the spread of hate speech. This article provides a detailed survey of hate speech detection in low-resource languages around the world.
arXiv Detail & Related papers (2024-11-28T09:42:53Z)
A Federated Learning Approach to Privacy Preserving Offensive Language Identification [14.487531876937247]
We propose a privacy-preserving architecture for identifying offensive language online by introducing Federated Learning (FL). FL is a decentralized architecture that allows multiple models to be trained locally without the need for data sharing. We trained multiple deep learning models on four publicly available English benchmark datasets.
arXiv Detail & Related papers (2024-04-17T15:23:12Z)
Target Span Detection for Implicit Harmful Content [18.84674403712032]
We focus on identifying implied targets of hate speech, essential for recognizing subtler hate speech and enhancing the detection of harmful content on digital platforms. We collect and annotate target spans in three prominent implicit hate speech datasets: SBIC, DynaHate, and IHC. Our experiments indicate that Implicit-Target-Span provides a challenging test bed for target span detection methods.
arXiv Detail & Related papers (2024-03-28T21:15:15Z)
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks [66.78640306687227]
To protect privacy and meet legal regulations, federated learning (FL) has gained significant attention for training speech-to-text (S2T) systems. The commonly used FL approach (i.e., FedAvg) in S2T tasks typically suffers from extensive communication overhead. We propose a personalized federated S2T framework that introduces FedLoRA, a lightweight LoRA module for client-side tuning and interaction with the server, and FedMem, a global model equipped with a $k$-near
arXiv Detail & Related papers (2024-01-18T15:39:38Z)
MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection [2.433983268807517]
Hate speech poses significant social, psychological, and occasionally physical threats to targeted individuals and communities. Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training. We scrutinized over 60 datasets, selectively integrating those pertinent into MetaHate. Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models.
arXiv Detail & Related papers (2024-01-12T11:54:53Z)
Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems. This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages [35.185808055004344]
Most hate speech datasets so far focus on English-language content. More data is needed, but annotating hateful content is expensive, time-consuming and potentially harmful to annotators. We explore data-efficient strategies for expanding hate speech detection into under-resourced languages.
arXiv Detail & Related papers (2022-10-20T15:49:00Z)
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw. At the heart of the approach is a single multilingual token-free Charformer model. We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences. We also propose COLDetector to study output offensiveness of popular Chinese language models. Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages. We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language. We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
An Online Multilingual Hate speech Recognition System [13.87667165678441]
We analyse six datasets by combining them into a single homogeneous dataset and classify them into three classes: abusive, hateful, or neither. We create a tool which identifies and scores a page with an effective metric in near-real time and uses the same as feedback to re-train our model. We prove the competitive performance of our multilingual model on two languages, English and Hindi, leading to comparable or superior performance to most monolingual models.
arXiv Detail & Related papers (2020-11-23T16:33:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.