Detect Hate Speech in Unseen Domains using Multi-Task Learning: A Case Study of Political Public Figures
- URL: http://arxiv.org/abs/2208.10598v1
- Date: Mon, 22 Aug 2022 21:13:38 GMT
- Title: Detect Hate Speech in Unseen Domains using Multi-Task Learning: A Case Study of Political Public Figures
- Authors: Lanqin Yuan and Marian-Andrei Rizoiu
- Abstract summary: We propose a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets.
We show strong results when examining generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets.
We also assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American Public Political Figures.
- Score: 7.52579126252489
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic identification of hateful and abusive content is vital in combating
the spread of harmful online content and its damaging effects. Most existing
works evaluate models by examining the generalization error on train-test
splits on hate speech datasets. These datasets often differ in their
definitions and labeling criteria, leading to poor model performance when
predicting across new domains and datasets. In this work, we propose a new
Multi-task Learning (MTL) pipeline that trains simultaneously across multiple
hate speech datasets to construct a more encompassing classification model.
We simulate evaluation on previously unseen datasets
by adopting a leave-one-out scheme in which we omit a target dataset from
training and jointly train on the other datasets. Our results consistently
outperform a large sample of existing work. We show strong results when
examining generalization error in train-test splits and substantial
improvements when predicting on previously unseen datasets. Furthermore, we
assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of
American Public Political Figures. We automatically detect problematic speech
in the 305,235 tweets in PubFigs, and we uncover insights into the posting
behaviors of public figures.
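To make the pipeline concrete, here is a minimal sketch (not the authors' released code) of multi-task training with a shared encoder, one classification head per source dataset, and leave-one-out evaluation on a held-out dataset; the dataset names and synthetic features are placeholders.

```python
# Sketch of MTL with a shared encoder and per-dataset heads, evaluated
# leave-one-out on an unseen dataset. All data here is synthetic.
import torch
import torch.nn as nn

torch.manual_seed(0)

DATASETS = ["davidson", "founta", "waseem", "pubfigs"]  # hypothetical names
DIM, HID = 64, 32

def synthetic(n=128):
    return torch.randn(n, DIM), torch.randint(0, 2, (n,))

class MTLClassifier(nn.Module):
    def __init__(self, heads):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(DIM, HID), nn.ReLU())
        # one output head per training dataset; label schemes need not align
        self.heads = nn.ModuleDict({h: nn.Linear(HID, 2) for h in heads})

    def forward(self, x, head):
        return self.heads[head](self.encoder(x))

target = "pubfigs"                        # the left-out dataset
train_sets = [d for d in DATASETS if d != target]
data = {d: synthetic() for d in DATASETS}

model = MTLClassifier(train_sets)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for d in train_sets:                  # round-robin over source datasets
        x, y = data[d]
        opt.zero_grad()
        loss = loss_fn(model(x, d), y)
        loss.backward()
        opt.step()

# Predict on the unseen dataset with any trained head (or an ensemble);
# the shared encoder carries the transferable signal across domains.
x, y = data[target]
with torch.no_grad():
    preds = model(x, train_sets[0]).argmax(dim=1)
print(f"accuracy on unseen '{target}': {(preds == y).float().mean().item():.3f}")
```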
Related papers
- Latent Feature-based Data Splits to Improve Generalisation Evaluation: A
Hate Speech Detection Case Study [33.1099258648462]
We present two split variants that reveal how models catastrophically fail on blind spots in the latent space.
Our analysis suggests that there is no clear surface-level property of the data split that correlates with the decreased performance.
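A minimal sketch of the general idea, under the assumption that the splits are built by clustering latent embeddings and holding out whole clusters (the paper's exact split construction may differ):

```python
# Latent-feature split: embed texts, cluster the embeddings, and hold out
# whole clusters so the test set lives in a "blind spot" of the model.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 16))   # stand-in for sentence embeddings

clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(latent)
held_out = [0, 1]                      # clusters reserved for testing
test_mask = np.isin(clusters, held_out)
train_idx, test_idx = np.where(~test_mask)[0], np.where(test_mask)[0]
print(len(train_idx), "train /", len(test_idx), "test examples")
```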
arXiv Detail & Related papers (2023-11-16T23:49:55Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text.
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection [10.014248704653]
This study investigates the effectiveness and adaptability of pre-trained and fine-tuned Large Language Models (LLMs) in identifying hate speech.
LLMs offer a substantial advantage over the state of the art, even without pretraining.
We conclude with a vision for the future of hate speech detection, emphasizing cross-domain generalizability.
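As a hedged illustration of using a pre-trained model without task-specific training, the following sketch runs zero-shot hate speech classification via Hugging Face's zero-shot-classification pipeline; the model choice and candidate labels are assumptions, not the paper's benchmark setup:

```python
# Zero-shot hate speech detection with an off-the-shelf NLI model.
from transformers import pipeline

clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = clf(
    "example tweet text here",  # placeholder input
    candidate_labels=["hate speech", "not hate speech"],
)
# labels/scores are returned sorted by score, highest first
print(result["labels"][0], result["scores"][0])
```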
arXiv Detail & Related papers (2023-10-29T10:07:32Z)
- Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation [5.16706940452805]
We perform a large-scale cross-dataset comparison where we fine-tune language models on different hate speech detection datasets.
This analysis shows how some datasets are more generalisable than others when used as training data.
Experiments show how combining hate speech detection datasets can contribute to the development of robust hate speech detection models.
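A minimal sketch of such a cross-dataset evaluation matrix, with logistic regression on synthetic features standing in for the fine-tuned language models and hypothetical dataset names:

```python
# Train on each dataset, test on every other, to see which training
# sets generalise best.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
datasets = {name: (rng.normal(size=(200, 32)), rng.integers(0, 2, 200))
            for name in ["A", "B", "C"]}  # hypothetical dataset names

for src, (Xs, ys) in datasets.items():
    model = LogisticRegression(max_iter=1000).fit(Xs, ys)
    for tgt, (Xt, yt) in datasets.items():
        score = f1_score(yt, model.predict(Xt))
        print(f"train {src} -> test {tgt}: F1={score:.2f}")
```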
arXiv Detail & Related papers (2023-07-04T12:22:40Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
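One generic way to train with attribution maps (an assumed illustration, not necessarily the paper's exact scheme) is to add an input-gradient attribution penalty to the task loss:

```python
# Regularise a classifier by penalising attribution mass on dimensions
# assumed to be irrelevant (input-gradient saliency as the attribution map).
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 20)
y = torch.randint(0, 2, (64,))
irrelevant = torch.zeros(20)
irrelevant[10:] = 1.0  # assumed nuisance dimensions

for step in range(100):
    x.requires_grad_(True)
    logits = model(x)
    task_loss = nn.functional.cross_entropy(logits, y)
    # attribution map: gradient of the top logit w.r.t. the input
    attr = torch.autograd.grad(logits.max(dim=1).values.sum(), x,
                               create_graph=True)[0]
    reg = (attr.abs() * irrelevant).mean()  # penalise nuisance attributions
    opt.zero_grad()
    (task_loss + 0.1 * reg).backward()
    opt.step()
    x = x.detach()
```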
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- The Change that Matters in Discourse Parsing: Estimating the Impact of Domain Shift on Parser Error [14.566990078034241]
We use a statistic from the theoretical domain adaptation literature which can be directly tied to error-gap.
We study the bias of this statistic as an estimator of error-gap both theoretically and through a large-scale empirical study of over 2400 experiments on 6 discourse datasets.
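For intuition, a minimal sketch of one standard statistic from that literature, the proxy A-distance (assumed here as an illustration; the paper may use a related quantity), which converts a domain classifier's error into a divergence estimate:

```python
# Train a classifier to tell source from target features; a low error
# means the domains are far apart, giving a divergence tied to error-gap.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(500, 16))
target = rng.normal(0.5, 1.0, size=(500, 16))   # shifted domain

X = np.vstack([source, target])
d = np.array([0] * 500 + [1] * 500)             # domain labels
acc = cross_val_score(LogisticRegression(max_iter=1000), X, d, cv=5).mean()
proxy_a_distance = 2 * (1 - 2 * (1 - acc))      # d_A = 2(1 - 2*err)
print(f"domain accuracy={acc:.2f}, proxy A-distance={proxy_a_distance:.2f}")
```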
arXiv Detail & Related papers (2022-03-21T20:04:23Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
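A minimal sketch of an IoU-thresholded recall with a simple discount; the exact discount used by dR@n,IoU@m is defined in the paper, so the boundary-error penalty below is only an assumed illustration:

```python
# IoU-thresholded recall for temporal grounding, discounted by boundary error.
def temporal_iou(a, b):
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def discounted_recall(preds, gt, m=0.5, duration=30.0):
    """preds: top-n predicted (start, end) spans; gt: ground-truth span."""
    for p in preds:
        if temporal_iou(p, gt) >= m:
            # assumed discount: penalise boundary error relative to duration
            err = (abs(p[0] - gt[0]) + abs(p[1] - gt[1])) / (2 * duration)
            return 1.0 - err
    return 0.0

print(discounted_recall([(4.5, 10.5)], (5.0, 10.0), m=0.5))
```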
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching [53.27673119360868]
Referring expression grounding is an important and challenging task in computer vision.
We propose a novel bidirectional cross-modal matching (BiCM) framework to address these challenges.
Our framework outperforms previous works by 6.55% and 9.94% on two popular grounding datasets.
arXiv Detail & Related papers (2022-01-18T01:13:19Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
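A minimal sketch of Feature Density, here assumed as the ratio of unique features to total features (a common formulation), as a cheap estimate of a dataset's linguistic complexity:

```python
# Feature Density over word n-grams: unique features / total features.
def feature_density(docs, ngram=1):
    feats = []
    for doc in docs:
        toks = doc.lower().split()
        feats += [" ".join(toks[i:i + ngram])
                  for i in range(len(toks) - ngram + 1)]
    return len(set(feats)) / len(feats) if feats else 0.0

corpus = ["you are awful", "you are kind", "such an awful take"]
print(f"unigram FD = {feature_density(corpus):.3f}")
print(f"bigram  FD = {feature_density(corpus, ngram=2):.3f}")
```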
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Leveraging Multi-domain, Heterogeneous Data using Deep Multitask Learning for Hate Speech Detection [21.410160004193916]
We propose Convolutional Neural Network-based multi-task learning (MTL) models to leverage information from multiple sources.
Empirical analysis performed on three benchmark datasets shows the efficacy of the proposed approach.
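A minimal sketch of such an architecture (layer sizes and task names are assumptions): shared convolution filters over token embeddings, with a separate output head per source dataset:

```python
# CNN-based multi-task text classifier: shared embedding + convolution,
# per-task linear heads over a global max pool.
import torch
import torch.nn as nn

class CNNMTL(nn.Module):
    def __init__(self, vocab=5000, emb=50, tasks=("taskA", "taskB")):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, 64, kernel_size=3, padding=1)
        self.heads = nn.ModuleDict({t: nn.Linear(64, 2) for t in tasks})

    def forward(self, token_ids, task):
        h = self.emb(token_ids).transpose(1, 2)         # (B, emb, seq)
        h = torch.relu(self.conv(h)).max(dim=2).values  # global max pool
        return self.heads[task](h)

model = CNNMTL()
logits = model(torch.randint(0, 5000, (8, 40)), "taskA")
print(logits.shape)  # torch.Size([8, 2])
```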
arXiv Detail & Related papers (2021-03-23T09:31:01Z)