Into the LAIONs Den: Investigating Hate in Multimodal Datasets
- URL: http://arxiv.org/abs/2311.03449v1
- Date: Mon, 6 Nov 2023 19:00:05 GMT
- Title: Into the LAIONs Den: Investigating Hate in Multimodal Datasets
- Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti,
Alexandra Sasha Luccioni
- Abstract summary: This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values computed from images alone does not exclude all the harmful content in alt-text.
- Score: 67.21783778038645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 'Scale the model, scale the data, scale the compute' is the reigning
sentiment in the world of generative AI today. While the impact of model
scaling has been extensively studied, we are only beginning to scratch the
surface of data scaling and its consequences. This is especially of critical
importance in the context of vision-language datasets such as LAION. These
datasets are continually growing in size and are built based on large-scale
internet dumps such as the Common Crawl, which is known to have numerous
drawbacks spanning quality, legality, and content. The datasets then serve
as the backbone for large generative models, contributing to the
operationalization and perpetuation of harmful societal and historical biases
and stereotypes. In this paper, we investigate the effect of scaling datasets
on hateful content through a comparative audit of two datasets: LAION-400M and
LAION-2B. Our results show that hate content increased by nearly 12% with
dataset scale, measured both qualitatively and quantitatively using a metric
that we term Hate Content Rate (HCR). We also found that filtering dataset
contents based on Not Safe For Work (NSFW) values calculated from images alone
does not exclude all the harmful content in alt-text. Instead, we found
that trace amounts of hateful, targeted, and aggressive text remain even when
carrying out conservative filtering. We end with a reflection and a discussion
of the significance of our results for dataset curation and usage in the AI
community. Code and the meta-data assets curated in this paper are publicly
available at https://github.com/vinayprabhu/hate_scaling. Content warning: This
paper contains examples of hateful text that might be disturbing, distressing,
and/or offensive.
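As a rough illustration of the audit described above, the sketch below computes a Hate Content Rate as the fraction of samples whose alt-text is flagged as hateful, then checks how much of that content survives image-only NSFW filtering. The column names, file paths, and thresholds are illustrative assumptions, not the paper's actual schema or settings; the authors' exact procedure is in the released code linked above.

```python
import pandas as pd

# Assumed, illustrative values -- the paper's metadata schema, hate-scoring
# model, and NSFW cutoffs may differ from these.
HATE_THRESHOLD = 0.5   # assumed cutoff for flagging an alt-text as hateful
NSFW_CUTOFF = 0.9      # assumed image-based NSFW score used for filtering

def hate_content_rate(df: pd.DataFrame, score_col: str = "hate_score") -> float:
    """Fraction of samples whose alt-text hate score exceeds the threshold."""
    flagged = (df[score_col] >= HATE_THRESHOLD).sum()
    return flagged / len(df)

# Per-sample scores for the two dataset snapshots (hypothetical file paths).
laion_400m = pd.read_parquet("laion400m_scores.parquet")
laion_2b = pd.read_parquet("laion2b_scores.parquet")

hcr_400m = hate_content_rate(laion_400m)
hcr_2b = hate_content_rate(laion_2b)
print(f"Relative HCR change with scale: {(hcr_2b - hcr_400m) / hcr_400m:.1%}")

# Image-only NSFW filtering: keep rows whose image NSFW score is below the
# cutoff, then measure how much hateful alt-text survives the filter.
kept = laion_2b[laion_2b["nsfw_image_score"] < NSFW_CUTOFF]
print(f"HCR after image-only NSFW filtering: {hate_content_rate(kept):.4%}")
```

Under this formulation, the paper's central observations correspond to the HCR of the larger snapshot being roughly 12% higher than that of the smaller one, and the NSFW-filtered subset still retaining a nonzero HCR.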
Related papers
- Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic [99.3682210827572]
Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.
Data curation strategies are typically developed agnostic of the available compute for training.
We introduce neural scaling laws that account for the non-homogeneous nature of web data.
arXiv Detail & Related papers (2024-04-10T17:27:54Z) - On Hate Scaling Laws For Data-Swamps [14.891493485229251]
We show that the presence of hateful content in datasets, when measured with a Hate Content Rate (HCR) metric, increased by nearly 12%.
As scale increased, the tendency of the model to associate images of human faces with the 'human being' class over 7 other offensive classes reduced by half.
For the Black female category, the tendency of the model to associate their faces with the 'criminal' class doubled, while quintupling for Black male faces.
arXiv Detail & Related papers (2023-06-22T18:00:17Z) - BERT-based Ensemble Approaches for Hate Speech Detection [1.8734449181723825]
This paper focuses on classifying hate speech in social media using multiple deep models.
We evaluated several ensemble techniques, including soft voting, maximum value, hard voting, and stacking.
Experiments showed good results, especially for the ensemble models: stacking achieved an F1 score of 97% on the Davidson dataset, and the aggregating ensembles reached 77% on the DHO dataset.
arXiv Detail & Related papers (2022-09-14T09:08:24Z) - Detect Hate Speech in Unseen Domains using Multi-Task Learning: A Case
Study of Political Public Figures [7.52579126252489]
We propose a new Multi-Task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets.
We show strong results when examining generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets.
We also assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American Public Political Figures.
arXiv Detail & Related papers (2022-08-22T21:13:38Z) - Multimodal datasets: misogyny, pornography, and malignant stereotypes [2.8682942808330703]
We examine the recently released LAION-400M dataset, which is a CLIP-filtered dataset of Image-Alt-text pairs parsed from the Common-Crawl dataset.
We found that the dataset contains troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content.
arXiv Detail & Related papers (2021-10-05T11:47:27Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z) - Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content.
The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory, and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z) - REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.