Into the LAIONs Den: Investigating Hate in Multimodal Datasets
- URL: http://arxiv.org/abs/2311.03449v1
- Date: Mon, 6 Nov 2023 19:00:05 GMT
- Title: Into the LAIONs Den: Investigating Hate in Multimodal Datasets
- Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti,
Alexandra Sasha Luccioni
- Abstract summary: This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values computed from images alone does not exclude all the harmful content in alt-text.
- Score: 67.21783778038645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 'Scale the model, scale the data, scale the compute' is the reigning
sentiment in the world of generative AI today. While the impact of model
scaling has been extensively studied, we are only beginning to scratch the
surface of data scaling and its consequences. This is especially of critical
importance in the context of vision-language datasets such as LAION. These
datasets are continually growing in size and are built based on large-scale
internet dumps such as the Common Crawl, which is known to have numerous
drawbacks spanning quality, legality, and content. The datasets then serve
as the backbone for large generative models, contributing to the
operationalization and perpetuation of harmful societal and historical biases
and stereotypes. In this paper, we investigate the effect of scaling datasets
on hateful content through a comparative audit of two datasets: LAION-400M and
LAION-2B. Our results show that hate content increased by nearly 12% with
dataset scale, measured both qualitatively and quantitatively using a metric
that we term Hate Content Rate (HCR). We also found that filtering dataset
contents based on Not Safe For Work (NSFW) values calculated from images alone
does not exclude all the harmful content in alt-text. Instead, we found
that trace amounts of hateful, targeted, and aggressive text remain even when
carrying out conservative filtering. We end with a reflection and a discussion
of the significance of our results for dataset curation and usage in the AI
community. Code and the meta-data assets curated in this paper are publicly
available at https://github.com/vinayprabhu/hate_scaling. Content warning: This
paper contains examples of hateful text that might be disturbing, distressing,
and/or offensive.
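As a rough illustration of the audit described above, the sketch below computes a Hate Content Rate as the fraction of samples whose alt-text is flagged as hateful, then checks how much of that content survives image-only NSFW filtering. The column names, file paths, and thresholds are illustrative assumptions, not the paper's actual schema or settings; the authors' exact procedure is in the released code linked above.

```python
import pandas as pd

# Assumed, illustrative values -- the paper's metadata schema, hate-scoring
# model, and NSFW cutoffs may differ from these.
HATE_THRESHOLD = 0.5   # assumed cutoff for flagging an alt-text as hateful
NSFW_CUTOFF = 0.9      # assumed image-based NSFW score used for filtering

def hate_content_rate(df: pd.DataFrame, score_col: str = "hate_score") -> float:
    """Fraction of samples whose alt-text hate score exceeds the threshold."""
    flagged = (df[score_col] >= HATE_THRESHOLD).sum()
    return flagged / len(df)

# Per-sample scores for the two dataset snapshots (hypothetical file paths).
laion_400m = pd.read_parquet("laion400m_scores.parquet")
laion_2b = pd.read_parquet("laion2b_scores.parquet")

hcr_400m = hate_content_rate(laion_400m)
hcr_2b = hate_content_rate(laion_2b)
print(f"Relative HCR change with scale: {(hcr_2b - hcr_400m) / hcr_400m:.1%}")

# Image-only NSFW filtering: keep rows whose image NSFW score is below the
# cutoff, then measure how much hateful alt-text survives the filter.
kept = laion_2b[laion_2b["nsfw_image_score"] < NSFW_CUTOFF]
print(f"HCR after image-only NSFW filtering: {hate_content_rate(kept):.4%}")
```

Under this formulation, the paper's central observations correspond to the HCR of the larger snapshot being roughly 12% higher than that of the smaller one, and the NSFW-filtered subset still retaining a nonzero HCR.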
Related papers
- Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic [99.3682210827572]
Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.
Data curation strategies are typically developed agnostic of the available compute for training.
We introduce neural scaling laws that account for the non-homogeneous nature of web data.
arXiv Detail & Related papers (2024-04-10T17:27:54Z) - On Hate Scaling Laws For Data-Swamps [14.891493485229251]
We show that the presence of hateful content in datasets, when measured with a Hate Content Rate (HCR) metric, increased by nearly 12%.
As scale increased, the tendency of the model to associate images of human faces with the 'human being' class over 7 other offensive classes reduced by half.
For the Black female category, the tendency of the model to associate their faces with the 'criminal' class doubled, while quintupling for Black male faces.
arXiv Detail & Related papers (2023-06-22T18:00:17Z) - BERT-based Ensemble Approaches for Hate Speech Detection [1.8734449181723825]
This paper focuses on classifying hate speech in social media using multiple deep models.
We evaluated several ensemble techniques, including soft voting, maximum value, hard voting, and stacking.
Experiments showed good results, especially for the ensemble models: stacking achieved an F1 score of 97% on the Davidson dataset, and the aggregating ensembles reached 77% on the DHO dataset.
arXiv Detail & Related papers (2022-09-14T09:08:24Z) - Detect Hate Speech in Unseen Domains using Multi-Task Learning: A Case
Study of Political Public Figures [7.52579126252489]
We propose a new Multi-Task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets.
We show strong results when examining generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets.
We also assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American Public Political Figures.
arXiv Detail & Related papers (2022-08-22T21:13:38Z) - Multimodal datasets: misogyny, pornography, and malignant stereotypes [2.8682942808330703]
We examine the recently released LAION-400M dataset, which is a CLIP-filtered dataset of Image-Alt-text pairs parsed from the Common-Crawl dataset.
We found that the dataset contains troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content.
arXiv Detail & Related papers (2021-10-05T11:47:27Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z) - Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content.
The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory, and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z) - REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset.
It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.