Centering the Margins: Outlier-Based Identification of Harmed
Populations in Toxicity Detection
- URL: http://arxiv.org/abs/2305.14735v3
- Date: Fri, 1 Dec 2023 19:45:25 GMT
- Title: Centering the Margins: Outlier-Based Identification of Harmed
Populations in Toxicity Detection
- Authors: Vyoma Raman, Eve Fleisig, Dan Klein
- Abstract summary: We operationalize the "margins" of a dataset by employing outlier detection to identify text about people with demographic attributes distant from the "norm"
We find that model performance is consistently worse for demographic outliers, with mean squared error (MSE) up to 70.4% higher for outliers than for non-outliers across toxicity types.
Performance is also worse for text outliers, with an MSE up to 68.4% higher for outliers than for non-outliers.
- Score: 40.70358114333233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The impact of AI models on marginalized communities has traditionally been
measured by identifying performance differences between specified demographic
subgroups. Though this approach aims to center vulnerable groups, it risks
obscuring patterns of harm faced by intersectional subgroups or shared across
multiple groups. To address this, we draw on theories of marginalization from
disability studies and related disciplines, which state that people farther
from the norm face greater adversity, to consider the "margins" in the domain
of toxicity detection. We operationalize the "margins" of a dataset by
employing outlier detection to identify text about people with demographic
attributes distant from the "norm". We find that model performance is
consistently worse for demographic outliers, with mean squared error (MSE)
between outliers and non-outliers up to 70.4% worse across toxicity types. It
is also worse for text outliers, with a MSE up to 68.4% higher for outliers
than non-outliers. We also find text and demographic outliers to be
particularly susceptible to errors in the classification of severe toxicity and
identity attacks. Compared to analysis of disparities using traditional
demographic breakdowns, we find that our outlier analysis frequently surfaces
greater harms faced by a larger, more intersectional group, which suggests that
outlier analysis is particularly beneficial for identifying harms against those
groups.
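
The procedure described in the abstract can be illustrated end to end with off-the-shelf tooling. The following is a minimal, hypothetical sketch and not the authors' released code: it assumes scikit-learn's IsolationForest as the outlier detector and uses synthetic stand-ins for the per-comment demographic attribute vectors, gold toxicity scores, and model predictions; the paper's actual detector, features, and dataset may differ.

```python
# Hypothetical sketch (not the paper's released code): flag demographic
# outliers with an off-the-shelf detector, then compare a toxicity model's
# regression error between outliers and non-outliers.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Toy stand-ins: per-comment demographic attribute vectors (e.g. the fraction
# of annotators tagging each identity group), gold toxicity scores in [0, 1],
# and a model's predicted toxicity scores.
n_comments, n_attributes = 1_000, 24
demographics = rng.random((n_comments, n_attributes))
gold_toxicity = rng.random(n_comments)
predicted_toxicity = np.clip(gold_toxicity + rng.normal(0.0, 0.1, n_comments), 0.0, 1.0)

# Operationalize the "margins": comments whose demographic profile is far from
# the dataset norm are flagged as outliers (the contamination rate here is an
# assumption, not a value taken from the paper).
detector = IsolationForest(contamination=0.1, random_state=0)
is_outlier = detector.fit_predict(demographics) == -1

mse_outliers = mean_squared_error(gold_toxicity[is_outlier], predicted_toxicity[is_outlier])
mse_inliers = mean_squared_error(gold_toxicity[~is_outlier], predicted_toxicity[~is_outlier])
print(f"MSE, demographic outliers: {mse_outliers:.4f}")
print(f"MSE, non-outliers:         {mse_inliers:.4f}")
print(f"Relative gap:              {100 * (mse_outliers - mse_inliers) / mse_inliers:+.1f}%")
```

The same comparison could be repeated with text embeddings in place of demographic attributes to flag text outliers, and computed per toxicity type (e.g., severe toxicity, identity attack) to mirror the breakdown reported in the abstract.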
Related papers
- Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness [10.194622474615462]
Large language models (LLMs) are known to exhibit demographic biases, yet few studies systematically evaluate these biases across multiple datasets or account for confounding factors.
Our findings reveal that while demographic traits, particularly race, influence alignment, these effects are inconsistent across datasets and often entangled with other factors.
arXiv Detail & Related papers (2024-11-13T19:08:23Z) - Regularized Contrastive Partial Multi-view Outlier Detection [76.77036536484114]
We propose a novel method named Regularized Contrastive Partial Multi-view Outlier Detection (RCPMOD)
In this framework, we utilize contrastive learning to learn view-consistent information and distinguish outliers by the degree of consistency.
Experimental results on four benchmark datasets demonstrate that our proposed approach could outperform state-of-the-art competitors.
arXiv Detail & Related papers (2024-08-02T14:34:27Z) - Adversarial Robustness of VAEs across Intersectional Subgroups [4.420073761023326]
Variational Autoencoders (VAEs) show stronger resistance to adversarial perturbations compared to deterministic AEs.
This study evaluates the robustness of VAEs against non-targeted adversarial attacks.
arXiv Detail & Related papers (2024-07-04T11:53:51Z) - Social Bias Probing: Fairness Benchmarking for Language Models [38.180696489079985]
This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment.
We curate SoFa, a large-scale benchmark designed to address the limitations of existing fairness collections.
We show that biases within language models are more nuanced than acknowledged, indicating a broader scope of encoded biases than previously recognized.
arXiv Detail & Related papers (2023-11-15T16:35:59Z) - Towards Fair Face Verification: An In-depth Analysis of Demographic
Biases [11.191375513738361]
Deep learning-based person identification and verification systems have improved remarkably in accuracy in recent years.
However, such systems have been found to exhibit significant biases related to race, age, and gender.
This paper presents an in-depth analysis, with a particular emphasis on the intersectionality of these demographic factors.
arXiv Detail & Related papers (2023-07-19T14:49:14Z) - Measuring Fairness Under Unawareness of Sensitive Attributes: A
Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
arXiv Detail & Related papers (2021-09-17T13:45:46Z) - Cross-geographic Bias Detection in Toxicity Modeling [9.128264779870538]
We introduce a weakly supervised method to robustly detect lexical biases in broader geocultural contexts.
We demonstrate that our method identifies salient groups of errors, and, in a follow-up, demonstrate that these groupings reflect human judgments of offensive and inoffensive language in those geographic contexts.
arXiv Detail & Related papers (2021-04-14T17:32:05Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z) - Mitigating Face Recognition Bias via Group Adaptive Classifier [53.15616844833305]
This work aims to learn a fair face representation, where faces of every group could be more equally represented.
Our work is able to mitigate face recognition bias across demographic groups while maintaining the competitive accuracy.
arXiv Detail & Related papers (2020-06-13T06:43:37Z) - Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
In particular, face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.