"One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System Behaviors
- URL: http://arxiv.org/abs/2310.15398v2
- Date: Wed, 3 Apr 2024 07:20:15 GMT
- Title: "One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System Behaviors
- Authors: Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna Wallach, Alexandra Olteanu
- Abstract summary: We conduct case studies in which we perturb different types of identity-related language features (names, roles, locations, dialect, and style) in NLG system inputs.
We find that motivations for adaptation include social norms, cultural differences, feature-specific information, and accommodation.
In contrast, motivations for invariance include perspectives that favor prescriptivism, view adaptation as unnecessary or too difficult for NLG systems to do appropriately, and are wary of false assumptions.
- Score: 57.63649797577999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fairness-related assumptions about what constitute appropriate NLG system behaviors range from invariance, where systems are expected to behave identically across social groups, to adaptation, where behaviors should instead vary across them. To illuminate tensions around invariance and adaptation, we conduct five case studies, in which we perturb different types of identity-related language features (names, roles, locations, dialect, and style) in NLG system inputs. Through these case studies, we examine people's expectations of system behaviors, and surface potential caveats of these contrasting yet commonly held assumptions. We find that motivations for adaptation include social norms, cultural differences, feature-specific information, and accommodation; in contrast, motivations for invariance include perspectives that favor prescriptivism, view adaptation as unnecessary or too difficult for NLG systems to do appropriately, and are wary of false assumptions. Our findings highlight open challenges around what constitute "fair" or "good" NLG system behaviors.
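To make the setup concrete, here is a minimal sketch of this kind of perturbation-based probing, assuming a hypothetical `generate` function in place of a real NLG system; the template and name list are illustrative only, not the paper's materials:

```python
# A minimal sketch of perturbation-based probing of an NLG system.
def generate(prompt: str) -> str:
    # Hypothetical stand-in for an NLG system; not the authors' setup.
    return f"[system output for: {prompt}]"

# Hold the input fixed except for one identity-related feature (here, a name).
TEMPLATE = "Write a short bio for {name}, a software engineer."
NAMES = ["Emily", "Lakisha", "Jamal", "Brett"]  # illustrative only

outputs = {name: generate(TEMPLATE.format(name=name)) for name in NAMES}

# Invariance expectation: outputs should be (near-)identical up to the name;
# adaptation expectation: outputs may legitimately vary with the feature.
for name, text in outputs.items():
    print(f"{name}: {text}")
```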
Related papers
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
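The summary does not spell out the attribution method; as a rough illustration of the general idea, the sketch below scores individual units by how differently they activate on two contrasting input sets, using synthetic activations in place of real forward passes through a PLM:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-neuron activations collected on two contrasting
# input sets (e.g., sentences mentioning different social groups). In practice
# these would come from forward passes through a pre-trained language model.
acts_group_a = rng.normal(0.0, 1.0, size=(200, 3072))  # (examples, neurons)
acts_group_b = rng.normal(0.1, 1.0, size=(200, 3072))

# Score each neuron by the gap in mean activation between the two sets;
# large gaps flag candidate "bias neurons" for closer inspection.
gap = np.abs(acts_group_a.mean(axis=0) - acts_group_b.mean(axis=0))
top_neurons = np.argsort(gap)[::-1][:10]
print("Candidate bias neurons:", top_neurons)
```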
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- FAIRO: Fairness-aware Adaptation in Sequential-Decision Making for Human-in-the-Loop Systems [8.713442325649801]
We propose FAIRO, a novel algorithm for fairness-aware sequential decision-making in Human-in-the-Loop (HITL) adaptation.
In particular, FAIRO decomposes this complex fairness task into adaptive sub-tasks based on individual human preferences.
We show that FAIRO improves fairness by 35.36% compared with other methods across all three evaluated applications.
arXiv Detail & Related papers (2023-07-12T00:35:19Z)
- ACROCPoLis: A Descriptive Framework for Making Sense of Fairness [6.4686347616068005]
We propose the ACROCPoLis framework to represent allocation processes with a modeling emphasis on fairness aspects.
The framework provides a shared vocabulary in which the factors relevant to fairness assessments for different situations and procedures are made explicit.
arXiv Detail & Related papers (2023-04-19T21:14:57Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
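A minimal sketch of the prompt-enumeration step follows; the marker sets and template are illustrative, and the paper's exact lists may differ:

```python
from itertools import product

# Illustrative marker sets for building a controlled prompt grid.
GENDER_MARKERS = ["woman", "man", "non-binary person"]
ETHNICITY_MARKERS = ["Black", "White", "East Asian", "Hispanic"]
PROFESSIONS = ["doctor", "janitor", "CEO"]

# Every combination of markers and profession yields one TTI prompt;
# variation across generated images can then be characterized per marker.
prompts = [
    f"Photo portrait of a {eth} {gen} working as a {job}"
    for eth, gen, job in product(ETHNICITY_MARKERS, GENDER_MARKERS, PROFESSIONS)
]
print(len(prompts), "prompts; e.g.,", prompts[0])
```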
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? [90.35044668396591]
A recurring theme in machine learning is algorithmic monoculture: the same systems, or systems that share components, are deployed by multiple decision-makers.
We propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes.
We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization.
We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
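To make the hypothesis concrete, the sketch below simulates several decision-makers whose decisions share a common component, then compares the observed rate of "systemic" rejection (the same person rejected by every model) against what independent models would predict; all numbers are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated reject (1) / accept (0) decisions from k models on n applicants.
# The shared signal stands in for shared components (e.g., shared training
# data); it induces correlated decisions across the k models.
n, k = 1000, 5
base = rng.random(n)                       # shared component
noise = rng.random((n, k))                 # model-specific variation
decisions = ((0.7 * base[:, None] + 0.3 * noise) > 0.5).astype(int)

# Outcome homogenization: rejection by ALL k models, versus the rate that
# fully independent models with the same marginals would produce.
systemic_rate = (decisions.sum(axis=1) == k).mean()
independent_rate = np.prod(decisions.mean(axis=0))
print(f"observed systemic rejection: {systemic_rate:.3f}")
print(f"expected if independent:    {independent_rate:.3f}")
```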
arXiv Detail & Related papers (2022-11-25T09:33:11Z)
- Survey on Fairness Notions and Related Tensions [4.257210316104905]
Automated decision systems are increasingly used to take consequential decisions in problems such as job hiring and loan granting.
However, ostensibly objective machine learning (ML) algorithms are prone to bias, which can result in unfair decisions.
This paper surveys the commonly used fairness notions and discusses the tensions among them with privacy and accuracy.
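As a small illustration of how two common notions can pull apart, the toy example below computes statistical parity (selection-rate) and equal opportunity (true-positive-rate) gaps on the same predictions; the data are made up so the first notion is satisfied while the second is violated:

```python
import numpy as np

# Toy decisions for two groups, constructed so that statistical parity holds
# (equal selection rates) while equal opportunity is violated (unequal TPRs).
y    = np.array([1, 0, 1, 0, 1, 1, 0, 0])  # true labels
yhat = np.array([1, 1, 0, 0, 1, 1, 0, 0])  # model decisions
g    = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # group membership

def selection_rate(yhat, mask):
    return yhat[mask].mean()

def true_positive_rate(y, yhat, mask):
    return yhat[mask & (y == 1)].mean()

sp_gap = selection_rate(yhat, g == 0) - selection_rate(yhat, g == 1)
eo_gap = true_positive_rate(y, yhat, g == 0) - true_positive_rate(y, yhat, g == 1)
print(f"statistical parity gap: {sp_gap:+.2f}")  # 0.00 -> parity satisfied
print(f"equal opportunity gap:  {eo_gap:+.2f}")  # -0.50 -> opportunity violated
```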
arXiv Detail & Related papers (2022-09-16T13:36:05Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
Accumulated Prediction Sensitivity measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
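The metric's exact formulation isn't given in the summary; the sketch below illustrates only the underlying idea, estimating how sensitive a toy classifier's output is to small per-feature input perturbations and averaging over data:

```python
import numpy as np

# Toy linear classifier; in practice this would be any trained text classifier.
w = np.array([0.8, -0.3, 1.2])

def predict_proba(x):
    return 1.0 / (1.0 + np.exp(-x @ w))

# Finite-difference sensitivity of the prediction to each input feature;
# accumulating this over a dataset (and weighting protected features) is the
# general idea behind prediction-sensitivity metrics (details may differ).
def sensitivity(x, eps=1e-4):
    base = predict_proba(x)
    return np.array([
        (predict_proba(x + eps * np.eye(len(x))[i]) - base) / eps
        for i in range(len(x))
    ])

X = np.random.default_rng(0).normal(size=(100, 3))
accumulated = np.abs(np.stack([sensitivity(x) for x in X])).mean(axis=0)
print("mean |sensitivity| per feature:", accumulated.round(3))
```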
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Adaptation to Unknown Situations as the Holy Grail of Learning-Based Self-Adaptive Systems: Research Directions [3.7311680121118345]
We argue that adapting to unknown situations is the ultimate challenge for self-adaptive systems.
We close by discussing whether, even when we can, we should indeed build systems that define their own behaviour and adapt their goals, without involving a human supervisor.
arXiv Detail & Related papers (2021-03-11T19:07:02Z)
- Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences [36.884156839960184]
We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings.
We introduce 'Moral Stories', a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning.
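A sketch of what one such branching record might look like as a data structure; the field names are inferred from the summary and may not match the released dataset's exact schema:

```python
from dataclasses import dataclass

@dataclass
class MoralStory:
    norm: str                 # social norm the story is grounded in
    situation: str            # context in which an actor acts
    intention: str            # the actor's goal
    moral_action: str         # norm-following way to pursue the goal
    moral_consequence: str    # likely outcome of the moral action
    immoral_action: str       # norm-violating alternative
    immoral_consequence: str  # likely outcome of the immoral action

example = MoralStory(
    norm="It is good to return things you borrow.",
    situation="Sam borrowed a ladder from a neighbor last month.",
    intention="Sam wants to tidy up the garage.",
    moral_action="Sam returns the ladder while tidying up.",
    moral_consequence="The neighbor thanks Sam and offers to lend tools again.",
    immoral_action="Sam keeps the ladder and hopes the neighbor forgets.",
    immoral_consequence="The neighbor stops lending Sam anything.",
)
```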
arXiv Detail & Related papers (2020-12-31T17:28:01Z)
- COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering [50.65816570279115]
Identifying the implicit causes and effects of a social context is a key capability for enabling machines to perform commonsense reasoning.
Current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation.
We present the Conditional SEQ2SEQ-based Mixture model (COSMO), which enables dynamic and diverse content generation.
arXiv Detail & Related papers (2020-11-02T07:08:19Z)