Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning
- URL: http://arxiv.org/abs/2306.01117v1
- Date: Thu, 1 Jun 2023 20:05:05 GMT
- Title: Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning
- Authors: Sullam Jeoung, Jana Diesner, Halil Kilicoglu
- Abstract summary: First names may serve as proxies for socio-demographic representations.
We study whether a model's reasoning given a specific input differs based on the first names provided.
- Score: 2.013330800976407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As language models continue to be integrated into applications of personal
and societal relevance, ensuring these models' trustworthiness is crucial,
particularly with respect to producing consistent outputs regardless of
sensitive attributes. Given that first names may serve as proxies for
(intersectional) socio-demographic representations, it is imperative to examine
the impact of first names on commonsense reasoning capabilities. In this paper,
we study whether a model's reasoning given a specific input differs based on
the first names provided. Our underlying assumption is that the reasoning about
Alice should not differ from the reasoning about James. We propose and
implement a controlled experimental framework to measure the causal effect of
first names on commonsense reasoning, enabling us to distinguish between
model predictions that arise by chance and those caused by the actual factors
of interest. Our results
indicate that the frequency of first names has a direct effect on model
prediction, with less frequent names yielding divergent predictions compared to
more frequent names. To gain insights into the internal mechanisms of models
that are contributing to these behaviors, we also conduct an in-depth
explainable analysis. Overall, our findings suggest that to ensure model
robustness, it is essential to augment datasets with more diverse first names
during the configuration stage.
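As a concrete illustration of the controlled intervention the abstract describes, below is a minimal sketch that swaps first names into a fixed social-commonsense template and compares a masked language model's output distributions; the model, template, name list, and total-variation measure are illustrative assumptions, not the paper's exact setup.

```python
# A minimal name-intervention probe: hold the context fixed, intervene only
# on the first name, and compare the model's distribution over the answer
# slot. Model, template, and names are hypothetical choices for this sketch.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"  # assumption: any masked LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

TEMPLATE = "{name} spilled coffee on a stranger. {name} felt [MASK]."

def mask_distribution(name: str) -> torch.Tensor:
    """Probability distribution the model assigns to the [MASK] slot."""
    inputs = tokenizer(TEMPLATE.format(name=name), return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    return torch.softmax(logits, dim=-1)

def total_variation(p: torch.Tensor, q: torch.Tensor) -> float:
    return 0.5 * (p - q).abs().sum().item()

# Because everything except the name is held fixed, any shift in the output
# distribution is attributable to the name intervention.
baseline = mask_distribution("James")
for name in ["Alice", "Nichelle", "Lakisha"]:
    print(name, round(total_variation(baseline, mask_distribution(name)), 4))
```

Aggregating such divergences over many templates, and over name lists stratified by frequency and demographic association, is what turns a probe like this into a causal estimate rather than an anecdote.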
Related papers
- Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game [0.0]
Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions.
We show that our approach can detect name-based biases in both base and instruction-tuned models.
arXiv Detail & Related papers (2024-04-23T02:21:17Z)
- Estimating the Causal Effects of Natural Logic Features in Transformer-Based NLI Models [16.328341121232484]
We apply causal effect estimation strategies to measure the effect of context interventions.
We investigate Transformers' robustness to irrelevant changes and their sensitivity to impactful changes.
arXiv Detail & Related papers (2024-04-03T10:22:35Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases [12.459949725707315]
We find that demographic attributes of a name (race, ethnicity, and gender) and name tokenization length are both factors that systematically affect the behavior of social commonsense reasoning models.
arXiv Detail & Related papers (2023-05-26T01:57:42Z)
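Since this summary singles out tokenization length as a systematic factor, a quick probe of that quantity might look like the sketch below; the tokenizer and the example names are illustrative assumptions.

```python
# Probe how many subword pieces each first name is split into; longer
# tokenizations are one factor the paper links to divergent model behavior.
# The tokenizer and the names are hypothetical choices for this sketch.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
for name in ["Nancy", "James", "Nichelle", "Lakisha"]:
    pieces = tok.tokenize(name)  # the subword pieces the model actually sees
    print(f"{name:10s} -> {pieces} ({len(pieces)} token(s))")
```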
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
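To make "causal effects from observational data alone" concrete, here is a minimal backdoor-adjustment sketch; the records, the binary treatment (a thresholded training-data statistic), and the confounder are hypothetical stand-ins, not the paper's actual estimator.

```python
# Backdoor adjustment over observational records:
#   ATE = sum_z P(z) * (E[y | t=1, z] - E[y | t=0, z])
# t: whether a training-data statistic is high (e.g., co-occurrence count),
# y: whether the model predicted correctly, z: a confounder.
# All records below are made-up toy data for this sketch.
from collections import defaultdict

records = [  # (treatment t, confounder z, outcome y)
    (1, "common_entity", 1), (1, "common_entity", 1), (0, "common_entity", 1),
    (1, "rare_entity", 1),   (0, "rare_entity", 0),   (0, "rare_entity", 0),
]

def backdoor_ate(records):
    by_z = defaultdict(lambda: {0: [], 1: []})
    for t, z, y in records:
        by_z[z][t].append(y)
    ate = 0.0
    for z, groups in by_z.items():
        p_z = (len(groups[0]) + len(groups[1])) / len(records)
        ate += p_z * (sum(groups[1]) / len(groups[1])
                      - sum(groups[0]) / len(groups[0]))
    return ate

print(f"estimated ATE of the data statistic: {backdoor_ate(records):.3f}")
```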
- Causality Inspired Representation Learning for Domain Generalization [47.574964496891404]
We introduce a general structural causal model to formalize the domain generalization problem.
Our goal is to extract the causal factors from inputs and then reconstruct the invariant causal mechanisms.
We highlight that ideal causal factors should meet three basic properties: separated from the non-causal ones, jointly independent, and causally sufficient for the classification.
arXiv Detail & Related papers (2022-03-27T08:08:33Z)
- Variational Auto-Encoder Architectures that Excel at Causal Inference [26.731576721694648]
Estimating causal effects from observational data is critical for making many types of decisions.
One approach to address this task is to learn decomposed representations of the underlying factors of data.
In this paper, we take a generative approach that builds on the recent advances in Variational Auto-Encoders.
arXiv Detail & Related papers (2021-11-11T22:37:43Z)
- On Shapley Credit Allocation for Interpretability [1.52292571922932]
We emphasize the importance of asking the right question when interpreting the decisions of a learning model.
This paper quantifies feature relevance by weaving together different kinds of interpretation, using different measures as characteristic functions for Shapley symmetrization.
arXiv Detail & Related papers (2020-12-10T08:25:32Z)
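For readers who want the Shapley machinery itself, a toy exact computation is sketched below; the feature set and the characteristic function v are hypothetical examples, not the paper's interpretation measures.

```python
# Exact Shapley values over a small feature set. Each feature's credit is its
# average marginal contribution v(S + {f}) - v(S) over all coalitions S,
# weighted by |S|! * (n - |S| - 1)! / n!.
from itertools import combinations
from math import factorial

def shapley_values(features, v):
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(s | {f}) - v(s))
        phi[f] = total
    return phi

# Hypothetical characteristic function with an interaction term, standing in
# for whatever interpretation measure one plugs in.
def v(coalition):
    base = {"name": 0.3, "context": 0.5, "verb": 0.1}
    bonus = 0.2 if {"name", "context"} <= coalition else 0.0
    return sum(base[f] for f in coalition) + bonus

print(shapley_values(["name", "context", "verb"], v))
```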
- Learning Causal Semantic Representation for Out-of-Distribution Prediction [125.38836464226092]
We propose a Causal Semantic Generative model (CSG) based on causal reasoning, so that the two factors are modeled separately.
We show that CSG can identify the semantic factor by fitting training data, and this semantic-identification guarantees the boundedness of OOD generalization error.
arXiv Detail & Related papers (2020-11-03T13:16:05Z)
- CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations.
We propose a new VAE-based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones.
Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.