Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning
- URL: http://arxiv.org/abs/2306.01117v1
- Date: Thu, 1 Jun 2023 20:05:05 GMT
- Title: Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning
- Authors: Sullam Jeoung, Jana Diesner, Halil Kilicoglu
- Abstract summary: First names may serve as proxies for socio-demographic representations.
We study whether a model's reasoning given a specific input differs based on the first names provided.
- Score: 2.013330800976407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As language models continue to be integrated into applications of personal
and societal relevance, ensuring these models' trustworthiness is crucial,
particularly with respect to producing consistent outputs regardless of
sensitive attributes. Given that first names may serve as proxies for
(intersectional) socio-demographic representations, it is imperative to examine
the impact of first names on commonsense reasoning capabilities. In this paper,
we study whether a model's reasoning given a specific input differs based on
the first names provided. Our underlying assumption is that the reasoning about
Alice should not differ from the reasoning about James. We propose and
implement a controlled experimental framework to measure the causal effect of
first names on commonsense reasoning, enabling us to distinguish between
model predictions that arise by chance and those caused by the actual factors
of interest. Our results
indicate that the frequency of first names has a direct effect on model
prediction, with less frequent names yielding divergent predictions compared to
more frequent names. To gain insights into the internal mechanisms of models
that are contributing to these behaviors, we also conduct an in-depth
explainable analysis. Overall, our findings suggest that to ensure model
robustness, it is essential to augment datasets with more diverse first names
during the configuration stage.
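As a concrete illustration of the controlled intervention the abstract describes, below is a minimal sketch that swaps first names into a fixed social-commonsense template and compares a masked language model's output distributions; the model, template, name list, and total-variation measure are illustrative assumptions, not the paper's exact setup.

```python
# A minimal name-intervention probe: hold the context fixed, intervene only
# on the first name, and compare the model's distribution over the answer
# slot. Model, template, and names are hypothetical choices for this sketch.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"  # assumption: any masked LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

TEMPLATE = "{name} spilled coffee on a stranger. {name} felt [MASK]."

def mask_distribution(name: str) -> torch.Tensor:
    """Probability distribution the model assigns to the [MASK] slot."""
    inputs = tokenizer(TEMPLATE.format(name=name), return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    return torch.softmax(logits, dim=-1)

def total_variation(p: torch.Tensor, q: torch.Tensor) -> float:
    return 0.5 * (p - q).abs().sum().item()

# Because everything except the name is held fixed, any shift in the output
# distribution is attributable to the name intervention.
baseline = mask_distribution("James")
for name in ["Alice", "Nichelle", "Lakisha"]:
    print(name, round(total_variation(baseline, mask_distribution(name)), 4))
```

Aggregating such divergences over many templates, and over name lists stratified by frequency and demographic association, is what turns a probe like this into a causal estimate rather than an anecdote.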
Related papers
- Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game [0.0]
Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions.
We show that our approach can detect name-based biases in both base and instruction-tuned models.
arXiv Detail & Related papers (2024-04-23T02:21:17Z)
- Estimating the Causal Effects of Natural Logic Features in Transformer-Based NLI Models [16.328341121232484]
We apply causal effect estimation strategies to measure the effect of context interventions.
We investigate Transformers' robustness to irrelevant changes and their sensitivity to impactful changes.
arXiv Detail & Related papers (2024-04-03T10:22:35Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases [12.459949725707315]
We find that demographic attributes of a name (race, ethnicity, and gender) and name tokenization length are both factors that systematically affect the behavior of social commonsense reasoning models.
arXiv Detail & Related papers (2023-05-26T01:57:42Z)
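Since this summary singles out tokenization length as a systematic factor, a quick probe of that quantity might look like the sketch below; the tokenizer and the example names are illustrative assumptions.

```python
# Probe how many subword pieces each first name is split into; longer
# tokenizations are one factor the paper links to divergent model behavior.
# The tokenizer and the names are hypothetical choices for this sketch.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
for name in ["Nancy", "James", "Nichelle", "Lakisha"]:
    pieces = tok.tokenize(name)  # the subword pieces the model actually sees
    print(f"{name:10s} -> {pieces} ({len(pieces)} token(s))")
```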
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
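To make "causal effects from observational data alone" concrete, here is a minimal backdoor-adjustment sketch; the records, the binary treatment (a thresholded training-data statistic), and the confounder are hypothetical stand-ins, not the paper's actual estimator.

```python
# Backdoor adjustment over observational records:
#   ATE = sum_z P(z) * (E[y | t=1, z] - E[y | t=0, z])
# t: whether a training-data statistic is high (e.g., co-occurrence count),
# y: whether the model predicted correctly, z: a confounder.
# All records below are made-up toy data for this sketch.
from collections import defaultdict

records = [  # (treatment t, confounder z, outcome y)
    (1, "common_entity", 1), (1, "common_entity", 1), (0, "common_entity", 1),
    (1, "rare_entity", 1),   (0, "rare_entity", 0),   (0, "rare_entity", 0),
]

def backdoor_ate(records):
    by_z = defaultdict(lambda: {0: [], 1: []})
    for t, z, y in records:
        by_z[z][t].append(y)
    ate = 0.0
    for z, groups in by_z.items():
        p_z = (len(groups[0]) + len(groups[1])) / len(records)
        ate += p_z * (sum(groups[1]) / len(groups[1])
                      - sum(groups[0]) / len(groups[0]))
    return ate

print(f"estimated ATE of the data statistic: {backdoor_ate(records):.3f}")
```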
- Causality Inspired Representation Learning for Domain Generalization [47.574964496891404]
We introduce a general structural causal model to formalize the domain generalization problem.
Our goal is to extract the causal factors from inputs and then reconstruct the invariant causal mechanisms.
We highlight that ideal causal factors should meet three basic properties: separated from the non-causal ones, jointly independent, and causally sufficient for the classification.
arXiv Detail & Related papers (2022-03-27T08:08:33Z)
- Variational Auto-Encoder Architectures that Excel at Causal Inference [26.731576721694648]
Estimating causal effects from observational data is critical for making many types of decisions.
One approach to address this task is to learn decomposed representations of the underlying factors of data.
In this paper, we take a generative approach that builds on the recent advances in Variational Auto-Encoders.
arXiv Detail & Related papers (2021-11-11T22:37:43Z)
- On Shapley Credit Allocation for Interpretability [1.52292571922932]
We emphasize the importance of asking the right question when interpreting the decisions of a learning model.
This paper quantifies feature relevance by weaving together different kinds of interpretation, using different measures as characteristic functions for Shapley symmetrization.
arXiv Detail & Related papers (2020-12-10T08:25:32Z)
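For readers who want the Shapley machinery itself, a toy exact computation is sketched below; the feature set and the characteristic function v are hypothetical examples, not the paper's interpretation measures.

```python
# Exact Shapley values over a small feature set. Each feature's credit is its
# average marginal contribution v(S + {f}) - v(S) over all coalitions S,
# weighted by |S|! * (n - |S| - 1)! / n!.
from itertools import combinations
from math import factorial

def shapley_values(features, v):
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(s | {f}) - v(s))
        phi[f] = total
    return phi

# Hypothetical characteristic function with an interaction term, standing in
# for whatever interpretation measure one plugs in.
def v(coalition):
    base = {"name": 0.3, "context": 0.5, "verb": 0.1}
    bonus = 0.2 if {"name", "context"} <= coalition else 0.0
    return sum(base[f] for f in coalition) + bonus

print(shapley_values(["name", "context", "verb"], v))
```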
- Learning Causal Semantic Representation for Out-of-Distribution Prediction [125.38836464226092]
We propose a Causal Semantic Generative model (CSG) based on causal reasoning, so that the two factors are modeled separately.
We show that CSG can identify the semantic factor by fitting training data, and this semantic-identification guarantees the boundedness of OOD generalization error.
arXiv Detail & Related papers (2020-11-03T13:16:05Z)
- CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations.
We propose a new VAE-based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones.
Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.