Naturalistic Causal Probing for Morpho-Syntax
- URL: http://arxiv.org/abs/2205.07043v1
- Date: Sat, 14 May 2022 11:47:58 GMT
- Title: Naturalistic Causal Probing for Morpho-Syntax
- Authors: Afra Amini, Tiago Pimentel, Clara Meister, Ryan Cotterell
- Abstract summary: We suggest a naturalistic strategy for input-level intervention on real-world data in Spanish.
Using our approach, we isolate morpho-syntactic features from confounders in sentences.
We apply this methodology to analyze causal effects of gender and number on contextualized representations extracted from pre-trained models.
- Score: 76.83735391276547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probing has become a go-to methodology for interpreting and analyzing deep
neural models in natural language processing. Yet recently, there has been much
debate around the limitations and weaknesses of probes. In this work, we
suggest a naturalistic strategy for input-level intervention on real-world data
in Spanish, a language with gender marking. Using our approach, we
isolate morpho-syntactic features from confounders in sentences, e.g. topic,
which will then allow us to causally probe pre-trained models. We apply this
methodology to analyze causal effects of gender and number on contextualized
representations extracted from pre-trained models -- BERT, RoBERTa and GPT-2.
Our experiments suggest that naturalistic intervention can give us stable
estimates of causal effects, which vary across different words in a sentence.
We further show the utility of our estimator in investigating gender bias in
adjectives, and answering counterfactual questions in masked prediction. Our
probing experiments highlight the importance of conducting causal probing in
determining whether a particular property is encoded in representations.
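To make the intervention concrete, here is a minimal sketch of the input-level idea under stated assumptions: the multilingual BERT checkpoint, the Spanish sentence pair, and cosine distance as the effect measure are illustrative choices of mine, not the authors' exact pipeline.

```python
# Minimal sketch: swap the gender of a Spanish noun phrase while holding the
# rest of the sentence (and hence the topic) fixed, then measure how far each
# token's contextual representation moves. Model and metric are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def token_representations(sentence: str) -> torch.Tensor:
    """Final-layer hidden state for each token of the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[0]  # (num_tokens, hidden_dim)

# Factual sentence and its minimally different counterfactual: only the
# morpho-syntactic gender of the subject changes.
factual = "El actor famoso llegó tarde."
counterfactual = "La actriz famosa llegó tarde."

h_fact = token_representations(factual)
h_counter = token_representations(counterfactual)

# When the two tokenizations align one-to-one, a per-position cosine
# distance gives a crude estimate of the intervention's effect.
if h_fact.shape == h_counter.shape:
    effect = 1 - torch.nn.functional.cosine_similarity(h_fact, h_counter, dim=-1)
    for tok, e in zip(tokenizer.tokenize(factual), effect[1:-1]):
        print(f"{tok:>12s}  {e.item():.4f}")
```

Note the shape guard: naturalistic counterfactual pairs do not always tokenize to the same length, which is one reason stable effect estimation is non-trivial.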
Related papers
- QUITE: Quantifying Uncertainty in Natural Language Text in Bayesian Reasoning Scenarios [15.193544498311603]
We present QUITE, a dataset of real-world Bayesian reasoning scenarios with categorical random variables and complex relationships.
We conduct an extensive set of experiments, finding that logic-based models outperform out-of-the-box large language models on all reasoning types.
Our results provide evidence that neuro-symbolic models are a promising direction for improving complex reasoning.
arXiv Detail & Related papers (2024-10-14T12:44:59Z)
- On the Proper Treatment of Tokenization in Psycholinguistics [53.960910019072436]
The paper argues that token-level language models should be marginalized into character-level language models before they are used in psycholinguistic studies.
We find various focal areas whose surprisal is a better psychometric predictor than the surprisal of the region of interest itself.
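As I read the argument, the marginalization in question sums token-level probabilities over every tokenization that spells out the same character string; in my notation, with a detokenization map κ:

```latex
p_{\mathrm{char}}(\mathbf{c}) = \sum_{\mathbf{t} \,:\, \kappa(\mathbf{t}) = \mathbf{c}} p_{\mathrm{tok}}(\mathbf{t}),
\qquad
s(\mathbf{c}) = -\log p_{\mathrm{char}}(\mathbf{c})
```

so a string's surprisal is computed from the marginal, not from any single tokenization.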
arXiv Detail & Related papers (2024-10-03T17:18:03Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
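A generic sketch of the neuron-level idea, much simplified relative to whatever attribution the paper actually uses: score each hidden unit by the gap in its mean activation between paired stereotyped and anti-stereotyped inputs. The activations below are stand-in random tensors.

```python
# Sketch of neuron-level bias attribution (my simplification): units with the
# largest activation gap across paired inputs are candidate "bias neurons".
import torch

def bias_neuron_scores(acts_stereo: torch.Tensor,
                       acts_anti: torch.Tensor) -> torch.Tensor:
    """Per-neuron absolute gap in mean activation between the two sets.

    Both inputs have shape (num_sentences, num_neurons).
    """
    return (acts_stereo.mean(dim=0) - acts_anti.mean(dim=0)).abs()

# Stand-in activations for 100 sentence pairs over 768 hidden units.
acts_stereo = torch.randn(100, 768)
acts_anti = torch.randn(100, 768)

scores = bias_neuron_scores(acts_stereo, acts_anti)
print("top candidate neurons:", scores.topk(5).indices.tolist())
```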
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- Using Artificial French Data to Understand the Emergence of Gender Bias in Transformer Language Models [5.22145960878624]
This work takes an initial step towards exploring the less researched topic of how neural models discover linguistic properties of words, such as gender, as well as the rules governing their usage.
We propose to use an artificial corpus generated by a PCFG based on French to precisely control the gender distribution in the training data and determine under which conditions a model correctly captures gender information or, on the contrary, appears gender-biased.
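A toy sketch of the corpus-generation idea (grammar and vocabulary are my own placeholders, not the paper's PCFG): the rule probabilities directly set the gender distribution the model is trained on.

```python
# Sample sentences from a tiny PCFG whose rule weights fix the proportion of
# masculine vs. feminine noun phrases in the generated training corpus.
import random

# Each nonterminal maps to a list of (expansion, probability) pairs.
PCFG = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["Det_m", "N_m"], 0.7),   # 70% masculine noun phrases
           (["Det_f", "N_f"], 0.3)],  # 30% feminine noun phrases
    "VP": [(["dort"], 0.5), (["mange"], 0.5)],
    "Det_m": [(["le"], 1.0)],
    "Det_f": [(["la"], 1.0)],
    "N_m": [(["garçon"], 0.5), (["chien"], 0.5)],
    "N_f": [(["fille"], 0.5), (["table"], 0.5)],
}

def sample(symbol: str) -> list[str]:
    """Recursively expand a symbol according to the PCFG probabilities."""
    if symbol not in PCFG:
        return [symbol]  # terminal word
    expansions, weights = zip(*PCFG[symbol])
    choice = random.choices(expansions, weights=weights, k=1)[0]
    return [word for part in choice for word in sample(part)]

corpus = [" ".join(sample("S")) for _ in range(5)]
print("\n".join(corpus))
```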
arXiv Detail & Related papers (2023-10-24T14:08:37Z)
- CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models [28.750894873827068]
We propose a novel framework, called CUE, which aims to interpret uncertainties inherent in the predictions of PLM-based models.
By comparing the difference in predictive uncertainty between the perturbed and the original text representations, we are able to identify the latent dimensions responsible for uncertainty.
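A rough sketch of the perturb-and-compare idea as I read it, not CUE's actual algorithm; the linear classifier head and the noise scale are placeholder assumptions.

```python
# Perturb one latent dimension at a time and record how much the predictive
# entropy shifts; dimensions with the largest shift are flagged as the ones
# most responsible for the model's uncertainty.
import torch

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + 1e-12)).sum(dim=-1)

def uncertainty_per_dimension(rep, head, noise_scale=0.1):
    """Score each latent dimension by the entropy shift its perturbation causes."""
    base_entropy = predictive_entropy(head(rep))
    scores = torch.zeros(rep.shape[-1])
    for d in range(rep.shape[-1]):
        perturbed = rep.clone()
        perturbed[..., d] += noise_scale
        scores[d] = (predictive_entropy(head(perturbed)) - base_entropy).abs()
    return scores

# Usage with a dummy linear head over a 16-dim text representation:
head = torch.nn.Linear(16, 3)
rep = torch.randn(16)
print(uncertainty_per_dimension(rep, head).topk(3).indices)
```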
arXiv Detail & Related papers (2023-06-06T11:37:46Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Explaining Face Presentation Attack Detection Using Natural Language [24.265611015740287]
In this paper, we tackle the problem of explaining face presentation attack predictions through natural language.
Our approach passes feature representations of a deep layer of the PAD model to a language model to generate text describing the reasoning behind the PAD prediction.
We investigate how the quality of the generated explanations is affected by different loss functions, including the commonly used word-wise cross entropy loss, a sentence discriminative loss, and a sentence semantic loss.
arXiv Detail & Related papers (2021-11-08T22:55:55Z)
- Double Perturbation: On the Robustness of Robustness and Counterfactual Bias Evaluation [109.06060143938052]
We propose a "double perturbation" framework to uncover model weaknesses beyond the test dataset.
We apply this framework to study two perturbation-based approaches that are used to analyze models' robustness and counterfactual bias in English.
arXiv Detail & Related papers (2021-04-12T06:57:36Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis.
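A minimal hypothesis-only baseline in the spirit of that finding (toy data of my own; real experiments would use an NLI corpus such as SNLI):

```python
# Train a bag-of-words classifier that sees only the hypothesis; accuracy
# above the majority-class rate signals lexical irregularities a model can
# exploit without ever reading the premise.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy (hypothesis, label) pairs.
hypotheses = [
    "A man is sleeping.", "Nobody is outside.",
    "A person is outdoors.", "Someone is moving.",
]
labels = ["contradiction", "contradiction", "entailment", "entailment"]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(hypotheses, labels)

# A cue word like "nobody" can drive the prediction regardless of any premise.
print(clf.predict(["Nobody is moving."]))
```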
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals [53.484562601127195]
We point out the inability to infer behavioral conclusions from probing results.
We offer an alternative method that focuses on how the information is being used, rather than on what information is encoded.
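A compressed sketch of the amnesic recipe (the actual method iterates nullspace projections; this shows a single step on dummy data): fit a linear probe for a property, project representations onto the probe's nullspace so the property is no longer linearly readable, then re-run the downstream behavior.

```python
# One iteration of linear information removal: after projection, a fresh
# probe trained on the projected representations should, with real data,
# drop toward chance accuracy on the removed property.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W: np.ndarray) -> np.ndarray:
    """Projection matrix onto the nullspace of the probe's weight rows."""
    # Orthonormal basis of the row space via SVD, then P = I - B^T B.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    basis = Vt[S > 1e-10]
    return np.eye(W.shape[1]) - basis.T @ basis

# Dummy data: 200 representations of dim 32 with a binary property label.
X = np.random.randn(200, 32)
y = np.random.randint(0, 2, size=200)

probe = LogisticRegression(max_iter=1000).fit(X, y)
P = nullspace_projection(probe.coef_)
X_amnesic = X @ P  # property removed (one projection step)

print(LogisticRegression(max_iter=1000).fit(X_amnesic, y).score(X_amnesic, y))
```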
arXiv Detail & Related papers (2020-06-01T15:00:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.