Interventional Probing in High Dimensions: An NLI Case Study
- URL: http://arxiv.org/abs/2304.10346v1
- Date: Thu, 20 Apr 2023 14:34:31 GMT
- Title: Interventional Probing in High Dimensions: An NLI Case Study
- Authors: Julia Rozanova, Marco Valentino, Lucas Cordeiro, Andre Freitas
- Abstract summary: Probing strategies have been shown to detect semantic features intermediate to the "natural logic" fragment of the Natural Language Inference (NLI) task.
In this work, we carry out new and existing representation-level interventions to investigate the effect of these semantic features on NLI classification.
- Score: 2.1028463367241033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probing strategies have been shown to detect the presence of various
linguistic features in large language models; in particular, semantic features
intermediate to the "natural logic" fragment of the Natural Language Inference
(NLI) task. In the case of natural logic, the relation between the intermediate
features and the entailment label is explicitly known: as such, this provides a
ripe setting for interventional studies on the NLI models' representations,
allowing for stronger causal conjectures and a deeper critical analysis of
interventional probing methods. In this work, we carry out new and existing
representation-level interventions to investigate the effect of these semantic
features on NLI classification: we perform amnesic probing (which removes
features as directed by learned linear probes) and introduce the mnestic
probing variation (which forgets all dimensions except the probe-selected
ones). Furthermore, we delve into the limitations of these methods and outline
some pitfalls that have been obscuring the effectiveness of interventional
probing studies.
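As a reading aid, here is a minimal numpy sketch of the two interventions described above. It assumes, as in iterative nullspace projection, that the feature subspace is spanned by the weight rows of the learned linear probes; all names and shapes are illustrative, not the paper's exact procedure.

```python
import numpy as np

def probe_subspace_projector(W: np.ndarray) -> np.ndarray:
    """Orthogonal projector onto the span of the probe directions W (k x d)."""
    basis, _ = np.linalg.qr(W.T)   # d x k orthonormal basis (assumes W full rank)
    return basis @ basis.T         # d x d

def amnesic_intervention(H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Remove the probe-selected feature: project onto the probe's nullspace."""
    P = probe_subspace_projector(W)
    return H @ (np.eye(H.shape[1]) - P)

def mnestic_intervention(H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Forget all dimensions *except* the probe-selected ones."""
    P = probe_subspace_projector(W)
    return H @ P

# H: n representations of dimension d; W: k learned probe directions.
H, W = np.random.randn(32, 768), np.random.randn(4, 768)
# The two interventions are complementary: they sum back to the original H.
assert np.allclose(amnesic_intervention(H, W) + mnestic_intervention(H, W), H)
```

The complementarity made explicit in the final assertion is what makes the pairing informative: any change in NLI behaviour can be attributed either to the probe-selected subspace or to its complement.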
Related papers
- The correlation between nativelike selection and prototypicality: a multilingual onomasiological case study using semantic embedding [0.0]
This study examines the possibility of analyzing the semantic motivation and deducibility behind certain nativelike selections (NLS).
To account for the NLS in question, cluster analysis and behavioral profile analysis are conducted to uncover a language-specific prototype for the Chinese verb shang 'harm'.
arXiv Detail & Related papers (2024-05-22T10:55:26Z)
- Estimating the Causal Effects of Natural Logic Features in Transformer-Based NLI Models [16.328341121232484]
We apply causal effect estimation strategies to measure the effect of context interventions.
We investigate robustness to irrelevant changes and sensitivity to impactful changes of Transformers.
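As a rough sketch of this recipe (not necessarily the paper's exact estimator), the causal effect of a context intervention can be read off as the average change in entailment probability over intervened minimal pairs; `predict_entailment_prob` is a hypothetical wrapper around any NLI model.

```python
from statistics import mean

def average_treatment_effect(pairs, predict_entailment_prob):
    """pairs: ((premise, hypothesis), (intervened_premise, hypothesis)) tuples."""
    return mean(
        predict_entailment_prob(*after) - predict_entailment_prob(*before)
        for before, after in pairs
    )

# Robustness: the effect should be near zero for label-preserving edits;
# sensitivity: it should be large in magnitude for label-flipping edits.
```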
arXiv Detail & Related papers (2024-04-03T10:22:35Z)
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks [52.61917615039112]
We use CausalGym to benchmark the ability of interpretability methods to causally affect model behaviour.
We study the Pythia models (14M–6.9B) and assess the causal efficacy of a wide range of interpretability methods.
We find that DAS outperforms the other methods, and so we use it to study the learning trajectory of two difficult linguistic phenomena.
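The basic operation such benchmarks score is the interchange intervention (activation patching): run the model on a base input while splicing in the activations that a chosen site produces on a source input. A minimal PyTorch sketch, assuming a HuggingFace-style `model` whose submodule `layer` returns a plain tensor and inputs of matching length:

```python
import torch

@torch.no_grad()
def interchange_intervention(model, layer, base_inputs, source_inputs):
    """Logits for base_inputs with layer's activations taken from source_inputs."""
    cache = {}
    handle = layer.register_forward_hook(lambda m, i, out: cache.update(act=out))
    model(**source_inputs)                 # pass 1: record the source activations
    handle.remove()

    handle = layer.register_forward_hook(lambda m, i, out: cache["act"])
    patched = model(**base_inputs).logits  # pass 2: patch them into the base run
    handle.remove()
    return patched
```

A method is then judged by whether interventions at the sites it selects actually change the model's behaviour.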
arXiv Detail & Related papers (2024-02-19T21:35:56Z)
- Estimating the Causal Effects of Natural Logic Features in Neural NLI Models [2.363388546004777]
We zero in on specific patterns of reasoning with enough structure and regularity to identify and quantify systematic reasoning failures in widely-used models.
We apply causal effect estimation strategies to measure the effect of context interventions.
Following related work on causal analysis of NLP models in different settings, we adapt the methodology for the NLI task to construct comparative model profiles.
arXiv Detail & Related papers (2023-05-15T12:01:09Z)
- Rank-Based Causal Discovery for Post-Nonlinear Models [2.4493299476776778]
Post-nonlinear (PNL) causal models constitute one of the most flexible of the restricted model subclasses under which causal structure remains identifiable.
We propose a new approach for PNL causal discovery that uses rank-based methods to estimate the functional parameters.
arXiv Detail & Related papers (2023-02-23T21:19:23Z)
- Naturalistic Causal Probing for Morpho-Syntax [76.83735391276547]
We suggest a naturalistic strategy for input-level intervention on real-world data in Spanish.
Using our approach, we isolate morpho-syntactic features from confounders in sentences.
We apply this methodology to analyze causal effects of gender and number on contextualized representations extracted from pre-trained models.
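Illustratively, the input-level recipe amounts to averaging representation shifts over naturally occurring minimal pairs that differ in a single feature; `encode` is a hypothetical function returning a sentence vector, and the Spanish pairs are toy examples.

```python
import numpy as np

def feature_shift(pairs, encode):
    """Average representation shift caused by flipping one morpho-syntactic feature."""
    return np.mean(np.stack([encode(b) - encode(a) for a, b in pairs]), axis=0)

# Minimal pairs differing only in grammatical gender (examples are illustrative).
gender_pairs = [
    ("el doctor está cansado", "la doctora está cansada"),
    ("el niño es alto", "la niña es alta"),
]
# gender_direction = feature_shift(gender_pairs, encode)
```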
arXiv Detail & Related papers (2022-05-14T11:47:58Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- MCDAL: Maximum Classifier Discrepancy for Active Learning [74.73133545019877]
Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GANs) for sample acquisition.
We propose in this paper a novel active learning framework that we call Maximum Classifier Discrepancy for Active Learning (MCDAL).
In particular, we utilize two auxiliary classification layers that learn tighter decision boundaries by maximizing the discrepancies among them.
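A hedged PyTorch sketch of that idea (architecture and names are illustrative, not the paper's exact model): two classifier heads share a backbone, and unlabeled samples are ranked for acquisition by how much the heads' predictive distributions disagree.

```python
import torch
import torch.nn as nn

class TwoHeadClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, n_classes: int):
        super().__init__()
        self.backbone = backbone
        self.head1 = nn.Linear(feat_dim, n_classes)
        self.head2 = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.backbone(x)
        return self.head1(z), self.head2(z)

@torch.no_grad()
def acquisition_scores(model, x_unlabeled):
    """L1 discrepancy between the heads: high disagreement marks samples
    near the decision boundary, i.e. the most informative ones to label."""
    logits1, logits2 = model(x_unlabeled)
    p1, p2 = logits1.softmax(-1), logits2.softmax(-1)
    return (p1 - p2).abs().sum(dim=-1)
```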
arXiv Detail & Related papers (2021-07-23T06:57:08Z)
- Exploring Transitivity in Neural NLI Models through Veridicality [39.845425535943534]
We focus on the transitivity of inference relations, a fundamental property for systematically drawing inferences.
A model capturing transitivity can compose basic inference patterns and draw new inferences.
We find that current NLI models do not perform consistently well on transitivity inference tasks.
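A transitivity probe of this kind is easy to state in code: whenever the model predicts A entails B and B entails C, check that it also predicts A entails C. `predict_label` is a hypothetical wrapper returning the model's three-way NLI label.

```python
def transitivity_violations(triples, predict_label):
    """Return (a, b, c) triples where entailment fails to compose."""
    return [
        (a, b, c)
        for a, b, c in triples
        if predict_label(a, b) == "entailment"
        and predict_label(b, c) == "entailment"
        and predict_label(a, c) != "entailment"
    ]
```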
arXiv Detail & Related papers (2021-01-26T11:18:35Z)
- Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples.
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
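For orientation: the classic influence score weights gradients by the inverse Hessian, influence(z_train, z_test) ≈ -∇L(z_test)ᵀ H⁻¹ ∇L(z_train); the sketch below uses the common identity-Hessian simplification, reducing it to a gradient dot product. `loss_fn(model, example)` is a hypothetical per-example loss.

```python
import torch

def flat_grad(model, loss):
    """Gradient of a scalar loss w.r.t. all trainable parameters, flattened."""
    params = [p for p in model.parameters() if p.requires_grad]
    return torch.cat([g.reshape(-1) for g in torch.autograd.grad(loss, params)])

def influence_score(model, loss_fn, train_example, test_example):
    """Identity-Hessian approximation: a large positive score means the
    training example pushes the model toward its prediction on the test one."""
    g_test = flat_grad(model, loss_fn(model, test_example))
    g_train = flat_grad(model, loss_fn(model, train_example))
    return torch.dot(g_test, g_train).item()
```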
arXiv Detail & Related papers (2020-05-14T00:45:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.