Can Prompt Probe Pretrained Language Models? Understanding the Invisible
Risks from a Causal View
- URL: http://arxiv.org/abs/2203.12258v1
- Date: Wed, 23 Mar 2022 08:10:07 GMT
- Title: Can Prompt Probe Pretrained Language Models? Understanding the Invisible
Risks from a Causal View
- Authors: Boxi Cao, Hongyu Lin, Xianpei Han, Fangchao Liu, Le Sun
- Abstract summary: Prompt-based probing has been widely used in evaluating the abilities of pretrained language models (PLMs).
This paper investigates prompt-based probing from a causal view, highlights three critical biases that could induce biased results and conclusions, and proposes debiasing via causal intervention.
- Score: 37.625078897220305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt-based probing has been widely used in evaluating the abilities of
pretrained language models (PLMs). Unfortunately, recent studies have
discovered that such evaluations may be inaccurate, inconsistent, and unreliable.
Furthermore, the lack of understanding of its inner workings, combined with its
wide applicability, could lead to unforeseen risks when evaluating and applying
PLMs in real-world applications. To discover, understand, and quantify these
risks, this paper investigates prompt-based probing from a causal view,
highlights three critical biases that could induce biased results and
conclusions, and proposes debiasing via causal intervention. This paper
provides valuable insights for the design of unbiased
datasets, better probing frameworks and more reliable evaluations of pretrained
language models. Furthermore, our conclusions underscore the need to rethink
the criteria for identifying better pretrained language models. We openly
released the source code and data at https://github.com/c-box/causalEval.
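The abstract names causal intervention as the debiasing mechanism but does not describe the procedure itself. Below is a minimal, hypothetical sketch (not the authors' implementation from causalEval) of one standard intervention-style estimator: treat the prompt template as a confounder and marginalise the model's answer distribution over a set of paraphrased templates instead of conditioning on a single, possibly biased, template. The function name, example templates, and probabilities are all illustrative assumptions.

```python
import numpy as np

def backdoor_adjusted_distribution(per_prompt_probs, prompt_prior=None):
    """Approximate P(answer | do(query)) by marginalising out the prompt template.

    per_prompt_probs: (n_prompts, n_answers) array; row i is the model's answer
        distribution for the same query verbalised with prompt template i.
    prompt_prior: optional weights over templates; uniform if omitted.
    """
    probs = np.asarray(per_prompt_probs, dtype=float)
    if prompt_prior is None:
        prompt_prior = np.full(probs.shape[0], 1.0 / probs.shape[0])
    adjusted = np.asarray(prompt_prior, dtype=float) @ probs  # average over templates
    return adjusted / adjusted.sum()                          # renormalise

# Hypothetical example: three paraphrases probing the same fact, with answer
# candidates ("Paris", "Lyon"); the numbers are made up for illustration only.
per_prompt = [
    [0.70, 0.30],  # "The capital of France is [MASK]."
    [0.55, 0.45],  # "France's capital city is [MASK]."
    [0.80, 0.20],  # "[MASK] is the capital of France."
]
print(backdoor_adjusted_distribution(per_prompt))  # ~[0.683, 0.317]
```

Averaging with an explicit template prior removes the dependence of the verdict on any single template, which is the kind of prompt-choice bias the abstract gestures at; the paper's actual interventions may differ.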
Related papers
- A Probabilistic Perspective on Unlearning and Alignment for Large Language Models [48.96686419141881]
We introduce the first formal probabilistic evaluation framework for large language models (LLMs).
We derive novel metrics with high-probability guarantees concerning the output distribution of a model.
Our metrics are application-independent and allow practitioners to make more reliable estimates about model capabilities before deployment.
arXiv Detail & Related papers (2024-10-04T15:44:23Z)
- PRobELM: Plausibility Ranking Evaluation for Language Models [12.057770969325453]
PRobELM is a benchmark designed to assess language models' ability to discern more plausible scenarios through their parametric knowledge.
Our benchmark is constructed from a dataset curated from Wikidata edit histories and tailored to align with the temporal bounds of the evaluated models' training data.
arXiv Detail & Related papers (2024-04-04T21:57:11Z)
- GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
Existing evaluation methods have many constraints, and their results offer limited interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z)
- Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel greedy search strategy to identify near-optimal prompts that improve in-context learning performance (a generic greedy-selection sketch appears after this list).
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models (a minimal projection sketch appears after this list).
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Evaluate Confidence Instead of Perplexity for Zero-shot Commonsense Reasoning [85.1541170468617]
This paper reconsiders the nature of commonsense reasoning and proposes a novel commonsense reasoning metric, Non-Replacement Confidence (NRC).
The proposed method boosts zero-shot performance on two commonsense reasoning benchmarks and on seven further commonsense question-answering datasets.
arXiv Detail & Related papers (2022-08-23T14:42:14Z)
- CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models.
We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest.
A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z)
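For the Fairness-guided Few-shot Prompting entry above, the summary mentions a greedy search for a near-optimal prompt but gives no detail. The sketch below is a generic greedy demonstration-selection loop under assumed interfaces (the candidates pool and the predictive_bias callable are hypothetical), not the paper's actual strategy: it repeatedly adds the demonstration that most reduces a caller-supplied predictive-bias score and stops when no candidate helps.

```python
def greedy_prompt_search(candidates, predictive_bias, max_examples=8):
    """Greedily pick in-context demonstrations that minimise a bias score.

    candidates: pool of demonstration examples to choose from.
    predictive_bias: callable mapping a list of demonstrations to a scalar
        bias score for the resulting prompt (lower is better), e.g. how far
        the model's label distribution on content-free inputs is from uniform.
    """
    selected = []
    best_score = predictive_bias(selected)
    for _ in range(max_examples):
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        choice = min(remaining, key=lambda c: predictive_bias(selected + [c]))
        score = predictive_bias(selected + [choice])
        if score >= best_score:  # stop once no candidate reduces the bias
            break
        selected.append(choice)
        best_score = score
    return selected
```

A real implementation would batch the model calls hidden behind predictive_bias; here that scoring function is deliberately left abstract.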
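For the Debiasing Vision-Language Models via Biased Prompts entry, the summary describes projecting out biased directions from the text embedding. Below is a minimal sketch of plain orthogonal projection, assuming the biased directions come from embeddings of spurious-attribute prompts; the calibration of the projection matrix described in that paper is omitted, and the function and variable names are ours.

```python
import numpy as np

def project_out_bias(text_embeds, bias_direction_embeds):
    """Remove a biased subspace from text embeddings via orthogonal projection.

    text_embeds: (n, d) text embeddings to debias (e.g. class-prompt embeddings).
    bias_direction_embeds: (k, d) linearly independent vectors spanning the
        biased subspace, e.g. embeddings of spurious-attribute prompts.
    """
    bias = np.asarray(bias_direction_embeds, dtype=float)
    q, _ = np.linalg.qr(bias.T)              # (d, k) orthonormal basis of the bias subspace
    proj = np.eye(q.shape[0]) - q @ q.T      # projector onto the orthogonal complement
    debiased = np.asarray(text_embeds, dtype=float) @ proj
    # Renormalise rows, since CLIP-style zero-shot classifiers compare unit vectors.
    return debiased / np.linalg.norm(debiased, axis=1, keepdims=True)
```

In a CLIP-style zero-shot classifier only these text embeddings need to change, while image embeddings are left untouched, which matches the entry's claim that debiasing the text embedding alone suffices.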