Mitigating Word Bias in Zero-shot Prompt-based Classifiers
- URL: http://arxiv.org/abs/2309.04992v1
- Date: Sun, 10 Sep 2023 10:57:41 GMT
- Title: Mitigating Word Bias in Zero-shot Prompt-based Classifiers
- Authors: Adian Liusie, Potsawee Manakul, Mark J. F. Gales
- Abstract summary: We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
- Score: 55.60306377044225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt-based classifiers are an attractive approach for zero-shot
classification. However, the precise choice of the prompt template and label
words can largely influence performance, with semantically equivalent settings
often showing notable performance difference. This discrepancy can be partly
attributed to word biases, where the classifier may be biased towards classes.
To address this problem, it is possible to optimise classification thresholds
on a labelled data set, however, this mitigates some of the advantages of
prompt-based classifiers. This paper instead approaches this problem by
examining the expected marginal probabilities of the classes. Here,
probabilities are reweighted to have a uniform prior over classes, in an
unsupervised fashion. Further, we draw a theoretical connection between the
class priors and the language models' word prior, and offer the ability to set
a threshold in a zero-resource fashion. We show that matching class priors
correlates strongly with the oracle upper bound performance and demonstrate
large consistent performance gains for prompt settings over a range of NLP
tasks.
Related papers
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose a prompt tuning technique that calibrates the hierarchical consistency of model predictions.
A new Prompt Tuning for Hierarchical Consistency (ProTeCt) technique is then proposed to calibrate classification across label set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z) - Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, decent performance gain has been achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z) - Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic
Segmentation [40.09476732999614]
Mask proposal models have significantly improved the performance of zero-shot semantic segmentation.
The use of a background' embedding during training in these methods is problematic as the resulting model tends to over-learn and assign all unseen classes as the background class instead of their correct labels.
This paper proposes novel class enhancement losses to bypass the use of the background embbedding during training, and simultaneously exploit the semantic relationship between text embeddings and mask proposals by ranking the similarity scores.
arXiv Detail & Related papers (2023-01-18T06:55:02Z) - CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class
Classification [57.62886091828512]
We propose a brand-new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix) for many-class classification.
Basically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification.
arXiv Detail & Related papers (2022-11-11T03:45:59Z) - Exploring Category-correlated Feature for Few-shot Image Classification [27.13708881431794]
We present a simple yet effective feature rectification method by exploring the category correlation between novel and base classes as the prior knowledge.
The proposed approach consistently obtains considerable performance gains on three widely used benchmarks.
arXiv Detail & Related papers (2021-12-14T08:25:24Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose textitPrototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substaintial improvements compared with state of the arts.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Scalable Optimal Classifiers for Adversarial Settings under Uncertainty [10.90668635921398]
We consider the problem of finding optimal classifiers in an adversarial setting where the class-1 data is generated by an attacker whose objective is not known to the defender.
We show that this low-dimensional characterization enables to develop a training method to compute provably approximately optimal classifiers in a scalable manner.
arXiv Detail & Related papers (2021-06-28T13:33:53Z) - Classifier-independent Lower-Bounds for Adversarial Robustness [13.247278149124757]
We theoretically analyse the limits of robustness to test-time adversarial and noisy examples in classification.
We use optimal transport theory to derive variational formulae for the Bayes-optimal error a classifier can make on a given classification problem.
We derive explicit lower-bounds on the Bayes-optimal error in the case of the popular distance-based attacks.
arXiv Detail & Related papers (2020-06-17T16:46:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.