Prototypical Calibration for Few-shot Learning of Language Models
- URL: http://arxiv.org/abs/2205.10183v1
- Date: Fri, 20 May 2022 13:50:07 GMT
- Title: Prototypical Calibration for Few-shot Learning of Language Models
- Authors: Zhixiong Han, Yaru Hao, Li Dong, Furu Wei
- Abstract summary: GPT-like models have been recognized as fragile across different hand-crafted templates and demonstration permutations.
We propose prototypical calibration to adaptively learn a more robust decision boundary for zero- and few-shot classification.
Our method calibrates the decision boundary as expected, greatly improving the robustness of GPT to templates, permutations, and class imbalance.
- Score: 84.5759596754605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-context learning of GPT-like models has been recognized as
fragile across different hand-crafted templates and demonstration permutations.
In this work,
we propose prototypical calibration to adaptively learn a more robust decision
boundary for zero- and few-shot classification, instead of greedy decoding.
Concretely, our method first adopts a Gaussian mixture distribution to estimate
the prototypical clusters for all categories. Then we assign each cluster to
the corresponding label by solving a weighted bipartite matching problem. Given
an example, its prediction is calibrated by the likelihood of prototypical
clusters. Experimental results show that prototypical calibration yields a 15%
absolute improvement on a diverse set of tasks. Extensive analysis across
different scales also indicates that our method calibrates the decision
boundary as expected, greatly improving the robustness of GPT to templates,
permutations, and class imbalance.
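The three steps described in the abstract (estimating prototypical clusters with a Gaussian mixture, matching clusters to labels by weighted bipartite matching, and predicting from cluster likelihoods) can be sketched roughly as follows. This is a minimal illustration, assuming the inputs are the LM's label-word probability vectors and using the cluster means as matching weights; the function names and estimation details are placeholders, not the authors' released implementation.

```python
# Minimal sketch of prototypical calibration, under the assumptions stated above.
import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.optimize import linear_sum_assignment

def fit_prototypical_calibration(label_word_probs, n_classes, seed=0):
    """label_word_probs: (n_examples, n_classes) LM probabilities of each label word."""
    # Step 1: estimate prototypical clusters with a Gaussian mixture, one per class.
    gmm = GaussianMixture(n_components=n_classes, random_state=seed).fit(label_word_probs)

    # Step 2: assign clusters to labels by weighted bipartite matching
    # (maximize total affinity, so negate for the min-cost solver).
    affinity = gmm.means_                                    # (n_clusters, n_label_words)
    clusters, labels = linear_sum_assignment(-affinity)
    return gmm, dict(zip(clusters, labels))

def calibrated_predict(gmm, cluster_to_label, label_word_probs):
    # Step 3: calibrate a prediction by the likelihood of the prototypical clusters,
    # instead of greedily taking argmax over the raw label-word probabilities.
    cluster_posterior = gmm.predict_proba(label_word_probs)  # (n_examples, n_clusters)
    return np.array([cluster_to_label[k] for k in cluster_posterior.argmax(axis=-1)])
```

Replacing the greedy argmax over raw label-word probabilities with an argmax over cluster posteriors is what moves the decision boundary, which the abstract credits for the improved robustness to templates, permutations, and class imbalance.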
Related papers
- Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z)
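A minimal sketch of the consistency idea above, assuming a hypothetical `sample_answer` callable that returns one sampled generation per call; the paper studies three consistency measures, and only simple majority agreement is shown here.

```python
# Consistency-based confidence: agreement of repeated samples with the majority answer.
from collections import Counter

def consistency_confidence(sample_answer, prompt, n_samples=20):
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    majority, count = Counter(answers).most_common(1)[0]
    confidence = count / n_samples  # fraction of samples agreeing with the majority
    return majority, confidence
```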
- Twice Class Bias Correction for Imbalanced Semi-Supervised Learning [59.90429949214134]
We introduce a novel approach called Twice Class Bias Correction (TCBC).
We estimate the class bias of the model parameters during the training process.
We apply a secondary correction to the model's pseudo-labels for unlabeled samples.
arXiv Detail & Related papers (2023-12-27T15:06:36Z)
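One generic way to realize the bias-estimation-plus-correction idea above is sketched below; the running-average bias estimate and the divide-and-renormalize correction are illustrative assumptions, and TCBC's two corrections are more involved than this.

```python
# Generic class-bias correction of pseudo-labels (illustrative, not TCBC itself).
import numpy as np

class ClassBiasCorrector:
    def __init__(self, n_classes, momentum=0.99):
        self.bias = np.full(n_classes, 1.0 / n_classes)
        self.momentum = momentum

    def update(self, probs):
        # Track the model's average predicted distribution as its class bias.
        self.bias = self.momentum * self.bias + (1 - self.momentum) * probs.mean(axis=0)

    def correct(self, probs, eps=1e-8):
        # Down-weight over-predicted classes before taking pseudo-labels.
        corrected = probs / (self.bias + eps)
        return corrected / corrected.sum(axis=-1, keepdims=True)
```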
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy in two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
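A rough sketch of the multi-head, differently weighted cross-entropy setup described above, in PyTorch; the backbone, head count, and per-head class weights are placeholders rather than the authors' configuration.

```python
# Several classification heads on a shared backbone, each with its own weighted
# cross-entropy; predictions are the average of the heads' softmax outputs.
import torch
import torch.nn as nn

class MultiHeadClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, n_classes, head_class_weights):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList(nn.Linear(feat_dim, n_classes) for _ in head_class_weights)
        # Each head minimizes a cross-entropy loss with its own class weights.
        self.losses = [nn.CrossEntropyLoss(weight=w) for w in head_class_weights]

    def forward(self, x):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]

    def training_loss(self, x, y):
        return sum(loss(logits, y) for logits, loss in zip(self.forward(x), self.losses))

    @torch.no_grad()
    def predict_proba(self, x):
        # Averaged predictions, which the paper reports as well calibrated.
        return torch.stack([logits.softmax(-1) for logits in self.forward(x)]).mean(0)
```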
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency [47.90235939359225]
We propose a novel mixup strategy for pre-trained language models that improves model calibration further.
Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples.
arXiv Detail & Related papers (2022-03-14T23:45:08Z)
- Heterogeneous Calibration: A post-hoc model-agnostic framework for improved generalization [8.815439276597818]
We introduce the notion of heterogeneous calibration that applies a post-hoc model-agnostic transformation to model outputs for improving AUC performance on binary classification tasks.
We refer to simple patterns as heterogeneous partitions of the feature space and show theoretically that perfectly calibrating each partition separately optimizes AUC.
While the theoretical optimality of this framework holds for any model, we focus on deep neural networks (DNNs) and test the simplest instantiation of this paradigm on a variety of open-source datasets.
arXiv Detail & Related papers (2022-02-10T05:08:50Z)
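A minimal sketch of the per-partition post-hoc calibration idea above for a binary classifier; the partition assignment and the use of Platt scaling within each partition are illustrative assumptions, not the paper's exact instantiation.

```python
# Fit one post-hoc calibrator per feature-space partition, then apply it piecewise.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_partition_calibrators(scores, labels, partition_ids):
    calibrators = {}
    for pid in np.unique(partition_ids):
        mask = partition_ids == pid
        # Platt scaling: logistic regression on the model score, per partition.
        calibrators[pid] = LogisticRegression().fit(scores[mask].reshape(-1, 1), labels[mask])
    return calibrators

def calibrate(scores, partition_ids, calibrators):
    out = np.empty_like(scores, dtype=float)
    for pid, calib in calibrators.items():
        mask = partition_ids == pid
        out[mask] = calib.predict_proba(scores[mask].reshape(-1, 1))[:, 1]
    return out
```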
- MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration [29.780204566046503]
We propose a feature-aware binning framework, called Multiple Boosting Calibration Trees (MBCT).
Our MBCT is non-monotonic, and has the potential to improve order accuracy, due to its learnable binning scheme and the individual calibration.
Results show that our method outperforms all competing models in terms of both calibration error and order accuracy.
arXiv Detail & Related papers (2022-02-09T08:59:16Z)
- Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control [67.52000805944924]
Learn then Test (LTT) is a framework for calibrating machine learning models.
Our main insight is to reframe the risk-control problem as multiple hypothesis testing.
We use our framework to provide new calibration methods for several core machine learning tasks with detailed worked examples in computer vision.
arXiv Detail & Related papers (2021-10-03T17:42:03Z)
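A condensed sketch of the Learn then Test recipe described above: each candidate threshold is treated as a null hypothesis "risk exceeds alpha", a p-value is computed from a Hoeffding bound on held-out calibration data, and candidates whose nulls are rejected under a Bonferroni correction are kept. The loss function and candidate grid are placeholders, and the paper offers sharper bounds and testing procedures than shown here.

```python
# Risk control framed as multiple hypothesis testing (Hoeffding p-values + Bonferroni).
import numpy as np

def hoeffding_p_value(losses, alpha):
    # P-value for H0: E[loss] > alpha, assuming losses bounded in [0, 1].
    n, mean = len(losses), losses.mean()
    return float(np.exp(-2 * n * max(alpha - mean, 0.0) ** 2))

def learn_then_test(loss_fn, calib_data, candidates, alpha=0.1, delta=0.1):
    valid = []
    for lam in candidates:
        losses = np.array([loss_fn(x, y, lam) for x, y in calib_data])
        # Bonferroni: reject H0 only if the p-value clears delta / #candidates.
        if hoeffding_p_value(losses, alpha) <= delta / len(candidates):
            valid.append(lam)
    return valid  # any returned lambda controls risk at level alpha w.p. >= 1 - delta
```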
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.