Token-based Decision Criteria Are Suboptimal in In-context Learning
- URL: http://arxiv.org/abs/2406.16535v3
- Date: Wed, 05 Feb 2025 13:44:48 GMT
- Title: Token-based Decision Criteria Are Suboptimal in In-context Learning
- Authors: Hakaze Cho, Yoshihiro Sakai, Mariko Kato, Kenshiro Tanaka, Akira Ishii, Naoya Inoue
- Abstract summary: In-Context Learning (ICL) typically derives its classification criteria from the output probabilities of manually selected label tokens.
We propose Hidden Calibration, which discards token probabilities and instead uses a nearest-centroid classifier on the LM's last hidden states.
Our experiments on 6 models and 10 classification datasets indicate that Hidden Calibration consistently outperforms current token-based baselines.
- Score: 2.2973949268669562
- License:
- Abstract: In-Context Learning (ICL) typically derives its classification criteria from the output probabilities of manually selected label tokens. However, we argue that such token-based classification criteria lead to suboptimal decision boundaries, even after careful calibration through translation and constrained rotation. To address this problem, we propose Hidden Calibration, which discards token probabilities and instead applies a nearest-centroid classifier to the LM's last hidden states. Concretely, class centroids are estimated in advance on a calibration set, and each test sample is assigned the label of its nearest centroid. Our experiments on 6 models and 10 classification datasets indicate that Hidden Calibration consistently outperforms current token-based baselines by about 20%~50%, establishing a strong state of the art in ICL. Further analysis shows that Hidden Calibration finds better classification criteria with less inter-class overlap, and that, with the help of demonstrations, LMs produce linearly separable intra-class clusters, which supports Hidden Calibration and offers new insight into the principles of ICL. Our official code implementation can be found at https://github.com/hc495/Hidden_Calibration.
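The decision rule is simple enough to sketch. Below is a minimal illustration of the nearest-centroid classification over last-layer hidden states described in the abstract, assuming the hidden state at the final token position has already been extracted for each prompt; the function names and toy data are illustrative, not taken from the official repository.

```python
# Minimal sketch of the nearest-centroid decision rule described in the abstract.
# Assumes last-layer hidden states at the final token position are already extracted.
import numpy as np

def fit_centroids(cal_hidden, cal_labels, num_classes):
    """Estimate one centroid per class from a small calibration set.

    cal_hidden: (n_cal, hidden_dim) last hidden states of calibration prompts
    cal_labels: (n_cal,) integer labels in {0, ..., num_classes - 1}
    """
    return np.stack([cal_hidden[cal_labels == c].mean(axis=0)
                     for c in range(num_classes)])

def predict(test_hidden, centroids):
    """Assign each test sample the label of its nearest centroid (Euclidean)."""
    dists = np.linalg.norm(test_hidden[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Toy usage with random vectors standing in for real hidden states.
rng = np.random.default_rng(0)
cal_h = rng.normal(size=(20, 768))
cal_y = np.tile([0, 1], 10)
test_h = rng.normal(size=(5, 768))
print(predict(test_h, fit_centroids(cal_h, cal_y, num_classes=2)))
```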
Related papers
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose a prompt tuning technique that calibrates the hierarchical consistency of model predictions.
A new Prompt Tuning for Hierarchical Consistency (ProTeCt) technique is then proposed to calibrate classification across label set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z)
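ProTeCt's prompt-tuning objective is not reproduced here, but the hierarchical consistency it calibrates can be illustrated: a fine-grained (leaf) prediction should agree with the prediction made at a coarser level of the same taxonomy. A hypothetical sketch of that check, with a made-up two-level taxonomy:

```python
# Hypothetical illustration of hierarchical (taxonomic) consistency, the quantity
# ProTeCt calibrates. The taxonomy and predictions are made up; this is not the
# ProTeCt prompt-tuning method itself.
PARENT = {"sparrow": "bird", "eagle": "bird", "tabby": "cat", "siamese": "cat"}

def hierarchical_consistency(leaf_preds, coarse_preds):
    """Fraction of samples whose leaf-level prediction maps to the same
    coarse class that was predicted at the coarse level."""
    agree = [PARENT[leaf] == coarse for leaf, coarse in zip(leaf_preds, coarse_preds)]
    return sum(agree) / len(agree)

print(hierarchical_consistency(["sparrow", "tabby", "eagle"], ["bird", "bird", "bird"]))
# -> 0.666...: the second sample is classified inconsistently across granularities.
```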
- Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right? [135.71855998537347]
We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through the metric of online accuracy.
We show that this metric is unreliable, as even vacuous blind classifiers can achieve unrealistically high online accuracy.
Existing OCL algorithms can also achieve high online accuracy, but perform poorly in retaining useful information.
arXiv Detail & Related papers (2023-05-16T08:29:33Z)
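The unreliability of online accuracy is easy to reproduce with a hypothetical input-blind predictor that simply repeats the most recently seen label; on a temporally correlated stream it scores highly while learning nothing. A small sketch with a synthetic stream:

```python
# Sketch of why online accuracy is unreliable: a blind classifier that repeats the
# previous label scores highly on a label-correlated stream. The stream is synthetic.
stream = [c for c in range(5) for _ in range(200)]  # each label repeats before switching

correct, prev_label = 0, None
for label in stream:
    pred = prev_label            # blind prediction: the input is ignored entirely
    correct += int(pred == label)
    prev_label = label           # "adapt" by memorising the last label seen

print(correct / len(stream))     # 0.995 online accuracy with zero learning
```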
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
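The summary above mentions entropy regularisation; one common form, maximising the entropy of the batch-averaged prediction so the classifier does not collapse onto a few classes, is sketched below. This illustrates the general idea only and is not a reimplementation of the paper's loss.

```python
# Sketch of an entropy regulariser of the kind used to keep a parametric classifier
# from collapsing onto a few classes. Illustrative only, not the paper's full loss.
import numpy as np

def mean_entropy_regulariser(probs):
    """probs: (batch, num_classes) softmax outputs on (unlabelled) samples.
    Returns the negative entropy of the mean prediction: lower when predictions
    are spread across classes, higher when they collapse onto one class."""
    mean_pred = probs.mean(axis=0)
    entropy = -(mean_pred * np.log(mean_pred + 1e-12)).sum()
    return -entropy  # minimising this maximises the entropy of the mean prediction

# A batch collapsed onto one class is penalised more than a diverse batch.
collapsed = np.tile([0.98, 0.01, 0.01], (8, 1))
diverse = np.eye(3)[np.arange(8) % 3]
print(mean_entropy_regulariser(collapsed), mean_entropy_regulariser(diverse))
```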
- Complementary Labels Learning with Augmented Classes [22.460256396941528]
Complementary Labels Learning (CLL) arises in many real-world tasks such as private questions classification and online learning.
We propose a novel problem setting called Complementary Labels Learning with Augmented Classes (CLLAC).
By using unlabeled data, we propose an unbiased estimator of classification risk for CLLAC, which is guaranteed to be provably consistent.
arXiv Detail & Related papers (2022-11-19T13:55:27Z)
- Estimating Classification Confidence Using Kernel Densities [0.0]
This paper investigates the post-hoc calibration of confidence for "exploratory" machine learning classification problems.
We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation.
arXiv Detail & Related papers (2022-07-13T21:57:44Z)
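The four algorithms themselves are not detailed in the summary, but the core ingredient, estimating category-specific confidence from kernel density estimates of classifier scores, can be sketched. The Gaussian KDEs and synthetic scores below are assumptions for illustration.

```python
# Sketch of kernel-density-based confidence: model the classifier's score
# distribution separately for correct and incorrect predictions of one category,
# then read off a calibrated confidence via Bayes' rule. Illustrative only.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
scores_correct = rng.beta(8, 2, size=500)    # scores when the category is right
scores_wrong = rng.beta(3, 4, size=200)      # scores when it is wrong

kde_correct = gaussian_kde(scores_correct)
kde_wrong = gaussian_kde(scores_wrong)
prior_correct = len(scores_correct) / (len(scores_correct) + len(scores_wrong))

def calibrated_confidence(score):
    """P(correct | score) from class-conditional kernel density estimates."""
    num = prior_correct * kde_correct(score)
    den = num + (1 - prior_correct) * kde_wrong(score)
    return (num / den).item()

print(calibrated_confidence(0.9), calibrated_confidence(0.4))
```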
- Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z)
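A minimal sketch of the per-category thresholding idea from the summary above: pseudo boxes are kept only if their confidence exceeds their class's threshold, and rare classes can be given a lower bar. The threshold values and toy detections are illustrative, not the paper's schedule.

```python
# Sketch of per-category pseudo-label thresholding for detection: rarer or harder
# classes get lower thresholds so they are not starved of pseudo labels.
import numpy as np

def select_pseudo_boxes(boxes, scores, labels, per_class_threshold):
    """Keep predicted boxes whose confidence exceeds their class threshold."""
    keep = scores >= per_class_threshold[labels]
    return boxes[keep], labels[keep]

# Toy predictions on one unlabeled image: (x1, y1, x2, y2) boxes.
boxes = np.array([[10, 10, 50, 50], [20, 30, 90, 80], [5, 5, 25, 40]])
scores = np.array([0.92, 0.55, 0.71])
labels = np.array([0, 1, 1])                 # class ids
per_class_threshold = np.array([0.9, 0.6])   # class 1 is rare, so lower bar

kept_boxes, kept_labels = select_pseudo_boxes(boxes, scores, labels, per_class_threshold)
print(kept_boxes, kept_labels)  # keeps the class-0 box (0.92) and one class-1 box (0.71)
```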
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not rely on domain-specific data augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
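The selection step described above can be sketched as keeping a pseudo label only when the prediction is both confident and low-uncertainty; here uncertainty is taken as the variance of the chosen class's probability across stochastic forward passes. Thresholds and the uncertainty estimate are illustrative choices, not the exact UPS criterion.

```python
# Sketch of uncertainty-aware pseudo-label selection: keep a pseudo label only if
# the prediction is confident and its probability is stable across stochastic passes.
import numpy as np

def select_pseudo_labels(mc_probs, conf_thresh=0.9, uncert_thresh=0.01):
    """mc_probs: (n_passes, n_samples, n_classes) softmax outputs from several
    stochastic (e.g. dropout-enabled) forward passes."""
    mean_probs = mc_probs.mean(axis=0)                  # (n_samples, n_classes)
    pseudo = mean_probs.argmax(axis=1)
    confidence = mean_probs.max(axis=1)
    # Uncertainty of the chosen class: variance of its probability across passes.
    uncertainty = mc_probs.var(axis=0)[np.arange(len(pseudo)), pseudo]
    keep = (confidence >= conf_thresh) & (uncertainty <= uncert_thresh)
    return pseudo, keep

rng = np.random.default_rng(0)
mc_probs = rng.dirichlet(alpha=[20, 1, 1], size=(10, 4))  # 10 passes, 4 samples, 3 classes
pseudo, keep = select_pseudo_labels(mc_probs)
print(pseudo, keep)
```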
- Local Temperature Scaling for Probability Calibration [22.069749881109992]
We propose a learning-based calibration method that focuses on semantic segmentation.
Specifically, we adopt a convolutional neural network to predict local temperature values for probability calibration.
Experiments on the COCO, CamVid, and LPBA40 datasets demonstrate improved calibration performance for a range of different metrics.
arXiv Detail & Related papers (2020-08-12T04:39:32Z)
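The calibration step itself is straightforward to sketch: each pixel's logits are divided by that pixel's temperature before the softmax. The network that predicts the temperature map is omitted; the map below is random, for illustration only.

```python
# Sketch of applying local (per-pixel) temperature scaling to segmentation logits.
# In the paper the temperature map is predicted by a network; here it is random.
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_temperature_scale(logits, temperature_map):
    """logits: (num_classes, H, W); temperature_map: (H, W), values > 0."""
    return softmax(logits / temperature_map[None, :, :], axis=0)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8, 8))                   # 4-class toy segmentation logits
temperature_map = 1.0 + rng.random(size=(8, 8))       # per-pixel temperatures in (1, 2)
probs = local_temperature_scale(logits, temperature_map)
print(probs.shape, probs.sum(axis=0).round(3).min())  # (4, 8, 8); per-pixel probs sum to 1
```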
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
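A binning-free, KS-style calibration error can be sketched by sorting predictions by confidence and taking the largest gap between the running totals of observed accuracy and predicted confidence. The simplified top-label version below is an illustration, not the paper's exact recalibration procedure.

```python
# Sketch of a binning-free, Kolmogorov-Smirnov-style calibration error on
# top-label predictions; the synthetic "overconfident model" is illustrative.
import numpy as np

def ks_calibration_error(confidences, correct):
    """confidences: (n,) top-label predicted probabilities;
    correct: (n,) 1 if the top-label prediction was right, else 0."""
    order = np.argsort(confidences)
    gap = np.cumsum(correct[order] - confidences[order]) / len(confidences)
    return np.abs(gap).max()

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=1000)
# Simulate an overconfident model: actual accuracy lags predicted confidence by 0.1.
correct = (rng.random(1000) < conf - 0.1).astype(float)
print(ks_calibration_error(conf, correct))   # roughly 0.1 for this synthetic model
```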
- Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks.
arXiv Detail & Related papers (2020-03-16T17:00:35Z)
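One compositional strategy in this spirit mixes a temperature-scaled distribution with the uncalibrated one and the uniform distribution. The sketch below uses fixed example weights and temperature; in practice these would be fitted on held-out data, and this is only an illustration of the ensemble idea, not the paper's full method.

```python
# Sketch of an ensemble-style calibration map in the Mix-n-Match spirit: a convex
# combination of a temperature-scaled map, the uncalibrated map, and the uniform
# distribution. Weights and temperature are fixed here for illustration.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_temperature_scaling(logits, T=1.5, w=(0.7, 0.2, 0.1)):
    """Convex combination of three calibration maps applied to the same logits."""
    k = logits.shape[-1]
    scaled = softmax(logits / T)          # temperature-scaled map
    identity = softmax(logits)            # uncalibrated map
    uniform = np.full_like(identity, 1.0 / k)
    return w[0] * scaled + w[1] * identity + w[2] * uniform

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.0, -0.2]])
print(ensemble_temperature_scaling(logits).round(3))  # rows still sum to 1
```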
This list is automatically generated from the titles and abstracts of the papers on this site.