Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
- URL: http://arxiv.org/abs/2410.19195v2
- Date: Tue, 26 Aug 2025 15:09:51 GMT
- Title: Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
- Authors: Yue Li, Zhixue Zhao, Carolina Scarton,
- Abstract summary: In-context learning (ICL) performance is highly sensitive to prompt design.<n>Class label options (e.g. lexicon or order) in zero-shot classification remains underexplored.<n>This study proposes LOADS, a method for selecting optimal label sets in zero-shot ICL with large language models.
- Score: 16.130133009174124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) performance is highly sensitive to prompt design, yet the impact of class label options (e.g. lexicon or order) in zero-shot classification remains underexplored. This study proposes LOADS (Label set Optimization via Activation Distribution kurtosiS), a post-hoc method for selecting optimal label sets in zero-shot ICL with large language models (LLMs). LOADS is built upon the observations in our empirical analysis, the first to systematically examine how label option design (i.e., lexical choice, order, and elaboration) impacts classification performance. This analysis shows that the lexical choice of the labels in the prompt (such as agree vs. support in stance classification) plays an important role in both model performance and model's sensitivity to the label order. A further investigation demonstrates that optimal label words tend to activate fewer outlier neurons in LLMs' feed-forward networks. LOADS then leverages kurtosis to measure the neuron activation distribution for label selection, requiring only a single forward pass without gradient propagation or labelled data. The LOADS-selected label words consistently demonstrate effectiveness for zero-shot ICL across classification tasks, datasets, models and languages, achieving maximum performance gain from 0.54 to 0.76 compared to the conventional approach of using original dataset label words.
Related papers
- Box-Level Class-Balanced Sampling for Active Object Detection [34.79955979395035]
Active learning (AL) is a promising technique to alleviate the annotation burden.<n> Performing AL at box-level for object detection has been shown to be more cost-effective than selecting and labelling the entire image.<n>We propose a class-balanced sampling strategy to select more objects from minority classes for labelling.
arXiv Detail & Related papers (2025-08-25T09:57:22Z) - Reduction-based Pseudo-label Generation for Instance-dependent Partial Label Learning [41.345794038968776]
We propose to leverage reduction-based pseudo-labels to alleviate the influence of incorrect candidate labels.
We show that reduction-based pseudo-labels exhibit greater consistency with the Bayes optimal classifier compared to pseudo-labels directly generated from the predictive model.
arXiv Detail & Related papers (2024-10-28T07:32:20Z) - From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions [9.440055827786596]
We study a clinically-inspired selective label problem called disparate censorship.
Disparate Censorship Expectation-Maximization (DCEM) is an algorithm for learning in the presence of such censorship.
arXiv Detail & Related papers (2024-06-27T03:33:38Z) - Posterior Label Smoothing for Node Classification [2.737276507021477]
We propose a simple yet effective label smoothing for the transductive node classification task.
We design the soft label to encapsulate the local context of the target node through the neighborhood label distribution.
In the following analysis, we find that incorporating global label statistics in posterior computation is the key to the success of label smoothing.
arXiv Detail & Related papers (2024-06-01T11:59:49Z) - Label Propagation for Zero-shot Classification with Vision-Language Models [17.50253820510074]
In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data.
We introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification.
We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works.
arXiv Detail & Related papers (2024-04-05T12:58:07Z) - VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification [23.08368823707528]
We present a novel human annotation-free method for pathology image classification by leveraging pre-trained Vision-Language Models (VLMs)
We introduce VLM-CPL, a novel approach based on consensus pseudo labels that integrates two noisy label filtering techniques with a semi-supervised learning strategy.
Experimental results showed that our method obtained an accuracy of 87.1% and 95.1% on the HPH and LC25K datasets, respectively.
arXiv Detail & Related papers (2024-03-23T13:24:30Z) - Gen-Z: Generative Zero-Shot Text Classification with Contextualized
Label Descriptions [50.92702206798324]
We propose a generative prompting framework for zero-shot text classification.
GEN-Z measures the LM likelihood of input text conditioned on natural language descriptions of labels.
We show that zero-shot classification with simple contextualization of the data source consistently outperforms both zero-shot and few-shot baselines.
arXiv Detail & Related papers (2023-11-13T07:12:57Z) - SLPT: Selective Labeling Meets Prompt Tuning on Label-Limited Lesion
Segmentation [57.37875162629063]
We propose a framework that combines selective labeling with prompt tuning to boost performance in limited labels.
We evaluate our method on liver tumor segmentation and achieve state-of-the-art performance, outperforming traditional fine-tuning with only 6% of tunable parameters.
arXiv Detail & Related papers (2023-08-09T12:22:49Z) - LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and
Unlabeled Image Collections [30.875186985461063]
Large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification.
We show, for the first time, how to reduce this gap without any labels and without any paired VL data.
arXiv Detail & Related papers (2023-05-29T17:56:35Z) - Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label
Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z) - Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly
Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution
Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - Self-Adaptive Label Augmentation for Semi-supervised Few-shot
Classification [121.63992191386502]
Few-shot classification aims to learn a model that can generalize well to new tasks when only a few labeled samples are available.
We propose a semi-supervised few-shot classification method that assigns an appropriate label to each unlabeled sample by a manually defined metric.
A major novelty of SALA is the task-adaptive metric, which can learn the metric adaptively for different tasks in an end-to-end fashion.
arXiv Detail & Related papers (2022-06-16T13:14:03Z) - Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider instance-dependent and assume that each example is associated with a latent label distribution constituted by the real number of each label.
arXiv Detail & Related papers (2021-10-25T12:50:26Z) - Unsupervised Selective Labeling for More Effective Semi-Supervised
Learning [46.414510522978425]
unsupervised selective labeling consistently improves SSL methods over state-of-the-art active learning given labeled data.
Our work sets a new standard for practical and efficient SSL.
arXiv Detail & Related papers (2021-10-06T18:25:50Z) - Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z) - Delving Deep into Label Smoothing [112.24527926373084]
Label smoothing is an effective regularization tool for deep neural networks (DNNs)
We present an Online Label Smoothing (OLS) strategy, which generates soft labels based on the statistics of the model prediction for the target category.
arXiv Detail & Related papers (2020-11-25T08:03:11Z) - Semi-Supervised Speech Recognition via Graph-based Temporal
Classification [59.58318952000571]
Semi-supervised learning has demonstrated promising results in automatic speech recognition by self-training.
The effectiveness of this approach largely relies on the pseudo-label accuracy.
Alternative ASR hypotheses of an N-best list can provide more accurate labels for an unlabeled speech utterance.
arXiv Detail & Related papers (2020-10-29T14:56:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.