Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
- URL: http://arxiv.org/abs/2410.19195v1
- Date: Thu, 24 Oct 2024 22:59:23 GMT
- Title: Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
- Authors: Yue Li, Zhixue Zhao, Carolina Scarton
- Abstract summary: In-context learning (ICL) performance is sensitive to the prompt design, yet the impact of class label options in zero-shot classification has been largely overlooked.
This study presents the first comprehensive empirical study investigating how label options influence zero-shot ICL classification performance.
- Score: 10.699636123243138
- Abstract: In-context learning (ICL) performance is known to be sensitive to prompt design, yet the impact of class label options in zero-shot classification has been largely overlooked. This study presents the first comprehensive empirical study investigating how label options (e.g., lexical choice, order, and elaboration) influence zero-shot ICL classification performance. Our findings reveal that lexical choices for label names (e.g., agree vs. support in stance classification) play an important role, with effects also linked to label order. An analysis of the model's internal states further shows that optimal label names tend to activate fewer outlier neurons in the feed-forward network. Based on this observation, we propose Label set Optimization via Activation Distribution kurtosiS (LOADS), a post-hoc approach requiring no gradient propagation. LOADS not only demonstrates effectiveness with only 100 unlabelled samples across different model types and sizes, but also shows cross-lingual transferability.
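The kurtosis-based selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, array shapes, and the use of mean per-sample excess kurtosis as the ranking score are assumptions; in practice the activations would come from the model's feed-forward layers on the 100 unlabelled samples.

```python
import numpy as np

def excess_kurtosis(x):
    """Fisher (excess) kurtosis along the last axis.
    Heavy-tailed activation distributions (a few outlier neurons
    firing strongly) yield large positive values."""
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    z = (x - mu) / sigma
    return (z ** 4).mean(axis=-1) - 3.0

def select_label_set(activations_by_label_set):
    """Rank candidate label sets by the mean kurtosis of the FFN
    activations they induce, and return the set with the lowest
    score (i.e., the fewest outlier neurons activated)."""
    scores = {name: float(excess_kurtosis(acts).mean())
              for name, acts in activations_by_label_set.items()}
    return min(scores, key=scores.get), scores
```

A usage sketch: collect activations of shape (n_samples, hidden_dim) for each candidate label set, call `select_label_set`, and keep the winner; no gradients are needed, matching the post-hoc nature of LOADS.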
Related papers
- From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions [9.440055827786596]
We study a clinically-inspired selective label problem called disparate censorship.
Disparate Censorship Expectation-Maximization (DCEM) is an algorithm for learning in the presence of such censorship.
arXiv Detail & Related papers (2024-06-27T03:33:38Z)
- Posterior Label Smoothing for Node Classification [2.737276507021477]
We propose a simple yet effective label smoothing for the transductive node classification task.
We design the soft label to encapsulate the local context of the target node through the neighborhood label distribution.
In the following analysis, we find that incorporating global label statistics in posterior computation is the key to the success of label smoothing.
arXiv Detail & Related papers (2024-06-01T11:59:49Z)
- Label Propagation for Zero-shot Classification with Vision-Language Models [17.50253820510074]
In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data.
We introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification.
We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works.
arXiv Detail & Related papers (2024-04-05T12:58:07Z)
- VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification [23.08368823707528]
We present a novel human annotation-free method for pathology image classification by leveraging pre-trained Vision-Language Models (VLMs)
We introduce VLM-CPL, a novel approach based on consensus pseudo labels that integrates two noisy label filtering techniques with a semi-supervised learning strategy.
Experimental results showed that our method obtained an accuracy of 87.1% and 95.1% on the HPH and LC25K datasets, respectively.
arXiv Detail & Related papers (2024-03-23T13:24:30Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
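The label-distribution consistency idea in this summary can be reduced to a toy penalty term; the function below is a deliberately simplified assumption (a squared gap between the predicted positive rate and the known class prior), not the paper's full objective.

```python
import numpy as np

def label_dist_loss(pred_pos_probs, class_prior):
    """Toy label-distribution term for PU learning: penalise the gap
    between the mean predicted positive rate on unlabeled data and
    the known positive-class prior."""
    pred_rate = pred_pos_probs.mean()
    return (pred_rate - class_prior) ** 2
```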
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider the instance-dependent setting and assume that each example is associated with a latent label distribution constituted by the real number of each label.
arXiv Detail & Related papers (2021-10-25T12:50:26Z)
- Unsupervised Selective Labeling for More Effective Semi-Supervised Learning [46.414510522978425]
Unsupervised selective labeling consistently improves SSL methods over state-of-the-art active learning given labeled data.
Our work sets a new standard for practical and efficient SSL.
arXiv Detail & Related papers (2021-10-06T18:25:50Z)
- Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
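An uncertainty-aware selection rule of the kind this summary describes might look like the sketch below; the use of MC-dropout-style repeated passes, the std-based uncertainty measure, and the two thresholds are illustrative assumptions rather than the UPS paper's exact criteria.

```python
import numpy as np

def select_pseudo_labels(mc_probs, conf_thresh=0.9, unc_thresh=0.05):
    """Keep a pseudo-label only when the mean prediction is confident
    AND its variability across T stochastic forward passes is low.
    mc_probs: (T, n, c) softmax outputs from T stochastic passes."""
    mean_p = mc_probs.mean(axis=0)                  # (n, c) averaged prediction
    labels = mean_p.argmax(axis=-1)                 # hard pseudo-labels
    conf = mean_p.max(axis=-1)                      # confidence of winner
    # uncertainty: std of the winning class's probability across passes
    unc = mc_probs[:, np.arange(mc_probs.shape[1]), labels].std(axis=0)
    mask = (conf > conf_thresh) & (unc < unc_thresh)
    return labels, mask
```

Training would then use only the samples where `mask` is True, which is how such a rule reduces the noise fed back into self-training.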
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.