Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning
- URL: http://arxiv.org/abs/2406.10908v5
- Date: Tue, 15 Oct 2024 10:09:01 GMT
- Title: Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning
- Authors: Zixiao Zhu, Zijian Feng, Hanzhang Zhou, Junlang Qian, Kezhi Mao
- Abstract summary: We introduce logit separability, a criterion to assess the clarity of both samples and class-related words at the logit level.
We find that incorporating multiple class-related words for each sample, rather than relying on a single class name, improves performance by offering a broader range of label information.
We propose LICL, a logit separability-based method that jointly organizes samples and integrates multiple class-related words into each sample-label pair.
- Abstract: Effective organization of in-context learning (ICL) demonstrations is key to improving the quality of large language model (LLM) responses. To create better sample-label pairs that instruct LLM understanding, we introduce logit separability, a criterion to assess the clarity of both samples and class-related words at the logit level. This facilitates the optimization of sample and label selection, enhancing the precision of information provided in ICL demonstrations. Additionally, we find that incorporating multiple class-related words for each sample, rather than relying on a single class name, improves performance by offering a broader range of label information. Building on these insights, we propose LICL, a logit separability-based method that jointly organizes samples and integrates multiple class-related words into each sample-label pair. Evaluations across seven classification datasets show that this approach significantly improves ICL performance by providing clearer instructions and richer label information.
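The listing includes no code, so the following is a minimal sketch of how a logit-separability-style criterion could score demonstration candidates, assuming per-sample logits over a pool of class-related words are already available; all helper names are hypothetical, and the actual LICL scoring may differ.

```python
import numpy as np

def logit_separability(word_logits: np.ndarray, word_class: np.ndarray,
                       true_class: int) -> float:
    """Margin between the strongest related word of the true class and the
    strongest related word of any rival class, for one candidate sample.

    word_logits: (V,) logits an LLM assigns to V class-related words when
                 conditioned on the candidate sample.
    word_class:  (V,) index of the class each word belongs to.
    """
    own = word_logits[word_class == true_class].max()
    rival = word_logits[word_class != true_class].max()
    return float(own - rival)

def select_demonstrations(pool_logits: np.ndarray, labels: np.ndarray,
                          word_class: np.ndarray, k: int = 8) -> np.ndarray:
    """Keep the k candidates whose labels are clearest at the logit level."""
    scores = np.array([logit_separability(l, word_class, y)
                       for l, y in zip(pool_logits, labels)])
    return np.argsort(scores)[::-1][:k]

# Toy usage: 20 candidates, 12 class-related words spread over 3 classes.
rng = np.random.default_rng(0)
pool_logits = rng.normal(size=(20, 12))
word_class = np.repeat(np.arange(3), 4)
labels = rng.integers(0, 3, size=20)
print(select_demonstrations(pool_logits, labels, word_class, k=5))
```

The same margin, computed per word rather than per sample, would give one plausible way to rank class-related words before attaching several of them to each demonstration.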
Related papers
- Exploiting Conjugate Label Information for Multi-Instance Partial-Label Learning [61.00359941983515]
Multi-instance partial-label learning (MIPL) addresses scenarios where each training sample is represented as a multi-instance bag associated with a candidate label set containing one true label and several false positives.
ELIMIPL exploits the conjugate label information to improve the disambiguation performance.
arXiv Detail & Related papers (2024-08-26T15:49:31Z)
- Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning [23.671999163027284]
This paper proposes a novel framework for multi-label image recognition without any training data.
It uses knowledge from a pre-trained Large Language Model to learn prompts that adapt a pre-trained Vision-Language Model such as CLIP to multi-label classification.
Our framework presents a new way to explore the synergies between multiple pre-trained models for novel category recognition.
arXiv Detail & Related papers (2024-03-02T13:43:32Z)
- Learning Label Hierarchy with Supervised Contrastive Learning [8.488965459026678]
Supervised contrastive learning (SCL) frameworks treat each class as independent and thus consider all classes to be equally important.
This paper introduces a family of Label-Aware SCL methods (LASCL) that incorporate hierarchical information into SCL by leveraging similarities between classes.
Experiments on three datasets show that the proposed LASCL works well on text classification, distinguishing a single target label among multiple labels.
arXiv Detail & Related papers (2024-01-31T23:21:40Z)
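The blurb does not spell out the LASCL loss; as one hedged illustration of "leveraging similarities between classes" inside supervised contrastive learning, the sketch below down-weights negatives from similar classes so they are repelled less strongly. The weighting scheme and the `class_sim` matrix are assumptions, not the paper's method.

```python
import torch
import torch.nn.functional as F

def label_aware_supcon(features: torch.Tensor, labels: torch.Tensor,
                       class_sim: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss with class-similarity-aware negatives.

    features:  (B, D) encoder outputs for one batch.
    labels:    (B,)   integer class labels.
    class_sim: (C, C) similarities in [0, 1] with ones on the diagonal;
               negatives from similar classes contribute less to the loss.
    """
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature                          # (B, B) pair logits
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & ~eye    # positive pairs
    neg = ~pos & ~eye                                    # negative pairs
    neg_w = 1.0 - class_sim[labels][:, labels]           # (B, B) negative weights
    denom = (torch.exp(sim) * (pos.float() + neg_w * neg.float())).sum(1)
    log_prob = sim - torch.log(denom + 1e-12).unsqueeze(1)
    n_pos = pos.sum(1).clamp(min=1)
    per_anchor = -(log_prob * pos.float()).sum(1) / n_pos
    return per_anchor[pos.any(1)].mean()                 # skip anchors without positives
```

With `class_sim` set to the identity this reduces to the standard SupCon objective, which is one way to read the hierarchy information as a soft relaxation of "all classes are equally important."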
- Prompt-based Pseudo-labeling Strategy for Sample-Efficient Semi-Supervised Extractive Summarization [12.582774521907227]
Semi-supervised learning (SSL) is a widely used technique in scenarios where labeled data is scarce and unlabeled data is abundant.
Standard SSL methods follow a teacher-student paradigm to first train a classification model and then use the classifier's confidence values to select pseudo-labels.
We propose a prompt-based pseudo-labeling strategy with LLMs that picks unlabeled examples with more accurate pseudo-labels.
arXiv Detail & Related papers (2023-11-16T04:29:41Z)
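For contrast with the proposed strategy, the standard confidence-based filter described above fits in a few lines; the threshold and shapes here are illustrative.

```python
import numpy as np

def select_pseudo_labels(probs: np.ndarray, threshold: float = 0.9):
    """Keep unlabeled examples whose top class probability from the teacher
    classifier clears a confidence threshold.

    probs: (N, C) class probabilities for N unlabeled examples.
    Returns the selected indices and their pseudo-labels.
    """
    confidence = probs.max(axis=1)
    pseudo = probs.argmax(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, pseudo[keep]
```

The paper's contribution, as summarized, is to replace this confidence score with an LLM-prompted judgment of label quality when picking examples.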
- Channel-Wise Contrastive Learning for Learning with Noisy Labels [60.46434734808148]
We introduce channel-wise contrastive learning (CWCL) to distinguish authentic label information from noise.
Unlike conventional instance-wise contrastive learning (IWCL), CWCL tends to yield more nuanced and resilient features aligned with the authentic labels.
Our strategy is twofold: first, use CWCL to extract pertinent features and identify cleanly labeled samples; second, progressively fine-tune on these samples.
arXiv Detail & Related papers (2023-08-14T06:04:50Z)
- Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning [68.56193228008466]
In many real-world tasks, the concerned objects can be represented as a multi-instance bag associated with a candidate label set.
Existing MIPL approaches follow the instance-space paradigm, assigning the augmented candidate label set of each bag to its instances and aggregating bag-level labels from instance-level labels.
We propose an intuitive algorithm named DEMIPL, i.e., Disambiguated attention Embedding for Multi-Instance Partial-Label learning.
arXiv Detail & Related papers (2023-05-26T13:25:17Z)
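DEMIPL's disambiguation attention is not detailed in this summary; the generic attention pooling that embedding-space multi-instance methods build on looks roughly like the sketch below (an illustration, not the paper's architecture).

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Aggregate a variable-size bag of instance embeddings into a single
    bag-level embedding using learned attention weights."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (n_instances, dim) -> attention weights: (n_instances, 1)
        w = torch.softmax(self.score(bag), dim=0)
        return (w * bag).sum(dim=0)                      # bag embedding: (dim,)

pool = AttentionPooling(dim=64)
print(pool(torch.randn(9, 64)).shape)                    # torch.Size([64])
```

A MIPL method would then score this bag embedding against the candidate label set and disambiguate toward the single true label.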
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond the closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
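The open-word idea above is concrete enough to sketch: assuming WordNet nouns as the open vocabulary (the paper's exact sampling procedure is not given here), one could extend a closed label-word set like this.

```python
import random
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def open_word_pool(closed_labels: list[str], n_open: int = 50,
                   seed: int = 0) -> list[str]:
    """Extend a closed set of label words with random 'open' nouns from
    WordNet so prompts can be tuned in a simulated open-set scenario."""
    rng = random.Random(seed)
    nouns = {s.lemmas()[0].name().replace("_", " ")
             for s in wn.all_synsets(pos="n")}
    open_words = sorted(nouns - {w.lower() for w in closed_labels})
    return closed_labels + rng.sample(open_words, n_open)

print(open_word_pool(["cat", "dog", "bird"], n_open=5))
```

Tuning against such a pool exposes the prompt to non-target words, which is the simulated open-set scenario the summary describes.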
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
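A common instantiation of diversity-based initial selection is k-center greedy over embeddings; the sketch below is that standard heuristic, not necessarily the paper's exact algorithm.

```python
import numpy as np

def k_center_greedy(embeddings: np.ndarray, budget: int,
                    seed: int = 0) -> list[int]:
    """Repeatedly pick the point farthest from everything selected so far,
    so the chosen set covers the embedding space."""
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(embeddings)))]
    # Distance from every point to its nearest selected point.
    dist = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    for _ in range(budget - 1):
        nxt = int(dist.argmax())
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return selected

print(k_center_greedy(np.random.default_rng(1).normal(size=(100, 16)), budget=10))
```

The query strategy described above would run the same kind of diversity sampling on consistency-based embeddings instead of raw features.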
- Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess the learnability of self-supervised representations.
CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means.
We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes.
arXiv Detail & Related papers (2022-06-02T19:05:13Z)
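The CL metric maps almost directly onto code. A minimal sketch with scikit-learn, where the 50/50 split and the cluster and neighbor counts are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def cluster_learnability(reps: np.ndarray, n_clusters: int = 10,
                         n_neighbors: int = 5, seed: int = 0) -> float:
    """K-means pseudo-labels the representations; a KNN fit on one half must
    recover those labels on the held-out half. Higher accuracy means the
    representation's structure is more learnable."""
    pseudo = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(reps)
    x_tr, x_te, y_tr, y_te = train_test_split(reps, pseudo, test_size=0.5,
                                              random_state=seed)
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(x_tr, y_tr)
    return float(knn.score(x_te, y_te))

print(cluster_learnability(np.random.default_rng(2).normal(size=(500, 32))))
```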
- Active Learning in Incomplete Label Multiple Instance Multiple Label Learning [17.5720245903743]
We propose a novel bag-class pair based approach for active learning in the MIML setting.
Our approach is based on a discriminative graphical model with efficient and exact inference.
arXiv Detail & Related papers (2021-07-22T17:01:28Z)
- Generalized Label Enhancement with Sample Correlations [24.582764493585362]
We propose two novel label enhancement methods: Label Enhancement with Sample Correlations (LESC) and generalized Label Enhancement with Sample Correlations (gLESC).
Benefiting from sample correlations, the proposed methods boost the performance of label enhancement.
arXiv Detail & Related papers (2020-04-07T03:32:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.