Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning
- URL: http://arxiv.org/abs/2310.08923v1
- Date: Fri, 13 Oct 2023 07:49:11 GMT
- Title: Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning
- Authors: Hongfu Liu, Ye Wang
- Abstract summary: Large Language Models (LLMs) can perform In-Context Learning (ICL) by conditioning on a few demonstrations of a new downstream task. However, this learning paradigm suffers from high instability, with substantial variance induced by factors such as the input distribution of the selected examples, their ordering, and the prompt format.
- Score: 30.536184852029386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) can perform In-Context Learning (ICL) by conditioning on a few demonstrations of a new downstream task. However, this learning paradigm suffers from high instability: substantial variance is induced by factors such as the input distribution of the selected examples, their ordering, and the prompt format. In this work, we demonstrate that even when all of these factors are held constant, randomly selecting examples still results in high variance. We therefore quantify how informative each candidate example is via the Information Gain (IG) it yields in prediction once observed, and propose to sample the candidates with maximum IG. We further identify template bias, which can skew the evaluation of IG during sampling, and introduce a Calibration Before Sampling strategy to mitigate it. Experimental results show that our method yields an average relative improvement of 14.3% across six classification tasks on three LLMs.
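The selection procedure can be pictured with a short sketch. The following is a minimal, illustrative Python rendering, not the paper's implementation: it treats IG as the average entropy reduction of the calibrated predictive distribution over labels on a small probe set, and approximates Calibration Before Sampling with content-free-input scaling in the style of contextual calibration. The helper `label_probs(demo, x)`, the probe set, and the "N/A" content-free input are all assumptions.

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def calibrate(probs, cf_probs):
    """Rescale label probabilities by those assigned to a content-free
    input, then renormalize, so template bias is divided out before IG
    is computed (the role of Calibration Before Sampling)."""
    scaled = [p / max(cf, 1e-12) for p, cf in zip(probs, cf_probs)]
    z = sum(scaled)
    return [s / z for s in scaled]

def information_gain(label_probs, candidate, probe_inputs, cf_input="N/A"):
    """Average entropy reduction on probe inputs after prepending
    `candidate` as a one-shot demonstration.  `label_probs(demo, x)` is
    a hypothetical helper that queries the LLM and returns a
    distribution over label verbalizers (demo=None means zero-shot)."""
    zero_cf = label_probs(None, cf_input)
    one_cf = label_probs(candidate, cf_input)
    gains = []
    for x in probe_inputs:
        before = calibrate(label_probs(None, x), zero_cf)
        after = calibrate(label_probs(candidate, x), one_cf)
        gains.append(entropy(before) - entropy(after))
    return sum(gains) / len(gains)

def select_max_ig(candidates, probe_inputs, label_probs, k=4):
    """Rank candidate examples by IG and keep the top k as demonstrations."""
    ranked = sorted(candidates,
                    key=lambda c: information_gain(label_probs, c, probe_inputs),
                    reverse=True)
    return ranked[:k]
```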
Related papers
- Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification [6.273933281069326]
Generative large language models (LLMs) are increasingly used for data augmentation tasks.
We compare sample selection strategies from the few-shot learning literature and investigate their effects in LLM-based textual augmentation.
The results indicate that while some "informed" selection strategies improve model performance, this happens only seldom and with marginal gains.
arXiv Detail & Related papers (2024-10-14T17:30:08Z)
- Strategic Demonstration Selection for Improved Fairness in LLM In-Context Learning [18.782566259311206]
This study investigates how varying the demonstrations within ICL prompts influences the fairness outcomes of large language models (LLMs).
We find that deliberately including minority-group samples in prompts significantly boosts fairness without sacrificing predictive accuracy.
We introduce a mitigation technique that employs clustering and evolutionary strategies to curate a diverse and representative sample set from the training data (see the sketch below).
arXiv Detail & Related papers (2024-08-19T07:34:43Z)
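A minimal sketch of the clustering half of such a curation step, assuming precomputed sentence embeddings and a `groups` array marking protected-group membership; the evolutionary refinement is omitted, and all names here are illustrative rather than the paper's code:

```python
import numpy as np
from sklearn.cluster import KMeans

def curate_demonstrations(embeddings, texts, groups, k=8, seed=0):
    """Cluster the training pool and keep the example nearest each
    centroid, so the demonstration set covers diverse regions of the
    data; then ensure at least one minority-group example is present."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(embeddings)
    chosen = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c],
                               axis=1)
        chosen.append(int(members[dists.argmin()]))
    if not any(groups[i] == "minority" for i in chosen):
        minority = [i for i, g in enumerate(groups) if g == "minority"]
        if minority:
            # crude swap for the sketch; a real pipeline would pick the
            # most representative minority example instead
            chosen[-1] = minority[0]
    return [texts[i] for i in chosen]
```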
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label-smoothing value during training according to the uncertainty of individual samples (sketched below).
Experiments on widely used benchmarks demonstrate that UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
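A minimal PyTorch sketch of the adaptive-smoothing idea, under the assumption that a per-sample uncertainty score in [0, 1] (e.g. a normalized predictive entropy) is already available; the paper's exact uncertainty estimator and smoothing schedule may differ:

```python
import torch
import torch.nn.functional as F

def ual_loss(logits, targets, uncertainty, eps_max=0.2):
    """Cross-entropy with a per-sample label-smoothing value that grows
    with the sample's estimated uncertainty.
    logits: (batch, n_classes); targets: (batch,) class indices;
    uncertainty: (batch,) scores in [0, 1] (an assumption here)."""
    n_classes = logits.size(-1)
    eps = eps_max * uncertainty                       # per-sample smoothing
    log_probs = F.log_softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, n_classes).float()
    # smoothed target rows still sum to 1: (1 - eps) + n_classes * eps/n_classes
    smoothed = (1.0 - eps).unsqueeze(-1) * one_hot \
             + (eps / n_classes).unsqueeze(-1)
    return -(smoothed * log_probs).sum(dim=-1).mean()
```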
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- In-Context Example Ordering Guided by Label Distributions [34.30216341226014]
We formulate in-context example ordering as an optimization problem.
Inspired by the idea of learning from label proportions, we propose two principles for in-context example ordering guided by the model's probability predictions (see the sketch below).
We demonstrate that our approach outperforms the baselines by improving classification accuracy, reducing model miscalibration, and selecting better in-context examples.
arXiv Detail & Related papers (2024-02-18T04:08:10Z)
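As an illustration only (the paper's two principles are reduced here to a single criterion), one can score each ordering by how closely the model's predicted label distribution on a content-free probe matches the known label proportions; `predict_probs` is a hypothetical LLM query helper:

```python
import itertools
import math

def order_examples(demos, label_prior, predict_probs, probe="N/A"):
    """Try each permutation of the demos (feasible only for small k)
    and keep the one whose predicted label distribution on a
    content-free probe is closest, in KL divergence, to the known
    label proportions."""
    def kl(p, q):
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    best = min(itertools.permutations(demos),
               key=lambda order: kl(label_prior, predict_probs(order, probe)))
    return list(best)
```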
- Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel strategy based on greedy search to identify a near-optimal prompt that improves the performance of in-context learning (sketched below).
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
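A minimal sketch of such a greedy search, with `bias_metric(demos)` standing in as a hypothetical callable that wraps the paper's predictive-bias metric for a prompt built from `demos`:

```python
def greedy_prompt_search(candidates, bias_metric, k=4):
    """Greedily grow a demonstration set, at each step adding the
    candidate that yields the lowest predictive-bias score for the
    resulting prompt."""
    chosen = []
    pool = list(candidates)
    while len(chosen) < k and pool:
        best = min(pool, key=lambda c: bias_metric(chosen + [c]))
        chosen.append(best)
        pool.remove(best)
    return chosen
```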
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z)
- Learning from a Biased Sample [3.546358664345473]
We propose a method for learning a decision rule that minimizes the worst-case risk incurred under a family of test distributions (a generic form of this objective is shown below).
We empirically validate our proposed method in a case study on prediction of mental health scores from health survey data.
arXiv Detail & Related papers (2022-09-05T04:19:16Z)
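A generic form of this worst-case (distributionally robust) objective, where $\mathcal{Q}$ is the family of candidate test distributions and $\ell$ a loss; the paper's specific choice of family may differ:

```latex
\min_{f} \; \max_{Q \in \mathcal{Q}} \; \mathbb{E}_{(x, y) \sim Q}\!\left[ \ell\bigl(f(x), y\bigr) \right]
```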
- Nested Variational Inference [8.610608901689577]
We develop a family of methods that learn proposals for nested importance samplers by minimizing a KL divergence at each level of nesting.
We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
arXiv Detail & Related papers (2021-06-21T17:56:59Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
arXiv Detail & Related papers (2020-05-01T11:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.