ParaICL: Towards Robust Parallel In-Context Learning
- URL: http://arxiv.org/abs/2404.00570v1
- Date: Sun, 31 Mar 2024 05:56:15 GMT
- Title: ParaICL: Towards Robust Parallel In-Context Learning
- Authors: Xingxuan Li, Xuan-Phi Nguyen, Shafiq Joty, Lidong Bing,
- Abstract summary: Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL)
- Score: 74.38022919598443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have become the norm in natural language processing (NLP), excelling in few-shot in-context learning (ICL) with their remarkable abilities. Nonetheless, the success of ICL largely hinges on the choice of few-shot demonstration examples, making the selection process increasingly crucial. Existing methods have delved into optimizing the quantity and semantic similarity of these examples to improve ICL performances. However, our preliminary experiments indicate that the effectiveness of ICL is limited by the length of the input context. Moreover, varying combinations of few-shot demonstration examples can significantly boost accuracy across different test samples. To address this, we propose a novel method named parallel in-context learning (ParaICL) that effectively utilizes all demonstration examples without exceeding the manageable input context length. ParaICL employs parallel batching to distribute demonstration examples into different batches according to the semantic similarities of the questions in the demonstrations to the test question. It then computes normalized batch semantic scores for each batch. A weighted average semantic objective, constrained by adaptive plausibility, is applied to select the most appropriate tokens. Through extensive experiments, we validate the effectiveness of ParaICL and conduct ablation studies to underscore its design rationale. We further demonstrate that ParaICL can seamlessly integrate with existing methods.
Related papers
- Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language models (LLMs)
This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z) - Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of these exemplars in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on the performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z) - Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities [15.931776592470895]
Large language models (LLMs) can adapt to new tasks through in-context learning (ICL)
This paper proposes a novel Bayesian in-Context example Selection method (ByCS) for ICL.
arXiv Detail & Related papers (2024-04-23T03:42:48Z) - Experimental Design for Active Transductive Inference in Large Language Models [18.2671641610825]
We use active learning for adaptive prompt design and call it Active In-context Prompt Design (AIPD)
We design the LLM prompt by adaptively choosing few-shot examples from a training set to optimize performance on a test set.
We propose two algorithms, GO and SAL, which differ in how the few-shot examples are chosen.
arXiv Detail & Related papers (2024-04-12T23:27:46Z) - Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL)
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, textbfTopK + ConE, based on the assumption that textitthe performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z) - Not All Demonstration Examples are Equally Beneficial: Reweighting
Demonstration Examples for In-Context Learning [32.29118942982609]
Large Language Models (LLMs) have recently gained the In-Context Learning (ICL) ability with the models scaling up.
This paper investigates how to determine approximately optimal weights for demonstration examples and how to apply them during ICL.
Experimental results on 8 text classification tasks show that our approach outperforms conventional ICL by a large margin.
arXiv Detail & Related papers (2023-10-12T13:15:11Z) - In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks.
We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs)
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z) - Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.