STAYKATE: Hybrid In-Context Example Selection Combining Representativeness Sampling and Retrieval-based Approach -- A Case Study on Science Domains
- URL: http://arxiv.org/abs/2412.20043v1
- Date: Sat, 28 Dec 2024 06:13:50 GMT
- Title: STAYKATE: Hybrid In-Context Example Selection Combining Representativeness Sampling and Retrieval-based Approach -- A Case Study on Science Domains
- Authors: Chencheng Zhu, Kazutaka Shimada, Tomoki Taniguchi, Tomoko Ohkuma,
- Abstract summary: Large language models (LLMs) demonstrate the ability to learn in-context, offering a potential solution for scientific information extraction.
We propose STAYKATE, a static-dynamic hybrid selection method that combines the principles of representativeness sampling from active learning with the prevalent retrieval-based approach.
The results across three domain-specific datasets indicate that STAYKATE outperforms both the traditional supervised methods and existing selection methods.
- Score: 0.8296121350988481
- License:
- Abstract: Large language models (LLMs) demonstrate the ability to learn in-context, offering a potential solution for scientific information extraction, which often contends with challenges such as insufficient training data and the high cost of annotation processes. Given that the selection of in-context examples can significantly impact performance, it is crucial to design a proper method to sample the efficient ones. In this paper, we propose STAYKATE, a static-dynamic hybrid selection method that combines the principles of representativeness sampling from active learning with the prevalent retrieval-based approach. The results across three domain-specific datasets indicate that STAYKATE outperforms both the traditional supervised methods and existing selection methods. The enhancement in performance is particularly pronounced for entity types that other methods pose challenges.
Related papers
- The Power of Adaptation: Boosting In-Context Learning through Adaptive Prompting [8.260097638532878]
Large Language Models (LLMs) have demonstrated exceptional abilities across a broad range of language-related tasks.
We propose textscAdaptive-Prompt, a novel method that adaptively selects exemplars by leveraging model feedback.
Experimental results show that textscAdaptive-Prompt significantly enhances LLM performance across a variety of reasoning tasks.
arXiv Detail & Related papers (2024-12-23T15:49:43Z) - Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z) - Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of these exemplars in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on the performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z) - Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL)
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, textbfTopK + ConE, based on the assumption that textitthe performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z) - Multi-Task Learning with Summary Statistics [4.871473117968554]
We propose a flexible multi-task learning framework utilizing summary statistics from various sources.
We also present an adaptive parameter selection approach based on a variant of Lepski's method.
This work offers a more flexible tool for training related models across various domains, with practical implications in genetic risk prediction.
arXiv Detail & Related papers (2023-07-05T15:55:23Z) - RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches and learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z) - An Information-Theoretic Approach for Estimating Scenario Generalization
in Crowd Motion Prediction [27.10815774845461]
We propose a novel scoring method, which characterizes generalization of models trained on source crowd scenarios and applied to target crowd scenarios.
The Interaction component aims to characterize the difficulty of scenario domains, while the diversity of a scenario domain is captured in the Diversity score.
Our experimental results validate the efficacy of the proposed method on several simulated and real-world (source,target) generalization tasks.
arXiv Detail & Related papers (2022-11-02T01:39:30Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.