Related papers: SEER : A Knapsack approach to Exemplar Selection for In-Context HybridQA

SEER : A Knapsack approach to Exemplar Selection for In-Context HybridQA

URL: http://arxiv.org/abs/2310.06675v2
Date: Fri, 20 Oct 2023 08:02:25 GMT
Title: SEER : A Knapsack approach to Exemplar Selection for In-Context HybridQA
Authors: Jonathan Tonglet, Manon Reusens, Philipp Borchert, Bart Baesens
Abstract summary: In this work, we present Selection of Exmplars for hybrid Reasoning (SEER), a novel method for selecting a set of exemplars that is both representative and diverse. The effectiveness of SEER is demonstrated on FinQA and TAT-QA, two real-world benchmarks for HybridQA, where it outperforms previous exemplar selection methods.
Score: 1.0323063834827413
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Question answering over hybrid contexts is a complex task, which requires the combination of information extracted from unstructured texts and structured tables in various ways. Recently, In-Context Learning demonstrated significant performance advances for reasoning tasks. In this paradigm, a large language model performs predictions based on a small set of supporting exemplars. The performance of In-Context Learning depends heavily on the selection procedure of the supporting exemplars, particularly in the case of HybridQA, where considering the diversity of reasoning chains and the large size of the hybrid contexts becomes crucial. In this work, we present Selection of ExEmplars for hybrid Reasoning (SEER), a novel method for selecting a set of exemplars that is both representative and diverse. The key novelty of SEER is that it formulates exemplar selection as a Knapsack Integer Linear Program. The Knapsack framework provides the flexibility to incorporate diversity constraints that prioritize exemplars with desirable attributes, and capacity constraints that ensure that the prompt size respects the provided capacity budgets. The effectiveness of SEER is demonstrated on FinQA and TAT-QA, two real-world benchmarks for HybridQA, where it outperforms previous exemplar selection methods.

Related papers

Context Attribution with Multi-Armed Bandit Optimization [11.715006981206844]
We propose a novel framework that formulates context attribution as a multi-armed bandit (CMAB) problem.<n>We employ Combinatorial Thompson Sampling (CTS) to efficiently explore the exponentially large space of context subsets under a limited query budget.<n>Our method defines a reward function based on normalized token likelihoods, capturing how well a subset of segments supports the original model response.
arXiv Detail & Related papers (2025-06-24T19:47:27Z)
Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts [67.67746334493302]
Large Language Models (LLMs) have demonstrated remarkable capabilities across numerous tasks, yet they often rely on external context to handle complex tasks. We propose a tri-encoder sequential retriever that models this process as a Markov Decision Process (MDP) We show that our method consistently and significantly outperforms baselines, underscoring the importance of explicitly modeling inter-example dependencies.
arXiv Detail & Related papers (2025-04-15T17:35:56Z)
STAYKATE: Hybrid In-Context Example Selection Combining Representativeness Sampling and Retrieval-based Approach -- A Case Study on Science Domains [0.8296121350988481]
Large language models (LLMs) demonstrate the ability to learn in-context, offering a potential solution for scientific information extraction. We propose STAYKATE, a static-dynamic hybrid selection method that combines the principles of representativeness sampling from active learning with the prevalent retrieval-based approach. The results across three domain-specific datasets indicate that STAYKATE outperforms both the traditional supervised methods and existing selection methods.
arXiv Detail & Related papers (2024-12-28T06:13:50Z)
The Power of Adaptation: Boosting In-Context Learning through Adaptive Prompting [8.260097638532878]
Large Language Models (LLMs) have demonstrated exceptional abilities across a broad range of language-related tasks. We propose textscAdaptive-Prompt, a novel method that adaptively selects exemplars by leveraging model feedback. Experimental results show that textscAdaptive-Prompt significantly enhances LLM performance across a variety of reasoning tasks.
arXiv Detail & Related papers (2024-12-23T15:49:43Z)
PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks [57.86928556668849]
Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL) ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of the most optimal examples a persistent research challenge. In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages.
arXiv Detail & Related papers (2024-12-07T17:51:31Z)
Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step. Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z)
Sub-SA: Strengthen In-context Learning via Submodular Selective Annotation [4.846839863393725]
We propose Sub-SA (Submodular Selective ), a sub-module-based selective annotation method. The aim of Sub-SA is to reduce annotation costs while improving the quality of in-context examples. We also propose RPR (Reward and Penalty Regularization) to better balance the diversity and representativeness of the unlabeled dataset.
arXiv Detail & Related papers (2024-07-08T07:47:30Z)
Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications. The quality of these exemplars in the prompt greatly impacts performance. Existing methods fail to adequately account for the impact of exemplar ordering on the performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z)
MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation [10.726734105960924]
Large language models (LLMs) have enabled in-context learning (ICL)-based methods that significantly outperform fine-tuning approaches for text-to- tasks. This study considers the sensitivity of LLMs to the prompts and introduces a novel approach that leverages multiple prompts to explore a broader search space for possible answers. We establish a new SOTA performance on the BIRD in terms of both the accuracy and efficiency of the generated queries.
arXiv Detail & Related papers (2024-05-13T04:59:32Z)
Adapting Pre-trained Generative Models for Extractive Question Answering [4.993041970406846]
We introduce a novel approach that uses the power of pre-trained generative models to address extractive QA tasks. We demonstrate the superior performance of our proposed approach compared to existing state-of-the-art models.
arXiv Detail & Related papers (2023-11-06T09:01:02Z)
Diversify Question Generation with Retrieval-Augmented Style Transfer [68.00794669873196]
We propose RAST, a framework for Retrieval-Augmented Style Transfer. The objective is to utilize the style of diverse templates for question generation. We develop a novel Reinforcement Learning (RL) based approach that maximizes a weighted combination of diversity reward and consistency reward.
arXiv Detail & Related papers (2023-10-23T02:27:31Z)
HRoT: Hybrid prompt strategy and Retrieval of Thought for Table-Text Hybrid Question Answering [13.026990720973703]
In this paper, we introduce a new prompting strategy called Hybrid prompt strategy and Retrieval of Thought for TextTableQA. Our method achieves superior performance compared to the fully-supervised SOTA on the MultiHiertt dataset in the few-shot setting.
arXiv Detail & Related papers (2023-09-22T07:26:17Z)
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization [55.60306377044225]
State-of-the-art summarization systems can generate highly fluent summaries. These summaries, however, may contain factual inconsistencies and/or information not present in the source. We introduce an alternative scheme based on standard information-theoretic measures in which the information present in the source and summary is directly compared.
arXiv Detail & Related papers (2023-01-28T23:08:25Z)
Reasoning over Hybrid Chain for Table-and-Text Open Domain QA [69.8436986668218]
We propose a ChAin-centric Reasoning and Pre-training framework (CARP) CARP utilizes hybrid chain to model the explicit intermediate reasoning process across table and text for question answering. We also propose a novel chain-centric pre-training method, to enhance the pre-trained model in identifying the cross-modality reasoning process.
arXiv Detail & Related papers (2022-01-15T16:11:55Z)
Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks. We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.