EEE-QA: Exploring Effective and Efficient Question-Answer
Representations
- URL: http://arxiv.org/abs/2403.02176v1
- Date: Mon, 4 Mar 2024 16:21:13 GMT
- Title: EEE-QA: Exploring Effective and Efficient Question-Answer
Representations
- Authors: Zhanghao Hu, Yijun Yang, Junjie Xu, Yifu Qiu, Pinzhen Chen
- Abstract summary: Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa.
This work challenges the existing question-answer encoding convention and explores finer representations.
- Score: 7.764629726412793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current approaches to question answering rely on pre-trained language models
(PLMs) like RoBERTa. This work challenges the existing question-answer encoding
convention and explores finer representations. We begin with testing various
pooling methods compared to using the begin-of-sentence token as a question
representation for better quality. Next, we explore opportunities to
simultaneously embed all answer candidates with the question. This enables
cross-reference between answer choices and improves inference throughput via
reduced memory usage. Despite their simplicity and effectiveness, these methods
have yet to be widely studied in current frameworks. We experiment with
different PLMs, and with and without the integration of knowledge graphs.
Results demonstrate the memory efficacy of the proposed techniques with little
sacrifice in performance. Practically, our work improves throughput by 38-100%
with 26-65% speedups on consumer-grade GPUs by allowing for considerably larger
batch sizes. Our work sends a message to the community with promising
directions in both representation quality and efficiency for the
question-answering task in natural language processing.
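
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the specific pooling variants (mean and max), the `</s>` separator, and all function names here are assumptions chosen for illustration, and token vectors are plain Python lists rather than PLM outputs.

```python
from typing import List, Sequence

Vector = List[float]


def bos_pool(token_vecs: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Baseline: take the begin-of-sentence (first) token vector as-is."""
    return list(token_vecs[0])


def mean_pool(token_vecs: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Average the vectors of all non-padding tokens (mask value 1)."""
    kept = [v for v, m in zip(token_vecs, mask) if m]
    return [sum(col) / len(kept) for col in zip(*kept)]


def max_pool(token_vecs: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Take the element-wise maximum over all non-padding tokens."""
    kept = [v for v, m in zip(token_vecs, mask) if m]
    return [max(col) for col in zip(*kept)]


def separate_inputs(question: str, candidates: Sequence[str],
                    sep: str = " </s> ") -> List[str]:
    """Conventional encoding: one sequence per (question, candidate) pair."""
    return [question + sep + c for c in candidates]


def joint_input(question: str, candidates: Sequence[str],
                sep: str = " </s> ") -> str:
    """Joint encoding: the question and all candidates in one sequence,
    so the encoder can attend across answer choices in a single pass."""
    return question + sep + sep.join(candidates)
```

With five answer candidates, `separate_inputs` yields five sequences per question while `joint_input` yields one, which is the kind of reduction that leaves room for larger batch sizes. The actual memory and speed gains depend on the model and sequence lengths and are not reproduced by this sketch.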
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- EfficientEQA: An Efficient Approach for Open Vocabulary Embodied Question Answering [21.114403949257934]
Embodied Question Answering (EQA) is an essential yet challenging task for robotic home assistants.
Recent studies have shown that large vision-language models (VLMs) can be effectively utilized for EQA, but existing works either focus on video-based question answering or rely on closed-form choice sets.
We propose a novel framework called EfficientEQA for open-vocabulary EQA, which enables efficient exploration and accurate answering.
arXiv Detail & Related papers (2024-10-26T19:48:47Z)
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and come up with a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average considering both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension [0.0]
Experimenting with 9 large language models across 3 datasets, we find that presenting the context before the question improves model performance.
We additionally find that the best method is surprisingly simple: it only requires concatenating a few tokens to the input and results in an accuracy improvement of up to 36%.
arXiv Detail & Related papers (2024-06-24T16:43:11Z)
- Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
- Active Prompting with Chain-of-Thought for Large Language Models [26.5029080638055]
This paper proposes a new method, Active-Prompt, to adapt large language models to different tasks.
By borrowing ideas from the related problem of uncertainty-based active learning, we introduce several metrics to characterize the uncertainty.
Experimental results demonstrate the superiority of our proposed method, achieving state-of-the-art on eight complex reasoning tasks.
arXiv Detail & Related papers (2023-02-23T18:58:59Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering [38.071375112873675]
We propose a question-answer augmented encoder-decoder model and accompanying pretraining strategy.
This yields an end-to-end system that outperforms prior QA retrieval methods on single-hop QA tasks.
arXiv Detail & Related papers (2022-04-10T02:33:00Z)
- Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data [82.92758444543689]
Retrieval-based methods have been shown to be effective in NLP tasks via introducing external knowledge.
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
Experimental results show that this simple method can achieve significantly better performance on a variety of NLU and NLG tasks.
arXiv Detail & Related papers (2022-03-16T17:37:27Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)