EEE-QA: Exploring Effective and Efficient Question-Answer
Representations
- URL: http://arxiv.org/abs/2403.02176v1
- Date: Mon, 4 Mar 2024 16:21:13 GMT
- Title: EEE-QA: Exploring Effective and Efficient Question-Answer
Representations
- Authors: Zhanghao Hu, Yijun Yang, Junjie Xu, Yifu Qiu, Pinzhen Chen
- Abstract summary: Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa.
This work challenges the existing question-answer encoding convention and explores finer representations.
- Score: 7.764629726412793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current approaches to question answering rely on pre-trained language models
(PLMs) like RoBERTa. This work challenges the existing question-answer encoding
convention and explores finer representations. We begin with testing various
pooling methods compared to using the begin-of-sentence token as a question
representation for better quality. Next, we explore opportunities to
simultaneously embed all answer candidates with the question. This enables
cross-reference between answer choices and improves inference throughput via
reduced memory usage. Despite their simplicity and effectiveness, these methods
have yet to be widely studied in current frameworks. We experiment with
different PLMs, and with and without the integration of knowledge graphs.
Results demonstrate the memory efficiency of the proposed techniques with little
sacrifice in performance. Practically, our work improves throughput by 38-100%,
with 26-65% speedups on consumer-grade GPUs, by allowing for considerably larger
batch sizes. Our work sends a message to the community with promising
directions in both representation quality and efficiency for the
question-answering task in natural language processing.
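The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, and the `</s>` separator string are assumptions chosen for illustration, and a real system would take the hidden states from a PLM such as RoBERTa.

```python
from typing import List, Sequence

Vector = List[float]


def bos_pool(token_vectors: Sequence[Vector]) -> Vector:
    """Use the first (begin-of-sentence) token's vector as the representation."""
    return list(token_vectors[0])


def mean_pool(token_vectors: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Average the vectors of non-padding tokens (mask value 1)."""
    kept = [v for v, m in zip(token_vectors, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]


def separate_inputs(question: str, choices: Sequence[str],
                    sep: str = " </s> ") -> List[str]:
    """Conventional encoding: one sequence per (question, choice) pair."""
    return [question + sep + choice for choice in choices]


def joint_input(question: str, choices: Sequence[str],
                sep: str = " </s> ") -> str:
    """Joint encoding: one sequence holding the question and every choice."""
    return sep.join([question, *choices])


# Toy encoder outputs (hypothetical values); the last row is padding.
hidden = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(bos_pool(hidden))         # [1.0, 0.0]
print(mean_pool(hidden, mask))  # [3.0, 2.0]
print(len(separate_inputs("Q", ["a", "b", "c"])))  # 3
print(joint_input("Q", ["a", "b", "c"], sep="|"))  # Q|a|b|c
```

In this sketch the joint form needs one encoder pass per question instead of one per answer choice, which is the kind of saving that would allow the larger batch sizes the abstract mentions.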
Related papers
- It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension [0.0]
Experimenting with 9 large language models across 3 datasets, we find that presenting the context before the question improves model performance.
We additionally find that the best method is surprisingly simple: it only requires concatenating a few tokens to the input and results in an accuracy improvement of up to 36%.
arXiv Detail & Related papers (2024-06-24T16:43:11Z)
- Active Learning with Task Adaptation Pre-training for Speech Emotion Recognition [17.59356583727259]
Speech emotion recognition (SER) has garnered increasing attention due to its wide range of applications.
We propose an active learning (AL)-based fine-tuning framework for SER, called After.
Our proposed method improves accuracy by 8.45% and reduces time consumption by 79%.
arXiv Detail & Related papers (2024-05-01T04:05:29Z)
- Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering [32.21000330743921]
We propose a novel framework that endows the model with capabilities of answering more general questions.
Specifically, a well-defined detector is adopted to predict image-question related relation phrases.
The optimal answer is predicted by choosing the supporting fact with the highest score.
arXiv Detail & Related papers (2023-12-20T02:35:18Z)
- Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
- OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
- Active Prompting with Chain-of-Thought for Large Language Models [26.5029080638055]
This paper proposes a new method, Active-Prompt, to adapt large language models to different tasks.
By borrowing ideas from the related problem of uncertainty-based active learning, we introduce several metrics to characterize the uncertainty.
Experimental results demonstrate the superiority of our proposed method, achieving state-of-the-art on eight complex reasoning tasks.
arXiv Detail & Related papers (2023-02-23T18:58:59Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering [38.071375112873675]
We propose a question-answer augmented encoder-decoder model and accompanying pretraining strategy.
This yields an end-to-end system that outperforms prior QA retrieval methods on single-hop QA tasks.
arXiv Detail & Related papers (2022-04-10T02:33:00Z)
- Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data [82.92758444543689]
Retrieval-based methods have been shown to be effective in NLP tasks via introducing external knowledge.
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
Experimental results show that this simple method can achieve significantly better performance on a variety of NLU and NLG tasks.
arXiv Detail & Related papers (2022-03-16T17:37:27Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals.
There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance.
We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.