Answering Unseen Questions With Smaller Language Models Using Rationale
Generation and Dense Retrieval
- URL: http://arxiv.org/abs/2308.04711v3
- Date: Thu, 12 Oct 2023 21:25:15 GMT
- Title: Answering Unseen Questions With Smaller Language Models Using Rationale
Generation and Dense Retrieval
- Authors: Tim Hartill, Diana Benavides-Prado, Michael Witbrock, Patricia J.
Riddle
- Abstract summary: We evaluate two methods for further improving smaller Language Models' reasoning on questions unseen in training.
Both focus on combining rationales generated by a larger Language Model with longer contexts created from a multi-hop dense retrieval system.
Our single best Reasoning model materially improves upon strong comparable prior baselines for unseen evaluation datasets.
- Score: 9.136948771060895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When provided with sufficient explanatory context, smaller Language Models
have been shown to exhibit strong reasoning ability on challenging short-answer
question-answering tasks where the questions are unseen in training. We
evaluate two methods for further improvement in this setting. Both methods
focus on combining rationales generated by a larger Language Model with longer
contexts created from a multi-hop dense retrieval system. The first method
($\textit{RR}$) involves training a Rationale Ranking model to score both
generated rationales and retrieved contexts with respect to relevance and
truthfulness. We then use the scores to derive combined contexts from both
knowledge sources using a number of combinatory strategies. For the second
method ($\textit{RATD}$) we utilise retrieval-augmented training datasets
developed by Hartill et al. 2023 to train a smaller Reasoning model such that
it becomes proficient at utilising relevant information from longer text
sequences that may be only partially evidential and frequently contain many
irrelevant sentences. We find that both methods significantly improve results.
Our single best Reasoning model materially improves upon strong comparable
prior baselines for unseen evaluation datasets (StrategyQA 58.9 $\rightarrow$
61.7 acc., CommonsenseQA 63.6 $\rightarrow$ 72.7 acc., ARC-DA 31.6
$\rightarrow$ 52.1 F1, IIRC 25.5 $\rightarrow$ 27.3 F1) and a version utilising
our prior knowledge of each type of question in selecting a context combination
strategy does even better. Our proposed models also generally outperform direct
prompts against much larger models (BLOOM 175B and StableVicuna 13B) in both
few-shot chain-of-thought and standard few-shot settings.
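To make the $\textit{RR}$ combination step concrete, the following is a minimal sketch rather than the authors' implementation: `rr_score` is a hypothetical stand-in for the trained Rationale Ranking model, and the single-threshold strategy shown is only one of the several combinatory strategies the paper explores; the threshold value is illustrative.

```python
# Sketch of an RR-style context combination (assumptions noted above).
from typing import Callable

def combine_contexts(
    question: str,
    rationale: str,           # generated by the larger Language Model
    retrieved_context: str,   # built by the multi-hop dense retriever
    rr_score: Callable[[str, str], float],  # hypothetical RR model wrapper
    threshold: float = 0.5,   # illustrative value, not from the paper
) -> str:
    r_score = rr_score(question, rationale)
    c_score = rr_score(question, retrieved_context)
    if r_score >= threshold and c_score >= threshold:
        # Both sources look relevant and truthful: use them together.
        return rationale + " " + retrieved_context
    # Otherwise keep only the higher-scoring knowledge source.
    return rationale if r_score >= c_score else retrieved_context
```

The combined context is then supplied to the smaller Reasoning model together with the question.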
Related papers
- FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions.
We use Large Language Models to plan a solution and to soft-formalise the query into facts and predicates expressed as logic programming code.
Our method allows us to compute the faithfulness of the reasoning process w.r.t. the generated code and analyse the steps of the multi-hop search without relying on external solvers.
arXiv Detail & Related papers (2024-10-14T19:39:11Z)
- Advancing LLM Reasoning Generalists with Preference Trees [119.57169648859707]
We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning.
Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks.
arXiv Detail & Related papers (2024-04-02T16:25:30Z)
- Teaching Smaller Language Models To Generalise To Unseen Compositional Questions [6.9076450524134145]
We propose a combination of multitask pretraining on up to 93 tasks designed to instill diverse reasoning abilities.
We show that performance can be significantly improved by adding retrieval-augmented training datasets.
arXiv Detail & Related papers (2023-08-02T05:00:12Z)
- Adapting Neural Link Predictors for Data-Efficient Complex Query Answering [45.961111441411084]
We propose a parameter-efficient score adaptation model optimised to re-calibrate neural link prediction scores for the complex query answering task.
CQD$^{\mathcal{A}}$ produces significantly more accurate results than current state-of-the-art methods.
arXiv Detail & Related papers (2023-01-29T00:17:16Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical matching-based approach achieves similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
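As a rough illustration of fusing retrievals by generation probability, here is a minimal sketch under assumed inputs: each sampled contextual clue carries the language model's probability of generating it, and each clue has its own ranked retrieval run; the reciprocal-rank weighting is an assumption, not the paper's exact formulation.

```python
# Sketch of probability-weighted fusion of per-clue retrieval runs.
from collections import defaultdict
from typing import Dict, List, Tuple

def fuse_runs(clue_runs: List[Tuple[float, List[str]]]) -> List[str]:
    """clue_runs: (generation probability of the clue, ranked doc ids)."""
    scores: Dict[str, float] = defaultdict(float)
    for prob, docs in clue_runs:
        for rank, doc_id in enumerate(docs):
            # A document inherits credit from every clue that retrieved
            # it, scaled by that clue's generation probability.
            scores[doc_id] += prob / (rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```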
arXiv Detail & Related papers (2022-10-13T15:18:04Z)
- Teaching Broad Reasoning Skills via Decomposition-Guided Contexts [50.114651561111245]
Question-answering datasets require a broad set of reasoning skills.
We show how to use question decompositions to teach these broad reasoning skills in a robust fashion.
arXiv Detail & Related papers (2022-05-25T05:13:21Z)
- STaR: Bootstrapping Reasoning With Reasoning [39.45372621632046]
"Self-Taught Reason" (STaR) relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples.
We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers.
arXiv Detail & Related papers (2022-03-28T03:12:15Z)
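For readers unfamiliar with the loop summarised above, here is a minimal sketch of STaR-style bootstrapping; the callables are hypothetical stand-ins, and the paper's additional "rationalization" retry (hinting with the gold answer when the first attempt fails) is omitted.

```python
# Sketch of a STaR-style bootstrapping loop (assumptions noted above).
from typing import Callable, List, Tuple

def star(
    model: object,
    dataset: List[Tuple[str, str]],                      # (question, gold answer)
    generate: Callable[[object, str], Tuple[str, str]],  # -> (rationale, answer)
    finetune: Callable[[object, list], object],
    n_rounds: int = 3,
) -> object:
    for _ in range(n_rounds):
        keep = []
        for question, gold in dataset:
            rationale, predicted = generate(model, question)
            if predicted == gold:
                # Keep only rationales that led to the correct answer.
                keep.append((question, rationale, gold))
        # Fine-tune on the self-generated, answer-verified rationales.
        model = finetune(model, keep)
    return model
```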