ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
- URL: http://arxiv.org/abs/2312.10003v1
- Date: Fri, 15 Dec 2023 18:20:15 GMT
- Title: ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
- Authors: Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila
Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh
Srinivasan, Manzil Zaheer, Felix Yu, Sanjiv Kumar
- Abstract summary: We develop a ReAct-style LLM agent with the ability to reason and act upon external knowledge.
We refine the agent through a ReST-like method that iteratively trains on previous trajectories.
Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model.
- Score: 50.508669199496474
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Answering complex natural language questions often necessitates multi-step
reasoning and integrating external information. Several systems have combined
knowledge retrieval with a large language model (LLM) to answer such questions.
These systems, however, suffer from various failure cases, and we cannot
directly train them end-to-end to fix such failures, as interaction with
external knowledge is non-differentiable. To address these deficiencies, we
define a ReAct-style LLM agent with the ability to reason and act upon external
knowledge. We further refine the agent through a ReST-like method that
iteratively trains on previous trajectories, employing growing-batch
reinforcement learning with AI feedback for continuous self-improvement and
self-distillation. Starting from a prompted large model and after just two
iterations of the algorithm, we can produce a fine-tuned small model that
achieves comparable performance on challenging compositional question-answering
benchmarks with two orders of magnitude fewer parameters.
Related papers
- Disentangling Memory and Reasoning Ability in Large Language Models [97.26827060106581]
We propose a new inference paradigm that decomposes the complex inference process into two distinct and clear actions.
Our experiment results show that this decomposition improves model performance and enhances the interpretability of the inference process.
arXiv Detail & Related papers (2024-11-20T17:55:38Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - RAG-Modulo: Solving Sequential Tasks using Experience, Critics, and Language Models [5.0741409008225755]
Large language models (LLMs) have emerged as promising tools for solving challenging robotic tasks.
Most existing LLM-based agents lack the ability to retain and learn from past interactions.
We propose RAG-Modulo, a framework that enhances LLM-based agents with a memory of past interactions and incorporates critics to evaluate the agents' decisions.
arXiv Detail & Related papers (2024-09-18T20:03:32Z) - Recursive Introspection: Teaching Language Model Agents How to Self-Improve [30.086494067593268]
We develop RISE: Recursive IntroSpEction, an approach for fine-tuning large language models.
Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on math reasoning tasks.
arXiv Detail & Related papers (2024-07-25T17:35:59Z) - MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model [4.173772253427094]
Large Language Models (LLMs) often struggle with hallucinations and outdated information.
To address this, Information Retrieval (IR) systems can be employed to augment LLMs with up-to-date knowledge.
We propose an approach that leverages learning-to-rank techniques to combine heterogeneous IR systems.
arXiv Detail & Related papers (2024-06-09T11:00:01Z) - UniRQR: A Unified Model for Retrieval Decision, Query, and Response
Generation in Internet-Based Knowledge Dialogue Systems [8.724141214921314]
Knowledge-based dialogue systems with internet retrieval can be typically segmented into three tasks: Retrieval Decision, Query Generation, and Response Generation.
Our work addresses this oversight by employing a single unified model facilitated by prompt and multi-task learning approaches.
By integrating these functions, our system leverages the full potential of pre-trained models and reduces the complexity and costs associated with deploying multiple models.
arXiv Detail & Related papers (2024-01-11T06:09:15Z) - R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face their challenges.
Previous instruction tuning methods force the model to complete a sentence no matter whether the model knows the knowledge or not.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning)
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z) - Self-RAG: Learning to Retrieve, Generate, and Critique through
Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG)
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z) - Rethinking with Retrieval: Faithful Large Language Model Inference [91.66406351103484]
We propose a novel post-processing approach, rethinking with retrieval (RR)
RR retrieves relevant external knowledge based on the reasoning steps obtained from the chain-of-thought prompting.
We evaluate the effectiveness of RR through extensive experiments with GPT-3 on three complex reasoning tasks.
arXiv Detail & Related papers (2022-12-31T22:35:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.