Related papers: ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models

URL: http://arxiv.org/abs/2305.18323v1
Date: Tue, 23 May 2023 00:16:48 GMT
Title: ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models
Authors: Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, Dongkuan Xu
Abstract summary: We propose a modular paradigm ReWOO that detaches the reasoning process from external observations, thus significantly reducing token consumption. We show that ReWOO achieves 5x token efficiency and 4% accuracy improvement on HotpotQA, a multi-step reasoning benchmark. Our illustrative work offloads reasoning ability from 175B GPT3.5 into 7B LLaMA, demonstrating the significant potential for truly efficient and scalable ALM systems.
Score: 32.95155349925248
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution. Existing ALM systems trigger LLM thought processes while pulling observations from these tools in an interleaved fashion. Specifically, an LLM reasons to call an external tool, gets halted to fetch the tool's response, and then decides the next action based on all preceding response tokens. Such a paradigm, though straightforward and easy to implement, often leads to huge computation complexity from redundant prompts and repeated execution. This study addresses such challenges for the first time, proposing a modular paradigm ReWOO (Reasoning WithOut Observation) that detaches the reasoning process from external observations, thus significantly reducing token consumption. Comprehensive evaluations across six public NLP benchmarks and a curated dataset reveal consistent performance enhancements with our proposed methodology. Notably, ReWOO achieves 5x token efficiency and 4% accuracy improvement on HotpotQA, a multi-step reasoning benchmark. Furthermore, ReWOO demonstrates robustness under tool-failure scenarios. Beyond prompt efficiency, decoupling parametric modules from non-parametric tool calls enables instruction fine-tuning to offload LLMs into smaller language models, thus substantially reducing model parameters. Our illustrative work offloads reasoning ability from 175B GPT3.5 into 7B LLaMA, demonstrating the significant potential for truly efficient and scalable ALM systems.
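
The decoupling described in the abstract can be pictured with a small sketch: a plan is produced up front with placeholders for tool results, the tools are then run without calling the model, and the collected evidence is handed over once at the end. This is not the authors' implementation. The function names, the `#E1`-style placeholder syntax, the toy lookup tool, and the made-up facts are all assumptions chosen for illustration, and the two model calls are replaced by stubs so the example runs on its own.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical plan format: (evidence variable, tool name, tool input).
Step = Tuple[str, str, str]

# Toy knowledge base with made-up entries; stands in for a real retrieval tool.
TOY_FACTS: Dict[str, str] = {
    "capital of Freedonia": "Fredville",
    "mayor of Fredville": "R. Firefly",
}

TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: TOY_FACTS.get(query, "unknown"),
}


def plan(question: str) -> List[Step]:
    """Stand-in for the single planning call to a language model.

    A real planner would generate the steps from the question; this stub
    returns a fixed plan so the example runs without a model.
    """
    return [
        ("#E1", "lookup", "capital of Freedonia"),
        ("#E2", "lookup", "mayor of #E1"),  # refers to an earlier result
    ]


def run_plan(steps: List[Step], tools: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Execute tool calls in order, with no model call between steps."""
    evidence: Dict[str, str] = {}
    for var, tool_name, tool_input in steps:
        # Replace placeholders such as #E1 with the evidence gathered so far.
        resolved = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], tool_input)
        evidence[var] = tools[tool_name](resolved)
    return evidence


def solve(question: str, steps: List[Step], evidence: Dict[str, str]) -> str:
    """Stand-in for the final model call that reads the plan and evidence.

    This stub simply returns the evidence produced by the last step.
    """
    return evidence[steps[-1][0]]


if __name__ == "__main__":
    question = "Who is the mayor of the capital of Freedonia?"
    steps = plan(question)
    evidence = run_plan(steps, TOOLS)
    print(solve(question, steps, evidence))  # prints "R. Firefly"
```

In this sketch the number of model calls stays at two (one to plan, one to solve) however many tool steps the plan contains. An interleaved loop would instead call the model after each observation and resend the growing history every time, which is the redundancy the abstract points to.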

Related papers

The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models [69.798277882245]
We introduce Unsupervised Prefix Fine-Tuning (UPFT) to enhance large language models' reasoning efficiency. UPFT removes the need for labeled data or exhaustive sampling. Experiments show that UPFT matches the performance of supervised methods.
arXiv Detail & Related papers (2025-03-04T18:56:03Z)
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
Multi-Objective Large Language Model Unlearning [3.372396620898397]
Gradient Ascent (GA) is a proactive way to decrease the prediction probability of the model on the target data. We propose the Multi-Objective Large Language Model Unlearning (MOLLM) algorithm to overcome gradient explosion and catastrophic forgetting. Our empirical results verify that MOLLM outperforms the SOTA GA-based LLM unlearning methods in terms of unlearning effect and model utility preservation.
arXiv Detail & Related papers (2024-12-29T09:35:56Z)
Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach [31.654345704242512]
This paper introduces a novel, model-level judge-free self-improvement framework. Our approach employs a controlled feedback mechanism while eliminating the need for MLLMs in the verification loop. We achieve superior precision and recall with significantly lower computational demands.
arXiv Detail & Related papers (2024-11-26T00:44:37Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs' decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization [0.6445087473595953]
Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning. Deploying LLM inference poses challenges due to the high compute and memory requirements. We present Tender, an algorithm-hardware co-design solution that enables efficient deployment of LLM inference at low precision.
arXiv Detail & Related papers (2024-06-16T09:51:55Z)
SLMRec: Empowering Small Language Models for Sequential Recommendation [38.51895517016953]
The Sequential Recommendation task involves predicting the next item a user is likely to interact with, given their past interactions. Recent research demonstrates the great impact of LLMs on sequential recommendation systems. Due to the huge size of LLMs, it is inefficient and impractical to apply an LLM-based model in real-world platforms.
arXiv Detail & Related papers (2024-05-28T07:12:06Z)
Efficient Tool Use with Chain-of-Abstraction Reasoning [65.18096363216574]
Large language models (LLMs) need to ground their reasoning to real-world knowledge. There remain challenges for fine-tuning LLM agents to invoke tools in multi-step reasoning problems. We propose a new method for LLMs to better leverage tools in multi-step reasoning.
arXiv Detail & Related papers (2024-01-30T21:53:30Z)
Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately [2.1715455600756646]
Large Language Models (LLMs) generate responses to questions. Their effectiveness is often hindered by sub-optimal quality of answers and occasional failures to provide accurate responses to questions. To address these challenges, a fine-tuning process is employed, involving feedback and examples to refine models.
arXiv Detail & Related papers (2024-01-27T00:18:07Z)
CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark. In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship. We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
Improving Language Models via Plug-and-Play Retrieval Feedback [42.786225163763376]
Large language models (LLMs) exhibit remarkable performance across various NLP tasks. They often generate incorrect or hallucinated information, which hinders their practical applicability in real-world scenarios. We introduce ReFeed, a novel pipeline designed to enhance LLMs by providing automatic retrieval feedback in a plug-and-play framework.
arXiv Detail & Related papers (2023-05-23T12:29:44Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs). We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. We refer to them as Augmented Language Models (ALMs). The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)