Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning
- URL: http://arxiv.org/abs/2310.18338v2
- Date: Tue, 27 Feb 2024 13:24:06 GMT
- Title: Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning
- Authors: Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty
- Abstract summary: Large Language Models (LLMs) prompted to generate chain-of-thought exhibit impressive reasoning capabilities.
We introduce DaSLaM, which uses a decomposition generator to decompose complex problems into subproblems that require fewer reasoning steps.
We show that DaSLaM is not limited by the solver's capabilities as a function of scale.
- Score: 41.03267013352519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) prompted to generate chain-of-thought (CoT)
exhibit impressive reasoning capabilities. Recent attempts at prompt
decomposition toward solving complex, multi-step reasoning problems depend on
the ability of the LLM to simultaneously decompose and solve the problem. A
significant disadvantage is that foundational LLMs are typically not available
for fine-tuning, making adaptation computationally prohibitive. We believe (and
demonstrate) that problem decomposition and solution generation are distinct
capabilities, better addressed by separate modules than by one monolithic LLM.
We introduce DaSLaM, which uses a decomposition generator to decompose complex
problems into subproblems that require fewer reasoning steps. These subproblems
are answered by a solver. We use a relatively small (13B parameters) LM as the
decomposition generator, which we train using policy gradient optimization to
interact with a solver LM (regarded as a black box) and guide it through
subproblems, thereby rendering our method solver-agnostic. Evaluation on
multiple reasoning datasets reveals that, with our method, a 175
billion parameter LM (text-davinci-003) can produce competitive or even better
performance compared to its orders-of-magnitude larger successor, GPT-4.
Additionally, we show that DaSLaM is not limited by the solver's capabilities
as a function of scale; e.g., solver LMs of diverse sizes show significant
performance improvements with our solver-agnostic decomposition technique.
Exhaustive ablation studies evince the superiority of our modular fine-tuning
technique over exorbitantly large decomposer LLMs based on prompting alone.
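In outline, the inference-time interaction reduces to a plain loop. Below is a minimal sketch, assuming generic text-in/text-out interfaces and illustrative prompt formats (not the paper's exact implementation):

```python
from typing import Callable, List

# Hypothetical interfaces: the paper treats the solver as a black box, so any
# text-in/text-out LM fits. Prompt wording here is illustrative only.
Decomposer = Callable[[str, str], List[str]]  # (problem, solver draft) -> subquestions
Solver = Callable[[str], str]                 # prompt -> completion

def daslam_style_solve(problem: str, decompose: Decomposer, solve: Solver) -> str:
    """Decompose-then-guide loop: the small decomposer LM reads the solver's
    first attempt, proposes easier subproblems, the solver answers each in
    turn, and the accumulated subanswers condition a final revised answer."""
    draft = solve(f"Q: {problem}\nA:")          # solver's unaided attempt
    context = ""
    for sub_q in decompose(problem, draft):
        sub_a = solve(f"{context}Q: {sub_q}\nA:")
        context += f"Q: {sub_q}\nA: {sub_a}\n"
    return solve(f"{context}Q: {problem}\nA:")  # final answer with guidance
```

At training time the decomposer is optimized with policy gradient so that its subquestions steer the solver toward the correct final answer; at inference only forward passes like the ones above are needed.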
Related papers
- OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning.
We propose OptiBench, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z) - Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization [0.6445087473595953]
Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning.
Deploying LLM inference, however, poses challenges due to its high compute and memory requirements.
We present Tender, an algorithm-hardware co-design solution that enables efficient deployment of LLM inference at low precision.
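The co-design itself is hardware-specific, but the core numeric trick, scales constrained to be powers of two apart so that runtime requantization becomes a bit shift, can be sketched generically (an illustrative scheme, not Tender's exact decomposition):

```python
import numpy as np

def quantize_by_channel_groups(x: np.ndarray, bits: int = 8, max_shift: int = 3):
    """Illustrative sketch: quiet channels get finer scales, loud channels
    coarser ones, with all scales constrained to be powers of two apart."""
    qmax = 2 ** (bits - 1) - 1
    amax = max(float(np.abs(x).max()), 1e-12)          # global magnitude
    base_scale = amax / qmax                           # loudest channel's scale
    ch_max = np.maximum(np.abs(x).max(axis=0), 1e-12)  # per-channel magnitude
    shifts = np.clip(np.floor(np.log2(amax / ch_max)), 0, max_shift).astype(int)
    scales = base_scale / (2 ** shifts)                # power-of-two-related scales
    q = np.clip(np.round(x / scales), -qmax - 1, qmax).astype(np.int8)
    return q, shifts

def requantize(q: np.ndarray, shifts: np.ndarray) -> np.ndarray:
    # Align every channel to the base scale using integer shifts only, so the
    # whole tensor can enter a single integer matmul with one shared scale.
    return q.astype(np.int32) >> shifts
```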
arXiv Detail & Related papers (2024-06-16T09:51:55Z) - Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models [79.46938238953916]
Fine-tuning large language models (LLMs) to diverse applications is crucial to meet complex demands.
Recent studies suggest decomposing a fine-tuned LLM into a base model and corresponding delta weights, which are then compressed using low-rank or low-bit approaches to reduce costs.
In this work, we observe that existing low-rank and low-bit compression methods can significantly harm the model performance for task-specific fine-tuned LLMs.
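For context, the low-rank baseline that this observation concerns can be sketched in a few lines (Delta-CoMe's own contribution, mixed bit widths across singular vectors, is deliberately omitted here):

```python
import numpy as np

def compress_delta_low_rank(w_finetuned: np.ndarray, w_base: np.ndarray, rank: int):
    """Keep only a rank-r factorization of the fine-tuning delta.
    Reconstruct the fine-tuned weight as w_base + a @ b."""
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # (d_out, r), singular values folded in
    b = vt[:rank]                # (r, d_in)
    return a, b
```

Per matrix, storage drops from O(d_out * d_in) for the raw delta to O(r * (d_out + d_in)); the paper's observation is that truncating too aggressively, in rank or in bit width, measurably hurts task-specific models.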
arXiv Detail & Related papers (2024-06-13T07:57:27Z) - $\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning [22.810441504080703]
Large Language Models (LLMs) often lose track of complex, multi-step reasoning.
This paper proposes LM2 to address these challenges.
LM2 modularizes the decomposition, solution, and verification into three different language models.
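A minimal sketch of such a three-model society, assuming plain text-in/text-out interfaces and illustrative prompts (the paper's actual prompting and control flow may differ):

```python
from typing import Callable

LM = Callable[[str], str]  # any text-in/text-out model

def lm2_style_pipeline(problem: str, decomposer: LM, solver: LM, verifier: LM,
                       max_retries: int = 2) -> str:
    """One LM proposes subquestions, one answers them, and one checks each
    answer, asking the solver to retry when a check fails."""
    subqs = decomposer(f"Break into subquestions:\n{problem}").splitlines()
    context = ""
    for sq in (s.strip() for s in subqs if s.strip()):
        for _ in range(max_retries + 1):
            ans = solver(f"{context}Q: {sq}\nA:")
            verdict = verifier(f"Question: {sq}\nAnswer: {ans}\nValid? yes/no:")
            if verdict.strip().lower().startswith("yes"):
                break  # accept the first answer the verifier endorses
        context += f"Q: {sq}\nA: {ans}\n"
    return solver(f"{context}Q: {problem}\nA:")
```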
arXiv Detail & Related papers (2024-04-02T19:23:10Z) - Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems [76.69936664916061]
We study how the number of LM calls affects the performance of Vote and Filter-Vote, two simple compound designs that aggregate LM responses by majority voting.
We find, surprisingly, that across multiple language tasks, the performance of both Vote and Filter-Vote can first increase but then decrease as a function of the number of LM calls.
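Both strategies are simple to state; in the sketch below the filter predicate is an assumed stand-in for whatever screening Filter-Vote applies:

```python
from collections import Counter
from typing import Callable, List

def vote(answers: List[str]) -> str:
    """Vote: majority answer over K independent LM calls."""
    return Counter(answers).most_common(1)[0][0]

def filter_vote(answers: List[str], keep: Callable[[str], bool]) -> str:
    """Filter-Vote: discard answers failing a check, then take the majority,
    falling back to a plain vote if the filter rejects everything."""
    kept = [a for a in answers if keep(a)]
    return vote(kept if kept else answers)
```

The non-monotonicity reported above has an intuitive reading: extra calls sharpen the majority on queries the model usually gets right, but on queries it usually gets wrong they entrench the wrong majority.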
arXiv Detail & Related papers (2024-03-04T19:12:48Z) - Divide-or-Conquer? Which Part Should You Distill Your LLM? [38.62667131299918]
We devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase.
We show that the strategy is able to outperform a single stage solution.
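A minimal sketch of distilling just the decomposition phase into a smaller student, with the prompt and data format as illustrative assumptions:

```python
from typing import Callable, List, Tuple

LM = Callable[[str], str]

def build_decomposition_distillation_set(problems: List[str],
                                         teacher: LM) -> List[Tuple[str, str]]:
    """Have a strong teacher write subquestions for each problem; the
    (problem, subquestions) pairs then fine-tune a much smaller student,
    while the solving phase can stay with a larger model."""
    return [(p, teacher(f"List the subquestions needed to solve:\n{p}"))
            for p in problems]
```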
arXiv Detail & Related papers (2024-02-22T22:28:46Z) - Frugal LMs Trained to Invoke Symbolic Solvers Achieve
Parameter-Efficient Arithmetic Reasoning [36.8749786658624]
Large Language Models (LLMs) exhibit zero-shot mathematical reasoning capacity as a behavior that emerges with scale.
We show that small LMs can achieve reasonable arithmetic reasoning if arithmetic word problems are posed as a formalize-then-solve task.
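A minimal sketch of the formalize-then-solve pattern, with SymPy standing in as the symbolic solver and a hand-written equation standing in for what the small LM would emit:

```python
import sympy

def solve_formalization(formalized: str):
    """Hand the LM's formal output to a symbolic solver; the LM only needs
    to translate words into an equation, not to do arithmetic itself."""
    x = sympy.Symbol("x")
    return sympy.solve(sympy.sympify(formalized), x)

# "John fills 3 bags with x marbles each, keeps 5 loose, and has 20 in total."
# A formalize-then-solve LM would emit something like:
print(solve_formalization("Eq(3*x + 5, 20)"))  # [5]
```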
arXiv Detail & Related papers (2023-12-09T13:20:49Z) - Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z) - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
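A minimal sketch of the declarative pattern with the Z3 SMT solver; the toy constraints stand in for a specification the LLM would generate:

```python
from z3 import Int, Solver, sat

# Declarative specification for a toy word problem ("Alice is 3 years older
# than Bob; their ages sum to 21"): constraints only, no solution procedure.
alice, bob = Int("alice"), Int("bob")
s = Solver()
s.add(alice == bob + 3, alice + bob == 21)

if s.check() == sat:
    m = s.model()
    print(m[alice], m[bob])  # 12 9
```

Because the specification is declarative, an inconsistent set of constraints surfaces as an unsat verdict from the solver rather than as a silently wrong answer.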
arXiv Detail & Related papers (2023-05-16T17:55:51Z) - Is a Question Decomposition Unit All We Need? [20.66688303609522]
We investigate whether humans can decompose a hard question into a set of simpler questions that are relatively easier for models to solve.
We analyze a range of datasets involving various forms of reasoning and find that it is indeed possible to significantly improve model performance.
Our findings indicate that Human-in-the-loop Question Decomposition (HQD) can potentially provide an alternate path to building large LMs.
arXiv Detail & Related papers (2022-05-25T07:24:09Z)