Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning
- URL: http://arxiv.org/abs/2310.18338v2
- Date: Tue, 27 Feb 2024 13:24:06 GMT
- Title: Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning
- Authors: Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty
- Abstract summary: Large Language Models (LLMs) prompted to generate chain-of-thought exhibit impressive reasoning capabilities.
We introduce DaSLaM, which uses a decomposition generator to decompose complex problems into subproblems that require fewer reasoning steps.
We show that DaSLaM is not limited by the solver's capabilities as a function of scale.
- Score: 41.03267013352519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) prompted to generate chain-of-thought (CoT)
exhibit impressive reasoning capabilities. Recent attempts at prompt
decomposition toward solving complex, multi-step reasoning problems depend on
the ability of the LLM to simultaneously decompose and solve the problem. A
significant disadvantage is that foundational LLMs are typically not available
for fine-tuning, making adaptation computationally prohibitive. We believe (and
demonstrate) that problem decomposition and solution generation are distinct
capabilities, better addressed by separate modules than by one monolithic LLM.
We introduce DaSLaM, which uses a decomposition generator to decompose complex
problems into subproblems that require fewer reasoning steps. These subproblems
are answered by a solver. We use a relatively small (13B parameters) LM as the
decomposition generator, which we train using policy gradient optimization to
interact with a solver LM (regarded as a black box) and guide it through
subproblems, thereby rendering our method solver-agnostic. Evaluation on
multiple reasoning datasets reveals that, with our method, a 175
billion parameter LM (text-davinci-003) can produce competitive or even better
performance compared to its orders-of-magnitude larger successor, GPT-4.
Additionally, we show that DaSLaM is not limited by the solver's capabilities
as a function of scale; e.g., solver LMs of diverse sizes show significant
performance improvements with our solver-agnostic decomposition technique.
Exhaustive ablation studies evince the superiority of our modular fine-tuning
technique over exorbitantly large decomposer LLMs based on prompting alone.
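In outline, the inference-time interaction reduces to a plain loop. Below is a minimal sketch, assuming generic text-in/text-out interfaces and illustrative prompt formats (not the paper's exact implementation):

```python
from typing import Callable, List

# Hypothetical interfaces: the paper treats the solver as a black box, so any
# text-in/text-out LM fits. Prompt wording here is illustrative only.
Decomposer = Callable[[str, str], List[str]]  # (problem, solver draft) -> subquestions
Solver = Callable[[str], str]                 # prompt -> completion

def daslam_style_solve(problem: str, decompose: Decomposer, solve: Solver) -> str:
    """Decompose-then-guide loop: the small decomposer LM reads the solver's
    first attempt, proposes easier subproblems, the solver answers each in
    turn, and the accumulated subanswers condition a final revised answer."""
    draft = solve(f"Q: {problem}\nA:")          # solver's unaided attempt
    context = ""
    for sub_q in decompose(problem, draft):
        sub_a = solve(f"{context}Q: {sub_q}\nA:")
        context += f"Q: {sub_q}\nA: {sub_a}\n"
    return solve(f"{context}Q: {problem}\nA:")  # final answer with guidance
```

At training time the decomposer is optimized with policy gradient so that its subquestions steer the solver toward the correct final answer; at inference only forward passes like the ones above are needed.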
Related papers
- OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning.
We propose OptiBench, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z) - Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization [0.6445087473595953]
Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning.
Deploying LLM inference, however, poses challenges due to its high compute and memory requirements.
We present Tender, an algorithm-hardware co-design solution that enables efficient deployment of LLM inference at low precision.
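The co-design itself is hardware-specific, but the core numeric trick, scales constrained to be powers of two apart so that runtime requantization becomes a bit shift, can be sketched generically (an illustrative scheme, not Tender's exact decomposition):

```python
import numpy as np

def quantize_by_channel_groups(x: np.ndarray, bits: int = 8, max_shift: int = 3):
    """Illustrative sketch: quiet channels get finer scales, loud channels
    coarser ones, with all scales constrained to be powers of two apart."""
    qmax = 2 ** (bits - 1) - 1
    amax = max(float(np.abs(x).max()), 1e-12)          # global magnitude
    base_scale = amax / qmax                           # loudest channel's scale
    ch_max = np.maximum(np.abs(x).max(axis=0), 1e-12)  # per-channel magnitude
    shifts = np.clip(np.floor(np.log2(amax / ch_max)), 0, max_shift).astype(int)
    scales = base_scale / (2 ** shifts)                # power-of-two-related scales
    q = np.clip(np.round(x / scales), -qmax - 1, qmax).astype(np.int8)
    return q, shifts

def requantize(q: np.ndarray, shifts: np.ndarray) -> np.ndarray:
    # Align every channel to the base scale using integer shifts only, so the
    # whole tensor can enter a single integer matmul with one shared scale.
    return q.astype(np.int32) >> shifts
```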
arXiv Detail & Related papers (2024-06-16T09:51:55Z) - Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models [79.46938238953916]
Fine-tuning large language models (LLMs) to diverse applications is crucial to meet complex demands.
Recent studies suggest decomposing a fine-tuned LLM into a base model and corresponding delta weights, which are then compressed using low-rank or low-bit approaches to reduce costs.
In this work, we observe that existing low-rank and low-bit compression methods can significantly harm the model performance for task-specific fine-tuned LLMs.
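For context, the low-rank baseline that this observation concerns can be sketched in a few lines (Delta-CoMe's own contribution, mixed bit widths across singular vectors, is deliberately omitted here):

```python
import numpy as np

def compress_delta_low_rank(w_finetuned: np.ndarray, w_base: np.ndarray, rank: int):
    """Keep only a rank-r factorization of the fine-tuning delta.
    Reconstruct the fine-tuned weight as w_base + a @ b."""
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # (d_out, r), singular values folded in
    b = vt[:rank]                # (r, d_in)
    return a, b
```

Per matrix, storage drops from O(d_out * d_in) for the raw delta to O(r * (d_out + d_in)); the paper's observation is that truncating too aggressively, in rank or in bit width, measurably hurts task-specific models.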
arXiv Detail & Related papers (2024-06-13T07:57:27Z) - $\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning [22.810441504080703]
Large Language Models (LLMs) often lose track of complex, multi-step reasoning.
This paper proposes LM2 to address these challenges.
LM2 modularizes the decomposition, solution, and verification into three different language models.
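A minimal sketch of such a three-model society, assuming plain text-in/text-out interfaces and illustrative prompts (the paper's actual prompting and control flow may differ):

```python
from typing import Callable

LM = Callable[[str], str]  # any text-in/text-out model

def lm2_style_pipeline(problem: str, decomposer: LM, solver: LM, verifier: LM,
                       max_retries: int = 2) -> str:
    """One LM proposes subquestions, one answers them, and one checks each
    answer, asking the solver to retry when a check fails."""
    subqs = decomposer(f"Break into subquestions:\n{problem}").splitlines()
    context = ""
    for sq in (s.strip() for s in subqs if s.strip()):
        for _ in range(max_retries + 1):
            ans = solver(f"{context}Q: {sq}\nA:")
            verdict = verifier(f"Question: {sq}\nAnswer: {ans}\nValid? yes/no:")
            if verdict.strip().lower().startswith("yes"):
                break  # accept the first answer the verifier endorses
        context += f"Q: {sq}\nA: {ans}\n"
    return solver(f"{context}Q: {problem}\nA:")
```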
arXiv Detail & Related papers (2024-04-02T19:23:10Z) - Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems [76.69936664916061]
We study how the number of LM calls affects the performance of Vote and Filter-Vote, two simple compound designs that aggregate LM responses by majority voting.
We find, surprisingly, that across multiple language tasks, the performance of both Vote and Filter-Vote can first increase but then decrease as a function of the number of LM calls.
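Both strategies are simple to state; in the sketch below the filter predicate is an assumed stand-in for whatever screening Filter-Vote applies:

```python
from collections import Counter
from typing import Callable, List

def vote(answers: List[str]) -> str:
    """Vote: majority answer over K independent LM calls."""
    return Counter(answers).most_common(1)[0][0]

def filter_vote(answers: List[str], keep: Callable[[str], bool]) -> str:
    """Filter-Vote: discard answers failing a check, then take the majority,
    falling back to a plain vote if the filter rejects everything."""
    kept = [a for a in answers if keep(a)]
    return vote(kept if kept else answers)
```

The non-monotonicity reported above has an intuitive reading: extra calls sharpen the majority on queries the model usually gets right, but on queries it usually gets wrong they entrench the wrong majority.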
arXiv Detail & Related papers (2024-03-04T19:12:48Z) - Divide-or-Conquer? Which Part Should You Distill Your LLM? [38.62667131299918]
We devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase.
We show that the strategy is able to outperform a single stage solution.
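A minimal sketch of distilling just the decomposition phase into a smaller student, with the prompt and data format as illustrative assumptions:

```python
from typing import Callable, List, Tuple

LM = Callable[[str], str]

def build_decomposition_distillation_set(problems: List[str],
                                         teacher: LM) -> List[Tuple[str, str]]:
    """Have a strong teacher write subquestions for each problem; the
    (problem, subquestions) pairs then fine-tune a much smaller student,
    while the solving phase can stay with a larger model."""
    return [(p, teacher(f"List the subquestions needed to solve:\n{p}"))
            for p in problems]
```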
arXiv Detail & Related papers (2024-02-22T22:28:46Z) - Frugal LMs Trained to Invoke Symbolic Solvers Achieve
Parameter-Efficient Arithmetic Reasoning [36.8749786658624]
Large Language Models (LLMs) exhibit zero-shot mathematical reasoning capacity as a behavior that emerges with scale.
We show that small LMs can achieve reasonable arithmetic reasoning if arithmetic word problems are posed as a formalize-then-solve task.
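A minimal sketch of the formalize-then-solve pattern, with SymPy standing in as the symbolic solver and a hand-written equation standing in for what the small LM would emit:

```python
import sympy

def solve_formalization(formalized: str):
    """Hand the LM's formal output to a symbolic solver; the LM only needs
    to translate words into an equation, not to do arithmetic itself."""
    x = sympy.Symbol("x")
    return sympy.solve(sympy.sympify(formalized), x)

# "John fills 3 bags with x marbles each, keeps 5 loose, and has 20 in total."
# A formalize-then-solve LM would emit something like:
print(solve_formalization("Eq(3*x + 5, 20)"))  # [5]
```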
arXiv Detail & Related papers (2023-12-09T13:20:49Z) - Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z) - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
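A minimal sketch of the declarative pattern with the Z3 SMT solver; the toy constraints stand in for a specification the LLM would generate:

```python
from z3 import Int, Solver, sat

# Declarative specification for a toy word problem ("Alice is 3 years older
# than Bob; their ages sum to 21"): constraints only, no solution procedure.
alice, bob = Int("alice"), Int("bob")
s = Solver()
s.add(alice == bob + 3, alice + bob == 21)

if s.check() == sat:
    m = s.model()
    print(m[alice], m[bob])  # 12 9
```

Because the specification is declarative, an inconsistent set of constraints surfaces as an unsat verdict from the solver rather than as a silently wrong answer.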
arXiv Detail & Related papers (2023-05-16T17:55:51Z) - Is a Question Decomposition Unit All We Need? [20.66688303609522]
We investigate whether humans can decompose a hard question into a set of simpler questions that are relatively easier for models to solve.
We analyze a range of datasets involving various forms of reasoning and find that it is indeed possible to significantly improve model performance.
Our findings indicate that Human-in-the-loop Question Decomposition (HQD) can potentially provide an alternate path to building large LMs.
arXiv Detail & Related papers (2022-05-25T07:24:09Z)