Frugal LMs Trained to Invoke Symbolic Solvers Achieve
Parameter-Efficient Arithmetic Reasoning
- URL: http://arxiv.org/abs/2312.05571v2
- Date: Tue, 19 Dec 2023 17:48:20 GMT
- Title: Frugal LMs Trained to Invoke Symbolic Solvers Achieve
Parameter-Efficient Arithmetic Reasoning
- Authors: Subhabrata Dutta, Joykirat Singh, Ishan Pandey, Sunny Manchanda,
Soumen Chakrabarti, Tanmoy Chakraborty
- Abstract summary: Large Language Models (LLMs) exhibit zero-shot mathematical reasoning capacity as a behavior emergent with scale.
We show that small LMs can achieve reasonable arithmetic reasoning if arithmetic word problems are posed as a formalize-then-solve task.
- Score: 36.8749786658624
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) exhibit zero-shot mathematical reasoning
capacity as a behavior emergent with scale, commonly manifesting as
chain-of-thought (CoT) reasoning. However, multiple empirical findings suggest
that this prowess is exclusive to LLMs of exorbitant size (beyond 50 billion parameters).
Meanwhile, educational neuroscientists suggest that symbolic algebraic
manipulation be introduced around the same time as arithmetic word problems to
modularize language-to-formulation, symbolic manipulation of the formulation,
and endgame arithmetic. In this paper, we start with the hypothesis that much
smaller LMs, which are weak at multi-step reasoning, can achieve reasonable
arithmetic reasoning if arithmetic word problems are posed as a
formalize-then-solve task. In our architecture, which we call SYRELM, the LM
serves as a translator, mapping natural language arithmetic questions
into a formal language (FL) description. A symbolic solver then evaluates the
FL expression to obtain the answer. A small frozen LM, equipped with an
efficient low-rank adapter, is capable of generating FL expressions that
incorporate natural language descriptions of the arithmetic problem (e.g.,
variable names and their purposes, formal expressions combining variables,
etc.). We adopt policy-gradient reinforcement learning to train the adapted LM,
informed by the non-differentiable symbolic solver. This marks a sharp
departure from recent developments in tool-augmented LLMs, in which the
external tools (e.g., calculator, Web search, etc.) are essentially detached
from the learning phase of the LM. SYRELM shows massive improvements (e.g., a
+30.65 absolute-point accuracy improvement on the SVAMP dataset with the GPT-J
6B model) over the base LMs, while keeping our testbed easy to diagnose and
interpret, and within reach of most researchers.
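To make the formalize-then-solve split concrete, below is a minimal sketch of the pipeline the abstract describes. The FL syntax, the variable names, the toy evaluator, and the reward shaping are all illustrative assumptions; the abstract does not specify SYRELM's actual grammar, solver, or training details.

```python
# Illustrative formalize-then-solve pipeline in the spirit of SYRELM.
# The FL grammar below is a guess; the paper defines its own formal language.
import re

# Hypothetical FL output an adapted LM might emit for:
# "Sam had 9 apples. He gave 4 to Mary. How many apples does Sam have now?"
FL_PROGRAM = """
var initial_apples = 9      # apples Sam starts with
var apples_given = 4        # apples given to Mary
ans = initial_apples - apples_given
"""

def symbolic_solve(program: str) -> float:
    """Toy symbolic solver: bind variables, then evaluate the 'ans' expression."""
    env = {}
    for line in program.strip().splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line:
            continue
        m = re.match(r"(?:var\s+)?(\w+)\s*=\s*(.+)", line)
        name, expr = m.group(1), m.group(2)
        # eval() restricted to previously bound variables only (no builtins)
        env[name] = eval(expr, {"__builtins__": {}}, env)
    return env["ans"]

pred = symbolic_solve(FL_PROGRAM)
gold = 5.0

# REINFORCE-style signal: the solver is non-differentiable, so its verdict
# reaches the LM only as a scalar reward on the sampled FL program.
reward = 1.0 if abs(pred - gold) < 1e-6 else 0.0
# loss = -reward * sum(log_probs_of_sampled_FL_tokens)   # policy gradient
print(pred, reward)
```

The point of the sketch is the training signal: since the solver cannot be backpropagated through, its verdict enters training only as a reward on sampled programs, which is exactly where policy-gradient methods apply.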
Related papers
- Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures [3.181878085746691]
Large language models (LLMs) have demonstrated remarkable mathematical capabilities, largely driven by chain-of-thought (CoT) prompting.
We propose that LLMs learn arithmetic by capturing algebraic structures, such as Commutativity and Identity properties.
Our findings indicate that leveraging algebraic structures can enhance the LLMs' arithmetic capabilities, offering insights into improving their arithmetic performance.
arXiv Detail & Related papers (2024-11-25T10:23:11Z)
- Language Models are Symbolic Learners in Arithmetic [8.34588487873447]
Large Language Models (LLMs) are thought to struggle with arithmetic learning due to inherent differences between language modeling and numerical computation.
We first investigate whether LLMs leverage partial products during arithmetic learning.
We find that although LLMs can identify some partial products after learning, they fail to leverage them for arithmetic tasks.
arXiv Detail & Related papers (2024-10-21T01:57:16Z)
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers [68.77382332826167]
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
One essential and frequently observed piece of evidence is that LLMs can behave incorrectly when the math questions are slightly changed.
This motivates us to evaluate the robustness of LLMs' math reasoning capability by testing a wide range of question variations.
arXiv Detail & Related papers (2024-02-29T15:26:14Z)
- Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z)
- Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving [23.426866969743525]
Generative large language models (LLMs) with instruct training can generate human-like responses to prompts.
Despite their improved accuracy, these models are still known to produce factually incorrect or contextually inappropriate results.
This limitation makes it difficult to use these models to synthesize formal artifacts that are used in safety-critical applications.
arXiv Detail & Related papers (2023-09-28T13:40:50Z)
- Learning Multi-Step Reasoning by Solving Arithmetic Tasks [6.398022050054328]
This work investigates how to equip relatively small Language Models with multi-step reasoning capabilities.
We propose to inject such abilities by continually pre-training LMs on a synthetic dataset MsAT.
Our experiments on four math word problem datasets show the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-06-02T17:29:22Z)
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program, and leverage an off-the-shelf automated theorem prover to derive the final answer (see the sketch at the end of this list).
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
- Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs).
ALMs retain the standard missing-token prediction objective and learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)
- PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PAL), which use an LLM to read natural language problems and generate programs as intermediate reasoning steps.
PAL offloads the solution step to a programmatic runtime such as a Python interpreter (see the sketch at the end of this list).
PAL sets new state-of-the-art results on all 12 benchmarks.
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
- Limitations of Language Models in Arithmetic and Symbolic Induction [20.49118435604774]
Large pretrained Language Models (LMs) can perform remarkably well on a range of Natural Language Processing (NLP) tasks.
We find that these models have limitations on certain basic symbolic manipulation tasks such as copy, reverse, and addition.
We investigate the potential causes behind this phenomenon and examine a set of possible methods, including explicit positional markers, fine-grained computation steps, and LMs with callable programs.
arXiv Detail & Related papers (2022-08-09T21:47:01Z)
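The PAL and SatLM entries above describe two opposite ways of delegating the solution step to a tool: PAL has the LM emit an imperative program whose execution is the answer, while SatLM has it emit a declarative specification that an off-the-shelf solver answers. Below is a minimal sketch of that contrast on a toy problem of our own choosing; the z3 usage stands in for SatLM's theorem prover, and neither snippet is taken from the papers.

```python
# Illustrative contrast between the imperative and declarative paradigms.
# Problem: "The sum of two numbers is 19 and their difference is 5.
#           Find the larger number."

# --- PAL-style: imperative program, run by a Python interpreter ---
def pal_style() -> int:
    total, diff = 19, 5
    larger = (total + diff) // 2   # the model spells out the solution steps
    return larger

# --- SatLM-style: declarative spec, answered by a solver (here z3) ---
def satlm_style() -> int:
    from z3 import Ints, Solver, sat
    x, y = Ints("x y")
    s = Solver()
    s.add(x + y == 19, x - y == 5, x > y)  # only constraints, no solution steps
    assert s.check() == sat
    return s.model()[x].as_long()

print(pal_style(), satlm_style())  # both print 12
```

The trade-off mirrors the papers' framing: the imperative style requires the LM to get the solution procedure right, while the declarative style only requires it to state the constraints correctly and leaves the search to the solver.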