Limitations of Language Models in Arithmetic and Symbolic Induction
- URL: http://arxiv.org/abs/2208.05051v1
- Date: Tue, 9 Aug 2022 21:47:01 GMT
- Title: Limitations of Language Models in Arithmetic and Symbolic Induction
- Authors: Jing Qian, Hong Wang, Zekun Li, Shiyang Li, Xifeng Yan
- Abstract summary: Large pretrained Language Models (LMs) can perform remarkably well on a range of Natural Language Processing (NLP) tasks.
We find that these models have limitations on certain basic symbolic manipulation tasks such as copy, reverse, and addition.
We investigate the potential causes behind this phenomenon and examine a set of possible methods, including explicit positional markers, fine-grained computation steps, and LMs with callable programs.
- Score: 20.49118435604774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that large pretrained Language Models (LMs) can not
only perform remarkably well on a range of Natural Language Processing (NLP)
tasks but also start improving on reasoning tasks such as arithmetic induction,
symbolic manipulation, and commonsense reasoning with increasing size of
models. However, it is still unclear what the underlying capabilities of these
LMs are. Surprisingly, we find that these models have limitations on certain
basic symbolic manipulation tasks such as copy, reverse, and addition. When the
total number of symbols or repeating symbols increases, the model performance
drops quickly. We investigate the potential causes behind this phenomenon and
examine a set of possible methods, including explicit positional markers,
fine-grained computation steps, and LMs with callable programs. Experimental
results show that none of these techniques can solve the simplest addition
induction problem completely. Finally, we introduce LMs with tutor, which
demonstrates every single step of teaching. LMs with tutor delivers 100%
accuracy in OOD and repeating-symbol settings, shedding new light on the
boundary of large LMs in induction.
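As a rough illustration of two of the remedies examined above, the Python sketch below shows what explicit positional markers and fine-grained computation steps could look like for the addition task. The prompt formats, function names, and digit-tagging scheme are illustrative assumptions, not the paper's exact setup.

```python
# Illustrative sketch only (hypothetical formatting, not the paper's prompts).
def with_positional_markers(number: str) -> str:
    """Tag each digit with an explicit position index (least significant = 0)."""
    digits = list(number)
    return " ".join(f"{d}_p{len(digits) - 1 - i}" for i, d in enumerate(digits))

def fine_grained_steps(a: str, b: str) -> list:
    """Spell out column-by-column addition with explicit carries, one step per line."""
    steps, carry = [], 0
    a, b = a.zfill(len(b)), b.zfill(len(a))  # pad both operands to a common length
    for da, db in zip(reversed(a), reversed(b)):
        s = int(da) + int(db) + carry
        steps.append(f"{da} + {db} + carry {carry} -> digit {s % 10}, carry {s // 10}")
        carry = s // 10
    if carry:
        steps.append(f"final carry {carry}")
    return steps

print(with_positional_markers("365"))               # 3_p2 6_p1 5_p0
print("\n".join(fine_grained_steps("365", "249")))  # column-wise steps for 365 + 249
```

Both representations spell out in the text the position and carry information that a plain digit string leaves implicit for the model to infer.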
Related papers
- Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures [3.181878085746691]
Large language models (LLMs) have demonstrated remarkable mathematical capabilities, largely driven by chain-of-thought (CoT) prompting.
We propose that LLMs learn arithmetic by capturing algebraic structures, such as the Commutativity and Identity properties.
Our findings indicate that leveraging algebraic structures can enhance the LLMs' arithmetic capabilities, offering insights into improving their arithmetic performance.
arXiv Detail & Related papers (2024-11-25T10:23:11Z) - Interpreting and Improving Large Language Models in Arithmetic Calculation [72.19753146621429]
Large language models (LLMs) have demonstrated remarkable potential across numerous applications.
In this work, we delve into uncovering a specific mechanism by which LLMs execute calculations.
We investigate the potential benefits of selectively fine-tuning these essential heads/MLPs to boost the LLMs' computational performance.
arXiv Detail & Related papers (2024-09-03T07:01:46Z) - An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models [99.31449616860291]
Modern language models (LMs) can learn to perform new tasks in different ways.
In instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly.
In instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description.
arXiv Detail & Related papers (2024-04-03T19:31:56Z) - Reverse That Number! Decoding Order Matters in Arithmetic Learning [49.5504492920404]
Our work introduces a novel strategy that reevaluates the digit order by prioritizing output from the least significant digit.
Compared to the previous state-of-the-art (SOTA) method, our findings reveal an overall improvement in accuracy while requiring only a third of the tokens typically used during training.
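A minimal sketch of the reversed-digit idea summarized above, assuming a plain string encoding: generating the sum least-significant digit first keeps each carry local to digits already produced. The helper names are hypothetical, not the paper's pipeline.

```python
# Hedged sketch of least-significant-digit-first targets (not the paper's pipeline).
def reversed_digit_target(a: int, b: int) -> str:
    """Render the sum with digits reversed, e.g. 365 + 249 = 614 -> '416'."""
    return str(a + b)[::-1]

def decode(reversed_digits: str) -> int:
    """Map a reversed prediction back to the usual reading order."""
    return int(reversed_digits[::-1])

target = reversed_digit_target(365, 249)  # '416': each digit's carry is already determined
assert decode(target) == 614
```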
arXiv Detail & Related papers (2024-03-09T09:04:53Z) - Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning [36.8749786658624]
Large Language Models (LLMs) exhibit zero-shot mathematical reasoning capacity as a behavior emergent with scale.
We show that small LMs can achieve reasonable arithmetic reasoning if arithmetic word problems are posed as a formalize-then-solve task.
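A hedged sketch of a formalize-then-solve setup: the small LM would only emit a formal arithmetic expression, and an external symbolic evaluator computes the answer. The expression format and the ast-based evaluator below are illustrative stand-ins, not the paper's implementation.

```python
# Hedged sketch: the LM only formalizes; a symbolic evaluator (a safe ast walk
# here, standing in for a real solver) does the arithmetic.
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def solve(expression: str) -> float:
    """Evaluate an arithmetic expression emitted by the LM."""
    def ev(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError(f"unsupported expression node: {node!r}")
    return ev(ast.parse(expression, mode="eval").body)

# Hypothetical LM output for "Tom buys 3 boxes of 12 apples and eats 5 of them":
lm_output = "3 * 12 - 5"
print(solve(lm_output))  # 31
```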
arXiv Detail & Related papers (2023-12-09T13:20:49Z) - Learning Multi-Step Reasoning by Solving Arithmetic Tasks [6.398022050054328]
This work investigates how to equip relatively small Language Models with multi-step reasoning capabilities.
We propose to inject such abilities by continually pre-training LMs on a synthetic dataset MsAT.
Our experiments on four math word problem datasets show the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-06-02T17:29:22Z) - Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs).
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z) - Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z) - oLMpics -- On what Language Model Pre-training Captures [84.60594612120173]
We propose eight reasoning tasks, which require operations such as comparison, conjunction, and composition.
A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.
arXiv Detail & Related papers (2019-12-31T12:11:35Z)