Case-Based or Rule-Based: How Do Transformers Do the Math?
- URL: http://arxiv.org/abs/2402.17709v2
- Date: Wed, 26 Jun 2024 09:25:07 GMT
- Title: Case-Based or Rule-Based: How Do Transformers Do the Math?
- Authors: Yi Hu, Xiaojuan Tang, Haotong Yang, Muhan Zhang
- Abstract summary: We study whether transformers use rule-based or case-based reasoning for math problems.
We provide explicit rules in the input and then instruct transformers to recite and follow the rules step by step.
The significant improvement demonstrates that teaching LLMs to use rules explicitly helps them learn rule-based reasoning and generalize better in length.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the impressive performance in a variety of complex tasks, modern large language models (LLMs) still have trouble dealing with some math problems that are simple and intuitive for humans, such as addition. While we can easily learn basic rules of addition and apply them to new problems of any length, LLMs struggle to do the same. Instead, they may rely on similar cases seen in the training corpus for help. We define these two different reasoning mechanisms as "rule-based reasoning" and "case-based reasoning". Since rule-based reasoning is essential for acquiring systematic generalization ability, we aim to explore exactly whether transformers use rule-based or case-based reasoning for math problems. Through carefully designed intervention experiments on five math tasks, we confirm that transformers are performing case-based reasoning, no matter whether scratchpad is used, which aligns with the previous observations that transformers use subgraph matching/shortcut learning to reason. To mitigate such problems, we propose a Rule-Following Fine-Tuning (RFFT) technique to teach transformers to perform rule-based reasoning. Specifically, we provide explicit rules in the input and then instruct transformers to recite and follow the rules step by step. Through RFFT, we successfully enable LLMs fine-tuned on 1-5 digit addition to generalize to up to 12-digit addition with over 95% accuracy, which is over 40% higher than scratchpad. The significant improvement demonstrates that teaching LLMs to use rules explicitly helps them learn rule-based reasoning and generalize better in length.
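To make the RFFT idea concrete, here is a minimal, hypothetical sketch of how a rule-following fine-tuning target for addition might be generated: the explicit rules are stated up front, and the trace recites and applies them column by column. The rule wording, trace format, and the `rule_following_trace` helper are illustrative assumptions; the paper does not specify its exact prompt or trace layout here.

```python
# Illustrative sketch of an RFFT-style training trace for addition.
# The rules are given explicitly, and the target output recites and
# follows them step by step (hypothetical format, not the authors' exact one).

RULES = (
    "Rule 1: Align the two numbers on the right and set carry = 0.\n"
    "Rule 2: For each column from right to left, compute digit_a + digit_b + carry.\n"
    "Rule 3: Write down (sum mod 10); the new carry is (sum // 10).\n"
    "Rule 4: After the last column, if carry is 1, prepend it to the result.\n"
)

def rule_following_trace(a: int, b: int) -> str:
    """Generate a step-by-step trace that applies RULES explicitly."""
    da, db = str(a)[::-1], str(b)[::-1]          # reversed digit strings
    width = max(len(da), len(db))
    carry, out_digits, steps = 0, [], []
    steps.append(f"Applying Rule 1: align {a} and {b}, carry = 0.")
    for i in range(width):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        s = x + y + carry
        steps.append(
            f"Applying Rules 2-3 to column {i}: {x} + {y} + carry {carry} = {s}; "
            f"write {s % 10}, carry = {s // 10}."
        )
        out_digits.append(str(s % 10))
        carry = s // 10
    if carry:
        steps.append("Applying Rule 4: final carry is 1, prepend it.")
        out_digits.append("1")
    result = "".join(reversed(out_digits))
    steps.append(f"Answer: {result}")
    return RULES + "\n".join(steps)

# One (prompt, target) fine-tuning pair could then look like:
# prompt = RULES + "\nCompute 987 + 45 by reciting and following the rules."
# target = rule_following_trace(987, 45)
```

Because the trace is generated by the rules themselves, it stays correct for arbitrary lengths, which is what lets a model fine-tuned on 1-5 digit examples be tested on much longer additions.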
Related papers
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers [68.77382332826167]
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
One frequently observed piece of evidence, however, is that LLMs can answer incorrectly when the math questions are slightly changed.
This motivates us to evaluate the robustness of LLMs' math reasoning capability by testing a wide range of question variations.
arXiv Detail & Related papers (2024-02-29T15:26:14Z) - Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z) - LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models [63.14196038655506]
We introduce LogicAsker, a novel approach for evaluating and enhancing the logical reasoning capabilities of large language models (LLMs).
Our methodology reveals significant gaps in LLMs' learning of logical rules, with identified reasoning failures ranging from 29% to 90% across different models.
We leverage these findings to construct targeted demonstration examples and fine-tune data, notably enhancing logical reasoning in models like GPT-4o by up to 5%.
arXiv Detail & Related papers (2024-01-01T13:53:53Z) - Large Language Models can Learn Rules [106.40747309894236]
We present Hypotheses-to-Theories (HtT), a framework that learns a rule library for reasoning with large language models (LLMs).
Experiments on relational reasoning, numerical reasoning and concept learning problems show that HtT improves existing prompting methods.
The learned rules are also transferable to different models and to different forms of the same problem.
arXiv Detail & Related papers (2023-10-10T23:07:01Z) - Learning Reliable Logical Rules with SATNet [7.951021955925275]
We build on SATNet, a differentiable MaxSAT solver that learns the underlying rules from input-output examples.
We introduce several effective verification techniques to validate it against the ground truth rules.
Experiments on stream transformations and Sudoku problems show that our decoded rules are highly reliable.
arXiv Detail & Related papers (2023-10-03T15:14:28Z) - ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs.
Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs.
To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z) - Automating Defeasible Reasoning in Law [0.0]
We study defeasible reasoning in rule-based systems, in particular about legal norms and contracts.
We identify rule modifiers that specify how rules interact and how they can be overridden.
We then define rule transformations that eliminate these modifiers, leading to a translation of rules to formulas.
arXiv Detail & Related papers (2022-05-15T17:14:15Z) - RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs [91.71504177786792]
This paper studies learning logic rules for reasoning on knowledge graphs.
Logic rules provide interpretable explanations when used for prediction as well as being able to generalize to other tasks.
Existing methods either suffer from the problem of searching in a large search space or ineffective optimization due to sparse rewards.
arXiv Detail & Related papers (2020-10-08T14:47:02Z) - Transformers as Soft Reasoners over Language [33.291806251021185]
This paper investigates a problem where the facts and rules are provided as natural language sentences, thus bypassing a formal representation.
We train transformers to reason (or emulate reasoning) over these sentences using synthetically generated data.
Our models, that we call RuleTakers, provide the first empirical demonstration that this kind of soft reasoning over language is learnable.
arXiv Detail & Related papers (2020-02-14T04:23:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.