RevOrder: A Novel Method for Enhanced Arithmetic in Language Models
- URL: http://arxiv.org/abs/2402.03822v2
- Date: Sat, 24 Feb 2024 01:11:14 GMT
- Title: RevOrder: A Novel Method for Enhanced Arithmetic in Language Models
- Authors: Si Shen, Peijun Shen, Danhao Zhu
- Abstract summary: RevOrder reverses the output digits in addition, subtraction, and n-digit by 1-digit (nD by 1D) multiplication tasks.
Our method significantly reduces the Count of Sequential Intermediate Digits (CSID) to $\mathcal{O}(1)$.
- Score: 0.9043578619916238
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents RevOrder, a novel technique aimed at improving arithmetic
operations in large language models (LLMs) by reversing the output digits in
addition, subtraction, and n-digit by 1-digit (nD by 1D) multiplication tasks.
Our method significantly reduces the Count of Sequential Intermediate Digits
(CSID) to $\mathcal{O}(1)$, a new metric we introduce to assess equation
complexity. Through comprehensive testing, RevOrder not only achieves perfect
accuracy in basic arithmetic operations but also substantially boosts LLM
performance in division tasks, particularly with large numbers where
traditional models struggle. Implementation of RevOrder is cost-effective for
both training and inference phases. Moreover, applying RevOrder to fine-tune
the LLaMA2-7B model on the GSM8K math task results in a considerable
improvement, reducing equation calculation errors by 46% and increasing overall
scores from 41.6 to 44.4.
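As a concrete illustration of the reversed-output idea, here is a minimal sketch in Python; the equals-sign layout and absence of any special reversal marker are assumptions, not the paper's exact token format:

```python
def rev_order_example(a: int, b: int) -> str:
    """Render an addition problem with the answer digits reversed
    (least-significant digit first), so each digit can be emitted as
    soon as its carry is resolved. Illustrative sketch only; the
    paper's exact token layout may differ."""
    reversed_answer = str(a + b)[::-1]   # 867 + 975 = 1842 -> "2481"
    return f"{a}+{b}={reversed_answer}"

print(rev_order_example(867, 975))  # 867+975=2481, i.e. 1842 read right-to-left
```

At inference time the generated digits are simply reversed back to recover the conventional answer, so the transformation is lossless.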
Related papers
- Enhancing Mathematical Reasoning in LLMs by Stepwise Correction [39.67266805233599]
Best-of-N decoding methods instruct large language models (LLMs) to generate multiple solutions, score each with a scoring function, and select the highest-scoring one as the final answer to a mathematical reasoning problem.
We propose a novel prompting method named Stepwise Correction (StepCo) that helps LLMs identify and revise incorrect steps in their generated reasoning paths.
The verify-then-revise process not only improves answer correctness but also reduces token consumption, since fewer paths need to be generated.
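A hedged sketch of such a verify-then-revise loop, assuming hypothetical `verify_step` and `revise_step` callables that wrap LLM calls; this is not StepCo's published interface:

```python
from typing import Callable, List

def stepwise_correction(
    steps: List[str],
    verify_step: Callable[[List[str], str], bool],  # hypothetical LLM verifier
    revise_step: Callable[[List[str], str], str],   # hypothetical LLM reviser
    max_rounds: int = 3,
) -> List[str]:
    """Verify each reasoning step in order; revise the first step that
    fails verification, then re-check from the start. Stops when every
    step passes or the round budget is exhausted. Illustrative only."""
    for _ in range(max_rounds):
        revised = False
        for i, step in enumerate(steps):
            if not verify_step(steps[:i], step):
                steps[i] = revise_step(steps[:i], step)
                revised = True
                break
        if not revised:
            break
    return steps
```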
arXiv Detail & Related papers (2024-10-16T18:18:42Z)
- BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search [22.672130194493793]
Large Language Models (LLMs) have exhibited exceptional performance across a broad range of tasks and domains.
They still encounter difficulties in solving mathematical problems due to the rigorous and logical nature of mathematics.
We propose a novel approach, BEATS, to enhance mathematical problem-solving abilities.
arXiv Detail & Related papers (2024-09-26T15:47:42Z)
- Improve Mathematical Reasoning in Language Models by Automated Process Supervision [22.72856086318912]
We propose a novel Monte Carlo Tree Search (MCTS) algorithm named OmegaPRM for the efficient collection of high-quality process supervision data.
We are able to collect over 1.5 million process supervision annotations to train a Process Reward Model (PRM).
We have enhanced the instruction tuned Gemini Pro model's math reasoning performance, achieving a 69.4% success rate on the MATH benchmark.
arXiv Detail & Related papers (2024-06-05T19:25:40Z)
- Reverse That Number! Decoding Order Matters in Arithmetic Learning [49.5504492920404]
Our work introduces a novel strategy that reevaluates the digit order by prioritizing output from the least significant digit.
Compared to the previous state-of-the-art (SOTA) method, our findings reveal an overall improvement in accuracy while requiring only a third of the tokens typically used during training.
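The benefit of least-significant-digit-first decoding comes from carry propagation: each output digit then depends only on digits already seen. A small worked sketch (illustrative, not the paper's code):

```python
def add_lsd_first(a: int, b: int) -> str:
    """Produce the digits of a + b least-significant first. Each emitted
    digit needs only the current input digits and the running carry,
    so a left-to-right decoder never has to anticipate future carries."""
    xs, ys = str(a)[::-1], str(b)[::-1]
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        total = (int(xs[i]) if i < len(xs) else 0) \
              + (int(ys[i]) if i < len(ys) else 0) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(digits)

assert add_lsd_first(867, 975) == "2481"  # the digits of 1842, reversed
```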
arXiv Detail & Related papers (2024-03-09T09:04:53Z)
- Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning [53.2491163874712]
We use equations as IMRs to solve the numerical reasoning task.
We present a method called Boosting Numerical Reasoning by Decomposing the Generation of Equations (Bridge).
Our method improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively.
arXiv Detail & Related papers (2023-08-21T09:35:33Z)
- Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning [75.74103236299477]
Chain-of-thought (CoT) prompting and tool augmentation have been validated as effective practices for improving large language models.
We propose a new approach that can deliberate over the reasoning steps with tool interfaces, namely DELI.
Experimental results on CARP and six other datasets show that the proposed DELI mostly outperforms competitive baselines.
arXiv Detail & Related papers (2023-06-04T17:02:59Z)
- Towards Constituting Mathematical Structures for Learning to Optimize [101.80359461134087]
Learning to Optimize (L2O), a technique that uses machine learning to learn an optimization algorithm automatically from data, has gained increasing attention in recent years.
A generic L2O approach parameterizes the iterative update rule and learns the update direction as a black-box network.
While the generic approach is widely applicable, the learned model can overfit and may not generalize well to out-of-distribution test sets.
We propose a novel L2O model with a mathematics-inspired structure that is broadly applicable and generalizes well to out-of-distribution problems.
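For intuition, the contrast can be sketched as follows: a generic L2O update feeds the gradient through an unconstrained learned network, whereas a structure-aware variant restricts the learned part to, say, a per-coordinate preconditioner. The toy form below is an assumption for illustration, not the paper's model:

```python
import numpy as np

def structured_l2o_update(params: np.ndarray, grad: np.ndarray,
                          learned_scale: np.ndarray) -> np.ndarray:
    """Math-inspired L2O step: the learned component is constrained to a
    nonnegative per-coordinate step size applied to the true gradient,
    mirroring classical preconditioned gradient descent (assumed toy form)."""
    return params - learned_scale * grad

# Toy usage on f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.array([3.0, -2.0])
scale = np.array([0.5, 0.5])  # stands in for learned per-coordinate step sizes
for _ in range(10):
    x = structured_l2o_update(x, x, scale)
print(x)  # shrinks toward the optimum at the origin
```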
arXiv Detail & Related papers (2023-05-29T19:37:28Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Learning Division with Neural Arithmetic Logic Modules [2.019622939313173]
We show that robustly learning division in a systematic manner remains a challenge even at the simplest level of dividing two numbers.
We propose two novel approaches for division, which we call the Neural Reciprocal Unit (NRU) and the Neural Multiplicative Reciprocal Unit (NMRU).
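Loosely in the spirit of such modules, and only as an assumed toy formulation rather than the published NRU/NMRU equations, a division-capable unit can exponentiate input magnitudes with weights that train toward -1, 0, or +1:

```python
import numpy as np

def signed_power_unit(x: np.ndarray, w: np.ndarray) -> float:
    """Toy division-capable unit: y = prod_i sign(x_i)^round(|w_i|) * |x_i|^w_i.
    A weight near +1 multiplies by x_i, near -1 divides by x_i, and near 0
    ignores x_i; real-valued weights keep the unit trainable by gradient
    descent. Assumed formulation, not the exact NRU/NMRU definition."""
    eps = 1e-8                                 # guard against division by zero
    mags = np.abs(x) + eps
    signs = np.sign(x) ** np.round(np.abs(w))  # reapply the signs of used inputs
    return float(np.prod(signs * mags ** w))

print(signed_power_unit(np.array([6.0, 3.0]), np.array([1.0, -1.0])))   # ~2.0  (6 / 3)
print(signed_power_unit(np.array([-6.0, 3.0]), np.array([1.0, -1.0])))  # ~-2.0
```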
arXiv Detail & Related papers (2021-10-11T11:56:57Z)
- Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing [49.73889315176884]
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
We build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning.
arXiv Detail & Related papers (2021-05-18T05:23:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.