Arithmetic Reasoning with LLM: Prolog Generation & Permutation
- URL: http://arxiv.org/abs/2405.17893v1
- Date: Tue, 28 May 2024 07:13:25 GMT
- Title: Arithmetic Reasoning with LLM: Prolog Generation & Permutation
- Authors: Xiaocheng Yang, Bingsen Chen, Yik-Cheung Tam
- Abstract summary: We show that Prolog-based arithmetic problem-solving outperforms CoT generation in the GSM8K benchmark.
We propose to permute the ground truth predicates for more robust LLM training via data augmentation.
- Score: 2.1867261071129125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instructing large language models (LLMs) to solve elementary school math problems has shown great success using Chain of Thought (CoT). However, the CoT approach relies on an LLM to generate a sequence of arithmetic calculations, which can be prone to cascaded calculation errors. We hypothesize that an LLM should focus on extracting predicates and generating symbolic formulas from the math problem description so that the underlying calculation can be done via an external code interpreter. We investigate using an LLM to generate Prolog programs to solve mathematical questions. Experimental results show that our Prolog-based arithmetic problem-solving outperforms CoT generation in the GSM8K benchmark across three distinct LLMs. In addition, because Prolog is insensitive to the ordering of predicates and symbolic formulas, we propose to permute the ground truth predicates for more robust LLM training via data augmentation.
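The permutation idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `permute_clauses`, its parameters, and the toy Prolog program are all assumptions made for this example. It reorders only top-level clauses, because goals inside a rule body that use `is/2` generally need their inputs bound first and so are not freely reorderable in standard Prolog.

```python
import random


def permute_clauses(clauses, n_variants=3, seed=0):
    """Return up to n_variants reorderings of top-level Prolog clauses.

    The original order is never returned, and duplicates are skipped.
    """
    rng = random.Random(seed)
    seen = {tuple(clauses)}  # exclude the original ordering
    variants = []
    # Bounded number of attempts so the loop always terminates,
    # even when few distinct orderings exist.
    for _ in range(20 * n_variants):
        if len(variants) == n_variants:
            break
        order = tuple(rng.sample(clauses, k=len(clauses)))
        if order not in seen:
            seen.add(order)
            variants.append("\n".join(order))
    return variants


# Hypothetical target program for a toy word problem (not from the paper).
program = [
    "tom_apples(5).",
    "ann_apples(3).",
    "total(T) :- tom_apples(A), ann_apples(B), T is A + B.",
]

for variant in permute_clauses(program):
    print(variant)
    print("---")
```

Each variant contains the same clauses as the original program, so a Prolog interpreter would derive the same answer for `total(T)`. Pairing every variant with the same question text is one plausible way to build the augmented training set.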
Related papers
- Graph-Structured Speculative Decoding [52.94367724136063]
Speculative decoding has emerged as a promising technique to accelerate the inference of Large Language Models.
We introduce an innovative approach utilizing a directed acyclic graph (DAG) to manage the drafted hypotheses.
We observe a remarkable speedup of 1.73$\times$ to 1.96$\times$, significantly surpassing standard speculative decoding.
arXiv Detail & Related papers (2024-07-23T06:21:24Z) - On the Design and Analysis of LLM-Based Algorithms [74.7126776018275]
Large language models (LLMs) are used as sub-routines in algorithms.
LLMs have achieved remarkable empirical success.
Our framework holds promise for advancing LLM-based algorithms.
To promote further study of LLM-based algorithms, we release our source code at https://github.com/modelscope/agentscope/tree/main/examples/paper_llm_based_algorithm.
arXiv Detail & Related papers (2024-07-20T07:39:07Z) - OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step [7.7168728919692855]
Large Language Models (LLMs) still face challenges in accurately performing complex arithmetic operations.
We propose a framework that enables exact arithmetic in a single autoregressive step, providing faster, more secure, and more interpretable LLM systems.
arXiv Detail & Related papers (2024-06-04T04:17:40Z) - Code Simulation Challenges for Large Language Models [6.970495767499435]
This work studies to what extent Large Language Models (LLMs) can simulate coding and algorithmic tasks.
We introduce benchmarks for straight-line programs, code that contains critical paths, and approximate and redundant instructions.
We propose a novel off-the-shelf prompting method, Chain of Simulation (CoSm), which instructs LLMs to simulate code execution line by line/follow the pattern of compilers.
arXiv Detail & Related papers (2024-01-17T09:23:59Z) - LPML: LLM-Prompting Markup Language for Mathematical Reasoning [8.995617701116142]
We propose a novel framework that integrates the Chain-of-Thought (CoT) method with an external tool (Python REPL).
Our approach enables LLMs to write the markup language and perform advanced mathematical reasoning using only zero-shot prompting.
arXiv Detail & Related papers (2023-09-21T02:46:20Z) - LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying [71.86163159193327]
Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text.
This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to pattern completion.
We introduce LaGR, which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent.
arXiv Detail & Related papers (2023-08-21T02:07:35Z) - Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning [101.26814728062065]
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems.
This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving.
arXiv Detail & Related papers (2023-05-20T22:25:38Z) - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z) - MathPrompter: Mathematical Reasoning using Large Language Models [7.953723258038284]
Large Language Models (LLMs) have limited performance when solving arithmetic reasoning tasks.
MathPrompter uses the Zero-shot chain-of-thought prompting technique to generate multiple Algebraic expressions or Python functions to solve the same math problem in different ways.
arXiv Detail & Related papers (2023-03-04T04:43:49Z) - PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PaL) to understand natural language problems.
PaL offloads the solution step to a programmatic runtime such as a Python interpreter.
We set new state-of-the-art results in all 12 benchmarks.
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.