Exploring Equation as a Better Intermediate Meaning Representation for
Numerical Reasoning
- URL: http://arxiv.org/abs/2308.10585v1
- Date: Mon, 21 Aug 2023 09:35:33 GMT
- Title: Exploring Equation as a Better Intermediate Meaning Representation for
Numerical Reasoning
- Authors: Dingzirui Wang, Longxu Dou, Wenbin Zhang, Junyu Zeng, Wanxiang Che
- Abstract summary: We use equations as IMRs to solve the numerical reasoning task.
We present a method called Boosting Numerical Reasoning by Decomposing the Generation of Equations (Bridge).
Our method improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Numerical reasoning is vital for natural language processing models to
understand and process numerical information in real-world scenarios. Most
current methods first generate the Intermediate Meaning Representations (IMRs)
of questions and then generate answers. Current SOTA methods generate programs
as IMRs with large language models (LLMs). Intuitively, equations have fewer
restrictions and closer semantics to the question than programs, leading to
higher generation accuracy. However, current LLMs generate equations less
accurately than programs; we attribute this to equation data being rarer than
program data in pre-training corpora. In this paper, we therefore use equations as IMRs to
solve the numerical reasoning task by addressing two problems: (1)
Theoretically, how to prove that the equation is an IMR with higher generation
accuracy than programs; (2) Empirically, how to improve the generation accuracy
of equations with LLMs. For the first problem, we propose and prove a
proposition to theoretically compare the generation accuracy of different IMRs.
For the second problem, we present a method called Boosting Numerical
Reasoning by Decomposing the Generation of Equations (Bridge), which improves
the accuracy of LLMs in generating equations as IMRs by reducing their
tendency to generate constant expressions and programs. Our method improves
the performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively,
compared to the previous state-of-the-art methods under the single reasoning
path setting. Our code and prompts are released at
https://github.com/zirui-HIT/Bridge_for_Numerical_Reasoning.
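To make the pipeline concrete, here is a minimal sketch of the equation-as-IMR idea in Python. The example question and the mock_llm stand-in are illustrative assumptions; this is not the authors' Bridge implementation, which lives in the repository above.

```python
# Minimal sketch of the equation-as-IMR pipeline, with the LLM call
# mocked so the example is self-contained. Illustrative only; this is
# not the Bridge prompting scheme from the paper.
import sympy

def mock_llm(question: str) -> str:
    # Stand-in for an LLM that maps a question to an equation over x.
    return "3 + x = 10"

def answer(question: str):
    equation = mock_llm(question)                  # IMR: an equation
    lhs, rhs = equation.split("=")
    x = sympy.symbols("x")
    # The symbolic solver, not the LLM, performs the actual arithmetic.
    return sympy.solve(sympy.sympify(lhs) - sympy.sympify(rhs), x)

print(answer("Tom has 3 apples and buys more until he has 10. How many does he buy?"))
# -> [7]
```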
Related papers
- LLM4ED: Large Language Models for Automatic Equation Discovery [0.8644909837301149]
We introduce a new framework that utilizes natural language-based prompts to guide large language models in automatically mining governing equations from data.
Specifically, we first use the generative capability of LLMs to produce diverse candidate equations in string form, and then evaluate them against observations.
Experiments are extensively conducted on both partial differential equations and ordinary differential equations.
arXiv Detail & Related papers (2024-05-13T14:03:49Z)
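A minimal sketch of the generate-then-evaluate loop described above, assuming the LLM's candidate equations arrive as strings (hard-coded here) and are scored against observations by mean squared error; the candidates and scoring rule are illustrative, not LLM4ED's actual pipeline.

```python
# Sketch of LLM4ED-style evaluation of string-form candidate equations
# against observed data (candidates are stand-ins for LLM outputs).
import numpy as np
import sympy

x = sympy.symbols("x")
candidates = ["2*x", "x**2", "sin(x)"]         # hypothetical LLM proposals

xs = np.linspace(0.0, 2.0, 50)                 # observations sampled from
ys = xs ** 2                                   # the true law y = x**2

def mse(expr_str: str) -> float:
    f = sympy.lambdify(x, sympy.sympify(expr_str), "numpy")
    return float(np.mean((f(xs) - ys) ** 2))   # fit to observations

print(min(candidates, key=mse))                # -> x**2
```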
- LLM-SR: Scientific Equation Discovery via Programming with Large Language Models [17.64574496035502]
Traditional methods of equation discovery, known as symbolic regression, largely focus on extracting equations from data alone.
We introduce LLM-SR, a novel approach that leverages the scientific knowledge and robust code generation capabilities of Large Language Models.
We demonstrate LLM-SR's effectiveness across three diverse scientific domains, where it discovers physically accurate equations.
arXiv Detail & Related papers (2024-04-29T03:30:06Z)
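To illustrate the idea of equation discovery via programming, the sketch below writes a hypothetical LLM-proposed equation skeleton as a Python function with free parameters and fits those parameters to synthetic data; the skeleton, data, and optimizer choice are assumptions, not the LLM-SR system.

```python
# Sketch of the LLM-SR idea: an equation skeleton proposed as code,
# with its free parameters optimized against data. The skeleton and
# synthetic data below are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def skeleton(x, a, b):
    # Hypothetical LLM-proposed form: a power law.
    return a * x ** b

x = np.linspace(0.1, 5.0, 100)
y = 2.0 * x ** 1.5                       # synthetic ground truth

params, _ = curve_fit(skeleton, x, y, p0=[1.0, 1.0])
print(np.round(params, 3))               # -> [2.  1.5]
```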
- MMSR: Symbolic Regression is a Multimodal Task [12.660401635672967]
Symbolic regression was originally formulated as an optimization problem, solved with genetic programming (GP) and reinforcement learning algorithms.
More recently, researchers have treated the mapping from data to expressions as a translation problem.
In this paper, we propose MMSR, which achieves state-of-the-art results on multiple mainstream datasets.
arXiv Detail & Related papers (2024-02-28T08:29:42Z)
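The translation framing is easiest to see at the token level: an expression can be serialized into a prefix (Polish) token sequence, the kind of target a data-to-expression model would be trained to emit. The toy serializer below is an assumption, not MMSR's tokenizer.

```python
# Serializing a symbolic expression to prefix tokens, the kind of
# target sequence a data-to-expression "translation" model predicts.
import sympy

def to_prefix(expr) -> list:
    if expr.is_Atom:                    # symbols and numbers
        return [str(expr)]
    tokens = [expr.func.__name__]       # operator first ...
    for arg in expr.args:
        tokens += to_prefix(arg)        # ... then each operand
    return tokens

print(to_prefix(sympy.sympify("x**2 + sin(x)")))
# e.g. ['Add', 'Pow', 'x', '2', 'sin', 'x'] (sympy may order args differently)
```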
- Prompt Optimization via Adversarial In-Context Learning [51.18075178593142]
adv-ICL is implemented as a two-player game between a generator and a discriminator.
The generator tries to generate output realistic enough to fool the discriminator.
We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
arXiv Detail & Related papers (2023-12-05T09:44:45Z)
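A heavily stubbed sketch of that two-player structure: the generator produces output from a candidate prompt, the discriminator scores how task-conforming it looks, and the best-scoring prompt survives each round. Both call_llm and discriminator_score are placeholders; in adv-ICL both roles are played by actual LLMs.

```python
# Stubbed sketch of the adv-ICL two-player loop. call_llm() and
# discriminator_score() are placeholders, not real model calls.
import random

def call_llm(prompt: str, text: str) -> str:
    return f"[output of '{prompt}' on '{text}']"    # placeholder

def discriminator_score(output: str) -> float:
    # Placeholder for the discriminator's "looks real" probability.
    return random.random()

def adversarial_prompt_search(prompt: str, edits, task_input: str, rounds=3):
    for _ in range(rounds):
        # Generator proposes edited prompts; keep the one whose output
        # best fools the (here randomized) discriminator.
        candidates = [prompt] + [prompt + " " + e for e in edits]
        prompt = max(candidates,
                     key=lambda p: discriminator_score(call_llm(p, task_input)))
    return prompt

print(adversarial_prompt_search("Solve the problem.",
                                ["Think step by step.", "Show the equation."],
                                "2 + 2 = ?"))
```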
- Generative error correction for code-switching speech recognition using large language models [49.06203730433107]
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence.
We propose to leverage large language models (LLMs) and hypothesis lists generated by an ASR system to address the CS problem.
arXiv Detail & Related papers (2023-10-17T14:49:48Z)
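As a sketch of the hypotheses-to-LLM setup, the snippet below packs an ASR N-best list into a single correction prompt; the hypotheses and prompt wording are invented, and the final LLM call is omitted.

```python
# Sketch: packing an ASR N-best list into a correction prompt for an
# LLM. Hypotheses and wording are invented for illustration.
def build_correction_prompt(hypotheses):
    listing = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(hypotheses))
    return ("These are ASR hypotheses for one code-switched utterance:\n"
            f"{listing}\n"
            "Return the single most likely correct transcription.")

nbest = [
    "can you turn on the 空调 please",
    "can you turn on the kong tiao please",
    "can you turning on the 空调 please",
]
print(build_correction_prompt(nbest))
# A real system would send this prompt to an LLM and use its reply.
```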
- NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z)
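A toy contrast makes the non-autoregressive point concrete: given per-slot score vectors, every program token is decoded at once, so an early mistake cannot cascade into later steps. The vocabulary and scores below are dummy values, not the NAPG model.

```python
# Toy non-autoregressive decoding: each program slot is filled from
# its own score vector in parallel (dummy scores, not the NAPG model).
import numpy as np

vocab = ["add", "sub", "cell(0,1)", "cell(1,1)", "100"]
slot_scores = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],   # slot 0: operator
    [0.10, 0.10, 0.60, 0.10, 0.10],   # slot 1: first operand
    [0.05, 0.05, 0.10, 0.20, 0.60],   # slot 2: second operand
])

# All slots decoded simultaneously; no step conditions on a previous
# (possibly wrong) prediction, avoiding error propagation.
program = [vocab[i] for i in slot_scores.argmax(axis=1)]
print(program)   # -> ['add', 'cell(0,1)', '100']
```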
- SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
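The streaming constraint can be illustrated with a generic fixed-memory kernel approximation: random Fourier features plus running sufficient statistics, so memory stays constant as data streams past. This is a stand-in for the idea of streaming KRR, not StreaMRAK's multi-resolution algorithm.

```python
# Generic streaming kernel regression with random Fourier features:
# memory stays O(D^2) no matter how much data streams past. A stand-in
# for streaming KRR, not the StreaMRAK algorithm itself.
import numpy as np

rng = np.random.default_rng(0)
D, lam = 200, 1e-3
W = rng.normal(size=(D, 1))                 # RFF frequencies (1-D input)
b = rng.uniform(0, 2 * np.pi, size=D)       # RFF phases

def phi(x):                                 # x: (n, 1) -> (n, D)
    return np.sqrt(2.0 / D) * np.cos(x @ W.T + b)

A = lam * np.eye(D)                         # running Phi^T Phi + lam*I
r = np.zeros(D)                             # running Phi^T y

for _ in range(100):                        # stream of mini-batches
    x = rng.uniform(-3, 3, size=(50, 1))
    P = phi(x)
    A += P.T @ P
    r += P.T @ np.sin(x[:, 0])              # targets y = sin(x)

w = np.linalg.solve(A, r)
print((phi(np.array([[1.0]])) @ w)[0], np.sin(1.0))   # approximately equal
```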
- Efficient time stepping for numerical integration using reinforcement learning [0.15393457051344295]
We propose a data-driven time stepping scheme based on machine learning and meta-learning.
First, one or several base learners (several for non-smooth or hybrid systems) are trained using RL.
Then, a meta-learner is trained which, depending on the system state, selects the base learner that appears optimal for the current situation.
arXiv Detail & Related papers (2021-04-08T07:24:54Z)
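A minimal sketch of the base-learner/meta-learner split, with the RL-trained components replaced by hand-written stand-ins: two fixed step sizes play the base learners, and a simple state-dependent rule plays the meta-learner.

```python
# Sketch of meta-selected time stepping. The two candidate step sizes
# play the "base learners" and the hand-written rule below stands in
# for the RL-trained meta-learner from the paper.
import numpy as np

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

f = lambda t, y: -50.0 * y if t > 1.0 else -0.5 * y   # stiffness kicks in

t, y, T = 0.0, 1.0, 2.0
while t < T:
    # "Meta-learner": pick the cautious stepper when dynamics are fast.
    h = 0.001 if abs(f(t, y)) > 5.0 else 0.05
    h = min(h, T - t)
    y = rk4_step(f, t, y, h)
    t += h
print(t, y)   # y should be near exp(-0.5) * exp(-50), i.e. ~1e-22
```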
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new orthogonal matching pursuit (OMP) algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
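For reference, here is plain orthogonal matching pursuit in NumPy; the paper's contribution is replacing the implicit MSE criterion with an ITL (correntropy-based) one, which this sketch does not reproduce.

```python
# Plain orthogonal matching pursuit (OMP). The paper's variant replaces
# the MSE-based criterion with an information theoretic learning one;
# that change is not reproduced in this sketch.
import numpy as np

def omp(A, y, k):
    # Recover a k-sparse x from y = A @ x by greedy atom selection.
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s          # re-fit on support
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 100))
A /= np.linalg.norm(A, axis=0)                      # unit-norm atoms
x_true = np.zeros(100)
x_true[[3, 50, 77]] = [1.0, -2.0, 0.5]
print(np.flatnonzero(omp(A, A @ x_true, 3)))        # expect [ 3 50 77]
```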
This list is automatically generated from the titles and abstracts of the papers in this site.