Exploring Equation as a Better Intermediate Meaning Representation for
Numerical Reasoning
- URL: http://arxiv.org/abs/2308.10585v1
- Date: Mon, 21 Aug 2023 09:35:33 GMT
- Title: Exploring Equation as a Better Intermediate Meaning Representation for
Numerical Reasoning
- Authors: Dingzirui Wang, Longxu Dou, Wenbin Zhang, Junyu Zeng, Wanxiang Che
- Abstract summary: We use equations as IMRs to solve the numerical reasoning task.
We present a method called Boosting Numerical Reasoning by Decomposing the Generation of Equations (Bridge).
Our method improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Numerical reasoning is vital for natural language processing models to
understand and process numerical information in real-world scenarios. Most
current methods first generate the Intermediate Meaning Representations (IMRs)
of questions and then generate answers. Current SOTA methods generate programs
as IMRs with large language models (LLMs). Intuitively, equations have fewer
restrictions and closer semantics to the question than programs, leading to
higher generation accuracy. However, current LLMs generate equations less
accurately than programs; we attribute this to equation data being rarer than
program data in pre-training corpora. In this paper, we therefore use equations as IMRs to
solve the numerical reasoning task by addressing two problems: (1)
Theoretically, how to prove that the equation is an IMR with higher generation
accuracy than programs; (2) Empirically, how to improve the generation accuracy
of equations with LLMs. For the first problem, we propose and prove a
proposition to theoretically compare the generation accuracy of different IMRs.
For the second problem, we present a method called Boosting Numerical
Reasoning by Decomposing the Generation of Equations (Bridge), which improves
the accuracy of LLMs in generating equations as IMRs by reducing their
tendency to generate constant expressions and programs. Our method improves
the performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively,
compared to the previous state-of-the-art methods under the single reasoning
path setting. Our code and prompts are released at
https://github.com/zirui-HIT/Bridge_for_Numerical_Reasoning.
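To make the pipeline concrete, here is a minimal sketch of the equation-as-IMR idea in Python. The example question and the mock_llm stand-in are illustrative assumptions; this is not the authors' Bridge implementation, which lives in the repository above.

```python
# Minimal sketch of the equation-as-IMR pipeline, with the LLM call
# mocked so the example is self-contained. Illustrative only; this is
# not the Bridge prompting scheme from the paper.
import sympy

def mock_llm(question: str) -> str:
    # Stand-in for an LLM that maps a question to an equation over x.
    return "3 + x = 10"

def answer(question: str):
    equation = mock_llm(question)                  # IMR: an equation
    lhs, rhs = equation.split("=")
    x = sympy.symbols("x")
    # The symbolic solver, not the LLM, performs the actual arithmetic.
    return sympy.solve(sympy.sympify(lhs) - sympy.sympify(rhs), x)

print(answer("Tom has 3 apples and buys more until he has 10. How many does he buy?"))
# -> [7]
```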
Related papers
- LLM4ED: Large Language Models for Automatic Equation Discovery [0.8644909837301149]
We introduce a new framework that utilizes natural language-based prompts to guide large language models in automatically mining governing equations from data.
Specifically, we first use the generative capability of LLMs to produce diverse candidate equations in string form, and then evaluate them against observations.
Experiments are extensively conducted on both partial differential equations and ordinary differential equations.
arXiv Detail & Related papers (2024-05-13T14:03:49Z)
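A minimal sketch of the generate-then-evaluate loop described above, assuming the LLM's candidate equations arrive as strings (hard-coded here) and are scored against observations by mean squared error; the candidates and scoring rule are illustrative, not LLM4ED's actual pipeline.

```python
# Sketch of LLM4ED-style evaluation of string-form candidate equations
# against observed data (candidates are stand-ins for LLM outputs).
import numpy as np
import sympy

x = sympy.symbols("x")
candidates = ["2*x", "x**2", "sin(x)"]         # hypothetical LLM proposals

xs = np.linspace(0.0, 2.0, 50)                 # observations sampled from
ys = xs ** 2                                   # the true law y = x**2

def mse(expr_str: str) -> float:
    f = sympy.lambdify(x, sympy.sympify(expr_str), "numpy")
    return float(np.mean((f(xs) - ys) ** 2))   # fit to observations

print(min(candidates, key=mse))                # -> x**2
```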
- LLM-SR: Scientific Equation Discovery via Programming with Large Language Models [17.64574496035502]
Traditional methods of equation discovery, known as symbolic regression, largely focus on extracting equations from data alone.
We introduce LLM-SR, a novel approach that leverages the scientific knowledge and robust code generation capabilities of Large Language Models.
We demonstrate LLM-SR's effectiveness across three diverse scientific domains, where it discovers physically accurate equations.
arXiv Detail & Related papers (2024-04-29T03:30:06Z)
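To illustrate the idea of equation discovery via programming, the sketch below writes a hypothetical LLM-proposed equation skeleton as a Python function with free parameters and fits those parameters to synthetic data; the skeleton, data, and optimizer choice are assumptions, not the LLM-SR system.

```python
# Sketch of the LLM-SR idea: an equation skeleton proposed as code,
# with its free parameters optimized against data. The skeleton and
# synthetic data below are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def skeleton(x, a, b):
    # Hypothetical LLM-proposed form: a power law.
    return a * x ** b

x = np.linspace(0.1, 5.0, 100)
y = 2.0 * x ** 1.5                       # synthetic ground truth

params, _ = curve_fit(skeleton, x, y, p0=[1.0, 1.0])
print(np.round(params, 3))               # -> [2.  1.5]
```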
- MMSR: Symbolic Regression is a Multimodal Task [12.660401635672967]
Symbolic regression was originally formulated as an optimization problem, solved with genetic programming (GP) and reinforcement learning algorithms.
More recently, researchers have treated the mapping from data to expressions as a translation problem.
In this paper, we propose MMSR, which achieves state-of-the-art results on multiple mainstream datasets.
arXiv Detail & Related papers (2024-02-28T08:29:42Z)
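The translation framing is easiest to see at the token level: an expression can be serialized into a prefix (Polish) token sequence, the kind of target a data-to-expression model would be trained to emit. The toy serializer below is an assumption, not MMSR's tokenizer.

```python
# Serializing a symbolic expression to prefix tokens, the kind of
# target sequence a data-to-expression "translation" model predicts.
import sympy

def to_prefix(expr) -> list:
    if expr.is_Atom:                    # symbols and numbers
        return [str(expr)]
    tokens = [expr.func.__name__]       # operator first ...
    for arg in expr.args:
        tokens += to_prefix(arg)        # ... then each operand
    return tokens

print(to_prefix(sympy.sympify("x**2 + sin(x)")))
# e.g. ['Add', 'Pow', 'x', '2', 'sin', 'x'] (sympy may order args differently)
```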
- Prompt Optimization via Adversarial In-Context Learning [51.18075178593142]
adv-ICL is implemented as a two-player game between a generator and a discriminator.
The generator tries to generate output realistic enough to fool the discriminator.
We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
arXiv Detail & Related papers (2023-12-05T09:44:45Z)
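A heavily stubbed sketch of that two-player structure: the generator produces output from a candidate prompt, the discriminator scores how task-conforming it looks, and the best-scoring prompt survives each round. Both call_llm and discriminator_score are placeholders; in adv-ICL both roles are played by actual LLMs.

```python
# Stubbed sketch of the adv-ICL two-player loop. call_llm() and
# discriminator_score() are placeholders, not real model calls.
import random

def call_llm(prompt: str, text: str) -> str:
    return f"[output of '{prompt}' on '{text}']"    # placeholder

def discriminator_score(output: str) -> float:
    # Placeholder for the discriminator's "looks real" probability.
    return random.random()

def adversarial_prompt_search(prompt: str, edits, task_input: str, rounds=3):
    for _ in range(rounds):
        # Generator proposes edited prompts; keep the one whose output
        # best fools the (here randomized) discriminator.
        candidates = [prompt] + [prompt + " " + e for e in edits]
        prompt = max(candidates,
                     key=lambda p: discriminator_score(call_llm(p, task_input)))
    return prompt

print(adversarial_prompt_search("Solve the problem.",
                                ["Think step by step.", "Show the equation."],
                                "2 + 2 = ?"))
```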
- Generative error correction for code-switching speech recognition using large language models [49.06203730433107]
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence.
We propose to leverage large language models (LLMs) and hypothesis lists generated by an ASR system to address the CS problem.
arXiv Detail & Related papers (2023-10-17T14:49:48Z)
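As a sketch of the hypotheses-to-LLM setup, the snippet below packs an ASR N-best list into a single correction prompt; the hypotheses and prompt wording are invented, and the final LLM call is omitted.

```python
# Sketch: packing an ASR N-best list into a correction prompt for an
# LLM. Hypotheses and wording are invented for illustration.
def build_correction_prompt(hypotheses):
    listing = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(hypotheses))
    return ("These are ASR hypotheses for one code-switched utterance:\n"
            f"{listing}\n"
            "Return the single most likely correct transcription.")

nbest = [
    "can you turn on the 空调 please",
    "can you turn on the kong tiao please",
    "can you turning on the 空调 please",
]
print(build_correction_prompt(nbest))
# A real system would send this prompt to an LLM and use its reply.
```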
- NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z)
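A toy contrast makes the non-autoregressive point concrete: given per-slot score vectors, every program token is decoded at once, so an early mistake cannot cascade into later steps. The vocabulary and scores below are dummy values, not the NAPG model.

```python
# Toy non-autoregressive decoding: each program slot is filled from
# its own score vector in parallel (dummy scores, not the NAPG model).
import numpy as np

vocab = ["add", "sub", "cell(0,1)", "cell(1,1)", "100"]
slot_scores = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],   # slot 0: operator
    [0.10, 0.10, 0.60, 0.10, 0.10],   # slot 1: first operand
    [0.05, 0.05, 0.10, 0.20, 0.60],   # slot 2: second operand
])

# All slots decoded simultaneously; no step conditions on a previous
# (possibly wrong) prediction, avoiding error propagation.
program = [vocab[i] for i in slot_scores.argmax(axis=1)]
print(program)   # -> ['add', 'cell(0,1)', '100']
```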
- SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
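The streaming constraint can be illustrated with a generic fixed-memory kernel approximation: random Fourier features plus running sufficient statistics, so memory stays constant as data streams past. This is a stand-in for the idea of streaming KRR, not StreaMRAK's multi-resolution algorithm.

```python
# Generic streaming kernel regression with random Fourier features:
# memory stays O(D^2) no matter how much data streams past. A stand-in
# for streaming KRR, not the StreaMRAK algorithm itself.
import numpy as np

rng = np.random.default_rng(0)
D, lam = 200, 1e-3
W = rng.normal(size=(D, 1))                 # RFF frequencies (1-D input)
b = rng.uniform(0, 2 * np.pi, size=D)       # RFF phases

def phi(x):                                 # x: (n, 1) -> (n, D)
    return np.sqrt(2.0 / D) * np.cos(x @ W.T + b)

A = lam * np.eye(D)                         # running Phi^T Phi + lam*I
r = np.zeros(D)                             # running Phi^T y

for _ in range(100):                        # stream of mini-batches
    x = rng.uniform(-3, 3, size=(50, 1))
    P = phi(x)
    A += P.T @ P
    r += P.T @ np.sin(x[:, 0])              # targets y = sin(x)

w = np.linalg.solve(A, r)
print((phi(np.array([[1.0]])) @ w)[0], np.sin(1.0))   # approximately equal
```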
- Efficient time stepping for numerical integration using reinforcement learning [0.15393457051344295]
We propose a data-driven time stepping scheme based on machine learning and meta-learning.
First, one or several base learners (several for non-smooth or hybrid systems) are trained using RL.
Then, a meta-learner is trained which, depending on the system state, selects the base learner that appears optimal for the current situation.
arXiv Detail & Related papers (2021-04-08T07:24:54Z)
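A minimal sketch of the base-learner/meta-learner split, with the RL-trained components replaced by hand-written stand-ins: two fixed step sizes play the base learners, and a simple state-dependent rule plays the meta-learner.

```python
# Sketch of meta-selected time stepping. The two candidate step sizes
# play the "base learners" and the hand-written rule below stands in
# for the RL-trained meta-learner from the paper.
import numpy as np

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

f = lambda t, y: -50.0 * y if t > 1.0 else -0.5 * y   # stiffness kicks in

t, y, T = 0.0, 1.0, 2.0
while t < T:
    # "Meta-learner": pick the cautious stepper when dynamics are fast.
    h = 0.001 if abs(f(t, y)) > 5.0 else 0.05
    h = min(h, T - t)
    y = rk4_step(f, t, y, h)
    t += h
print(t, y)   # y should be near exp(-0.5) * exp(-50), i.e. ~1e-22
```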
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new orthogonal matching pursuit (OMP) algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
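For reference, here is plain orthogonal matching pursuit in NumPy; the paper's contribution is replacing the implicit MSE criterion with an ITL (correntropy-based) one, which this sketch does not reproduce.

```python
# Plain orthogonal matching pursuit (OMP). The paper's variant replaces
# the MSE-based criterion with an information theoretic learning one;
# that change is not reproduced in this sketch.
import numpy as np

def omp(A, y, k):
    # Recover a k-sparse x from y = A @ x by greedy atom selection.
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s          # re-fit on support
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 100))
A /= np.linalg.norm(A, axis=0)                      # unit-norm atoms
x_true = np.zeros(100)
x_true[[3, 50, 77]] = [1.0, -2.0, 0.5]
print(np.flatnonzero(omp(A, A @ x_true, 3)))        # expect [ 3 50 77]
```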
This list is automatically generated from the titles and abstracts of the papers in this site.