Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving
- URL: http://arxiv.org/abs/2404.03938v1
- Date: Fri, 5 Apr 2024 07:57:03 GMT
- Title: Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving
- Authors: Gulsum Yigit, Mehmet Fatih Amasyali
- Abstract summary: This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve various math problems.
We propose several data augmentation methods that modify the problem texts and equations, such as synonym replacement, rule-based question replacement, and rule-based question reversal.
The study further introduces a new in-context learning augmentation method that employs the Llama-7b language model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Math Word Problem (MWP) solving is a challenging task in Natural Language Processing (NLP). This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve various math problems. We propose several data augmentation methods that modify the problem texts and equations, such as synonym replacement, rule-based question replacement, and rule-based question reversal, and apply them to two English MWP datasets. The study further introduces a new in-context learning augmentation method that employs the Llama-7b language model. This approach uses instruction-based prompting to rephrase the math problem texts. Performance evaluations are conducted on 9 baseline models, revealing that the augmentation methods outperform the baselines. Moreover, concatenating examples generated by the various augmentation methods further improves performance.
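To make the synonym-replacement idea from the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the `SYNONYMS` table, the function names, and the example problem are all hypothetical, and the tokenizer is deliberately simple. The one property it illustrates is that numbers are left untouched, so the original equation remains a valid label for the rewritten text.

```python
import random
import re

# Hypothetical toy synonym table (an assumption for illustration only;
# the lexical resource used in the paper is not stated on this page).
SYNONYMS = {
    "buys": ["purchases"],
    "eats": ["consumes"],
    "left": ["remaining"],
}

def synonym_replace(problem, rng, prob=1.0):
    """Swap words found in SYNONYMS; numbers and punctuation are kept as-is."""
    tokens = re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", problem)
    out = []
    for tok in tokens:
        options = SYNONYMS.get(tok.lower())
        if options and rng.random() < prob:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    # Re-join tokens and drop the space that would precede punctuation.
    return re.sub(r"\s+([^\w\s])", r"\1", " ".join(out))

def augment(example, rng):
    """Return the original example plus one augmented copy with the same equation."""
    new_text = synonym_replace(example["text"], rng)
    if new_text == example["text"]:
        return [example]
    return [example, {"text": new_text, "equation": example["equation"]}]

if __name__ == "__main__":
    example = {
        "text": "Tom buys 3 apples and eats 1. How many apples are left?",
        "equation": "x = 3 - 1",
    }
    for item in augment(example, random.Random(0)):
        print(item["text"], "|", item["equation"])
```

Concatenating the outputs of several such augmenters into one training set corresponds to the combination step the abstract reports as further improving performance.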
Related papers
- Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval [22.865124583257987]
We present how analogy from similarly structured questions can improve large language models' problem-solving capabilities.
Specifically, we rely on the retrieval of problems with similar computational graphs to the given question to serve as exemplars in the prompt.
Empirical results across six math word problem datasets demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-11-25T15:01:25Z)
- Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks.
We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model.
Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time [51.5039731721706]
MindStar is a purely inference-based searching method for large language models.
It formulates reasoning tasks as searching problems and proposes two search ideas to identify the optimal reasoning paths.
It significantly enhances the reasoning abilities of open-source models, such as Llama-2-13B and Mistral-7B, and achieves comparable performance to GPT-3.5 and Grok-1.
arXiv Detail & Related papers (2024-05-25T15:07:33Z)
- From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision [12.023661884821554]
We introduce an innovative two-stage framework that adeptly transfers mathematical expertise from large to tiny language models.
Our method fully leverages semantic understanding capabilities while searching for 'problem-equation' pairs.
It demonstrates significantly improved performance on the Math23K and Weak12K datasets compared to existing small model methods.
arXiv Detail & Related papers (2024-03-21T13:29:54Z)
- Solving Math Word Problems with Reexamination [27.80592576792461]
We propose a model-agnostic pseudo-dual (PseDual) learning scheme to model such a process.
The pseudo-dual task is specifically defined as filling the numbers in the expression back into the original word problem with numbers masked.
Empirical studies show that our pseudo-dual learning scheme is effective when integrated into several representative MWP solvers.
arXiv Detail & Related papers (2023-10-14T14:23:44Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Math Word Problem Solving by Generating Linguistic Variants of Problem Statements [1.742186232261139]
We propose a framework for MWP solvers based on the generation of linguistic variants of the problem text.
The approach involves solving each of the variant problems and selecting the predicted expression that receives the majority of the votes.
We show that training on linguistic variants of problem statements and voting on candidate predictions improve the mathematical reasoning and robustness of the model.
arXiv Detail & Related papers (2023-06-24T08:27:39Z)
- Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning [10.889271604723312]
Chain-of-thought (CoT) prompting with large language models has proven effective in numerous natural language processing tasks.
We investigate two approaches to leverage the training data in a few-shot prompting scenario: dynamic program prompting and program distillation.
Our experiments on three standard math word problem (MWP) datasets demonstrate the effectiveness of these approaches.
arXiv Detail & Related papers (2023-05-29T16:01:40Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
- Generate & Rank: A Multi-task Framework for Math Word Problems [48.99880318686938]
Math word problem (MWP) solving is a challenging and critical task in natural language processing.
We propose Generate & Rank, a framework based on a generative pre-trained language model.
By joint training with generation and ranking, the model learns from its own mistakes and is able to distinguish between correct and incorrect expressions.
arXiv Detail & Related papers (2021-09-07T12:21:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.