WARM: A Weakly (+Semi) Supervised Model for Solving Math Word Problems
- URL: http://arxiv.org/abs/2104.06722v2
- Date: Tue, 13 Jun 2023 19:26:15 GMT
- Title: WARM: A Weakly (+Semi) Supervised Model for Solving Math Word Problems
- Authors: Oishik Chatterjee, Isha Pandey, Aashish Waikar, Vishwajeet Kumar,
Ganesh Ramakrishnan
- Abstract summary: Solving math word problems (MWPs) is an important and challenging problem in natural language processing.
We propose a weakly supervised model for solving MWPs by requiring only the final answer as supervision.
We demonstrate that our approach achieves accuracy gains of 4.5% and 32% over the state-of-the-art weakly supervised approach.
- Score: 21.501567886241087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving math word problems (MWPs) is an important and challenging problem in
natural language processing. Existing approaches to solve MWPs require full
supervision in the form of intermediate equations. However, labeling every MWP
with its corresponding equations is a time-consuming and expensive task. In
order to address this challenge of equation annotation, we propose a weakly
supervised model for solving MWPs by requiring only the final answer as
supervision. We approach this problem by first learning to generate the
equation using the problem description and the final answer, which we
subsequently use to train a supervised MWP solver. We propose and compare
various weakly supervised techniques to learn to generate equations directly
from the problem description and answer. Through extensive experiments, we
demonstrate that without using equations for supervision, our approach achieves
accuracy gains of 4.5% and 32% over the state-of-the-art weakly supervised
approach, on the standard Math23K and AllArith datasets respectively.
Additionally, we curate and release new datasets of roughly 10k MWPs each in
English and in Hindi (a low-resource language). These datasets are suitable for
training weakly supervised models. We also present an extension of WARM to
semi-supervised learning and present further improvements on results, along
with insights.
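The core weak-supervision idea in the abstract, i.e., using only the final answer to discover training equations, can be illustrated with a minimal sketch. This is an assumed, simplified illustration (single binary operation, exhaustive search), not the paper's actual equation-generation model: candidate equations over the problem's numbers are enumerated, and those whose value matches the given answer are kept as pseudo-labels for a supervised solver.

```python
# Minimal sketch (assumed, not the paper's method): enumerate candidate
# equations over the numbers extracted from a problem, keep the ones whose
# evaluated value matches the given final answer, and treat them as
# pseudo-equation labels for training a supervised MWP solver.
from itertools import permutations

def candidate_equations(numbers):
    """Yield (expression_string, value) pairs over ordered number pairs."""
    ops = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b,
           "/": lambda a, b: a / b if b != 0 else None}
    for a, b in permutations(numbers, 2):
        for sym, fn in ops.items():
            val = fn(a, b)
            if val is not None:
                yield f"{a} {sym} {b}", val

def pseudo_labels(numbers, final_answer, tol=1e-6):
    """Keep candidate equations consistent with the final answer."""
    return [expr for expr, val in candidate_equations(numbers)
            if abs(val - final_answer) < tol]

# e.g. "John had 5 apples and ate 2; how many remain?" with answer 3
print(pseudo_labels([5, 2], 3))  # → ['5 - 2']
```

In practice such a search space grows combinatorially with the number of quantities and operators, which is why the paper learns to generate equations rather than enumerating them.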
Related papers
- From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision [12.023661884821554]
We introduce an innovative two-stage framework that adeptly transfers mathematical expertise from large to tiny language models.
Our method fully leverages semantic understanding capabilities when searching for 'problem-equation' pairs.
It demonstrates significantly improved performance on the Math23K and Weak12K datasets compared to existing small model methods.
arXiv Detail & Related papers (2024-03-21T13:29:54Z) - Solving Math Word Problems with Reexamination [27.80592576792461]
We propose a pseudo-dual (PseDual) learning scheme to model such a process, which is model-agnostic.
The pseudo-dual task is specifically defined as filling the numbers in the expression back into the original word problem with numbers masked.
Empirical studies show that our pseudo-dual learning scheme is effective when equipped in several representative MWP solvers.
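The pseudo-dual task described above, i.e., filling the numbers of a predicted expression back into the number-masked problem, can be sketched minimally. The masking token `[NUM]` and the regex-based extraction are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch (assumed) of the pseudo-dual consistency check:
# mask the numbers in the problem text, then fill the predicted
# expression's numbers back into the masks and compare with the original.
import re

NUM_RE = r"\d+(?:\.\d+)?"

def mask_numbers(problem):
    """Replace each number in the problem with a [NUM] placeholder."""
    return re.sub(NUM_RE, "[NUM]", problem)

def fill_back(masked_problem, expression):
    """Fill the expression's numbers back into the masks, in order."""
    out = masked_problem
    for n in re.findall(NUM_RE, expression):
        out = out.replace("[NUM]", n, 1)
    return out

problem = "John had 5 apples and ate 2."
masked = mask_numbers(problem)        # "John had [NUM] apples and ate [NUM]."
restored = fill_back(masked, "5 - 2")
print(restored == problem)  # → True
```

A solver whose predicted expression restores the original problem under this round-trip is, loosely, using the right numbers in the right order, which is the consistency signal the pseudo-dual task exploits.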
arXiv Detail & Related papers (2023-10-14T14:23:44Z) - Learning by Analogy: Diverse Questions Generation in Math Word Problem [21.211970350827183]
Solving math word problems (MWPs) with AI techniques has recently made great progress thanks to the success of deep neural networks (DNNs).
We argue that the ability to learn by analogy is essential for an MWP solver to better understand the same problem, which may be formulated in diverse ways.
In this paper, we make a first attempt to solve MWPs by generating diverse yet consistent questions/equations.
arXiv Detail & Related papers (2023-06-15T11:47:07Z) - Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning [10.889271604723312]
Chain-of-thought (CoT) prompting with large language models has proven effective in numerous natural language processing tasks.
We investigate two approaches to leverage the training data in a few-shot prompting scenario: dynamic program prompting and program distillation.
Our experiments on three standard math word problem (MWP) datasets demonstrate the effectiveness of these approaches.
arXiv Detail & Related papers (2023-05-29T16:01:40Z) - Generalizing Math Word Problem Solvers via Solution Diversification [56.2690023011738]
We design a new training framework for an MWP solver by introducing a solution buffer and a solution discriminator.
Our framework is flexibly applicable to a wide range of settings: fully, semi-weakly, and weakly supervised training for all Seq2Seq MWP solvers.
arXiv Detail & Related papers (2022-12-01T19:34:58Z) - Unbiased Math Word Problems Benchmark for Mitigating Solving Bias [72.8677805114825]
Current solvers exhibit solving bias, which consists of data bias and learning bias caused by biased datasets and improper training strategies.
Our experiments verify that MWP solvers are easily biased by training datasets that do not cover diverse questions for each problem narrative.
An MWP can naturally be solved by multiple equivalent equations, while current datasets take only one of the equivalent equations as ground truth.
arXiv Detail & Related papers (2022-05-17T06:07:04Z) - Generate & Rank: A Multi-task Framework for Math Word Problems [48.99880318686938]
Math word problem (MWP) is a challenging and critical task in natural language processing.
We propose Generate & Rank, a framework based on a generative pre-trained language model.
By joint training with generation and ranking, the model learns from its own mistakes and is able to distinguish between correct and incorrect expressions.
arXiv Detail & Related papers (2021-09-07T12:21:49Z) - MWP-BERT: A Strong Baseline for Math Word Problems [47.51572465676904]
Math word problem (MWP) solving is the task of transforming a sequence of natural language problem descriptions to executable math equations.
Although recent sequence-modeling MWP solvers have made progress on math-text contextual understanding, pre-trained language models (PLMs) have not been explored for solving MWPs.
We introduce MWP-BERT to obtain pre-trained token representations that capture the alignment between text description and mathematical logic.
arXiv Detail & Related papers (2021-07-28T15:28:41Z) - SMART: A Situation Model for Algebra Story Problems via Attributed Grammar [74.1315776256292]
We introduce the concept of a situation model, which originates in psychology studies as a representation of the mental states of humans in problem-solving.
We show that the proposed model outperforms all previous neural solvers by a large margin while preserving much better interpretability.
arXiv Detail & Related papers (2020-12-27T21:03:40Z) - Learning by Fixing: Solving Math Word Problems with Weak Supervision [70.62896781438694]
Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions.
We introduce a weakly-supervised paradigm for learning MWPs.
Our method only requires the annotations of the final answers and can generate various solutions for a single problem.
arXiv Detail & Related papers (2020-12-19T03:10:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.