Related papers: Learning by Fixing: Solving Math Word Problems with Weak Supervision

Learning by Fixing: Solving Math Word Problems with Weak Supervision

URL: http://arxiv.org/abs/2012.10582v1
Date: Sat, 19 Dec 2020 03:10:21 GMT
Title: Learning by Fixing: Solving Math Word Problems with Weak Supervision
Authors: Yining Hong, Qing Li, Daniel Ciao, Siyuan Haung, Song-Chun Zhu
Abstract summary: Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions. We introduce a textitweakly-supervised paradigm for learning MWPs. Our method only requires the annotations of the final answers and can generate various solutions for a single problem.
Score: 70.62896781438694
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions. In this paper, we address this issue by introducing a \textit{weakly-supervised} paradigm for learning MWPs. Our method only requires the annotations of the final answers and can generate various solutions for a single problem. To boost weakly-supervised learning, we propose a novel \textit{learning-by-fixing} (LBF) framework, which corrects the misperceptions of the neural network via symbolic reasoning. Specifically, for an incorrect solution tree generated by the neural network, the \textit{fixing} mechanism propagates the error from the root node to the leaf nodes and infers the most probable fix that can be executed to get the desired answer. To generate more diverse solutions, \textit{tree regularization} is applied to guide the efficient shrinkage and exploration of the solution space, and a \textit{memory buffer} is designed to track and save the discovered various fixes for each problem. Experimental results on the Math23K dataset show the proposed LBF framework significantly outperforms reinforcement learning baselines in weakly-supervised learning. Furthermore, it achieves comparable top-1 and much better top-3/5 answer accuracies than fully-supervised methods, demonstrating its strength in producing diverse solutions.

Related papers

Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution [59.39066657300045]
Chain-of-Thought (CoT) prompting enhances the reasoning of large language models (LLMs) by decomposing problems into sequential steps. We propose Syzygy of Thoughts (SoT)-a novel framework that extends CoT by introducing auxiliary, interrelated reasoning paths. SoT captures deeper logical dependencies, enabling more robust and structured problem-solving.
arXiv Detail & Related papers (2025-04-13T13:35:41Z)
RL-MILP Solver: A Reinforcement Learning Approach for Solving Mixed-Integer Linear Programs with Graph Neural Networks [3.3894236476098185]
Mixed-integer linear programming (MILP) is a widely used optimization technique across various fields. We propose a novel reinforcement learning (RL)-based solver that not only finds the first feasible solution but also incrementally discovers better feasible solutions.
arXiv Detail & Related papers (2024-11-29T07:23:34Z)
An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem [5.092968949752308]
This work proposes an unsupervised learning framework combined with Maximums for MMCP. A crucial observation is that each solution corresponds to at least one spanning tree. We conduct extensive experiments to evaluate our framework and give a specific application.
arXiv Detail & Related papers (2024-08-16T02:07:34Z)
LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce the popular positive linear satisfiability to neural networks. We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions.
arXiv Detail & Related papers (2024-07-18T22:05:21Z)
Flow of Reasoning:Training LLMs for Divergent Problem Solving with Minimal Examples [12.48027669682156]
Flow of Reasoning (FoR) aims to improve reasoning quality and diversity with minimal data. FoR formulates multi-step LLM reasoning as a Markovian flow on a DAG-structured reasoning graph. Experiments show that, with limited training examples, FoR enables the discovery of diverse, creative, high-quality solutions.
arXiv Detail & Related papers (2024-06-09T07:06:58Z)
Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs [2.3020018305241337]
Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models. We propose a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions. Our experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder.
arXiv Detail & Related papers (2024-04-11T22:19:50Z)
Automated Answer Validation using Text Similarity [0.5025737475817937]
Information retrieval methods outperform neural methods, especially in the multiple choice version of this problem. We implement Siamese neural network models and produce a generalised solution to this problem.
arXiv Detail & Related papers (2024-01-13T07:13:08Z)
Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods. Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem. As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
Generalizing Math Word Problem Solvers via Solution Diversification [56.2690023011738]
We design a new training framework for an MWP solver by introducing a solution buffer and a solution discriminator. Our framework is flexibly applicable to a wide setting of fully, semi-weakly and weakly supervised training for all Seq2Seq MWP solvers.
arXiv Detail & Related papers (2022-12-01T19:34:58Z)
A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals. There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance. We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z)
Reversible Action Design for Combinatorial Optimization with Reinforcement Learning [35.50454156611722]
Reinforcement learning (RL) has recently emerged as a new framework to tackle these problems. We propose a general RL framework that not only exhibits state-of-the-art empirical performance but also generalizes to a variety class of COPs.
arXiv Detail & Related papers (2021-02-14T18:05:42Z)
SMART: A Situation Model for Algebra Story Problems via Attributed Grammar [74.1315776256292]
We introduce the concept of a emphsituation model, which originates from psychology studies to represent the mental states of humans in problem-solving. We show that the proposed model outperforms all previous neural solvers by a large margin while preserving much better interpretability.
arXiv Detail & Related papers (2020-12-27T21:03:40Z)
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces [20.101005623256626]
We argue that being oblivious to the presence of multiple solutions can severely hamper their training ability. We present a generic learning framework that adapts an existing prediction network for an RL problem to handle solution multiplicity.
arXiv Detail & Related papers (2020-08-27T08:37:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.