Textual Enhanced Contrastive Learning for Solving Math Word Problems
- URL: http://arxiv.org/abs/2211.16022v1
- Date: Tue, 29 Nov 2022 08:44:09 GMT
- Title: Textual Enhanced Contrastive Learning for Solving Math Word Problems
- Authors: Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng and Sadao Kurohashi
- Abstract summary: We propose a Textual Enhanced Contrastive Learning framework, which forces the models to distinguish semantically similar examples that hold different mathematical logic.
We adopt a self-supervised strategy to enrich examples with subtle textual variance.
Experimental results show that our method achieves state-of-the-art results on both widely used benchmark datasets and carefully designed challenge datasets in English and Chinese.
- Score: 23.196339273292246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving math word problems is a task that analyses the relations between
quantities and requires an accurate understanding of contextual natural
language information. Recent studies show that current models rely on shallow
heuristics to predict solutions and can be easily misled by small textual
perturbations. To address this problem, we propose a Textual Enhanced
Contrastive Learning framework, which forces the models to distinguish
semantically similar examples that hold different mathematical logic. We
adopt a self-supervised strategy to enrich examples with subtle textual
variance through textual reordering or problem re-construction. We then retrieve the
hardest-to-differentiate samples from both the equation and the textual perspective
and guide the model to learn their representations. Experimental results show
that our method achieves state-of-the-art results on both widely used benchmark
datasets and carefully designed challenge datasets in English and
Chinese. \footnote{Our code and data are available at
\url{https://github.com/yiyunya/Textual_CL_MWP}.}
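To make the contrastive objective concrete, below is a minimal sketch of one way the described signal could be computed: an InfoNCE-style loss where the positive is a textually perturbed variant of the problem sharing the same equation, and the negatives are retrieved hard examples that look similar in text but lead to a different equation. The function name, tensor shapes, and temperature are illustrative assumptions, not the authors' released implementation (see the repository above for the official code).

```python
import torch
import torch.nn.functional as F


def textual_contrastive_loss(anchor, positive, hard_negatives, temperature=0.1):
    """InfoNCE-style loss over problem representations (illustrative sketch).

    anchor:         (batch, dim)   encoding of the original problem text
    positive:       (batch, dim)   encoding of a perturbed variant (e.g. sentence
                                   reordering) that keeps the same equation
    hard_negatives: (batch, k, dim) retrieved problems that are textually similar
                                   but have different mathematical logic
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    hard_negatives = F.normalize(hard_negatives, dim=-1)

    # Cosine similarity to the positive: one score per example.
    pos_sim = (anchor * positive).sum(dim=-1, keepdim=True)        # (batch, 1)
    # Cosine similarity to each retrieved hard negative.
    neg_sim = torch.einsum("bd,bkd->bk", anchor, hard_negatives)   # (batch, k)

    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    # The positive always sits at index 0 of the logits.
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)
```

In practice this term would be added to the usual equation-generation loss, so the encoder is pushed to separate problems whose surface text is nearly identical but whose solutions differ.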
Related papers
- CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data
Limitation With Contrastive Learning [14.637303913878435]
We present a coherence-based contrastive learning model named CoCo to detect machine-generated text (MGT) under low-resource scenarios.
To exploit linguistic features, we encode coherence information, in the form of a graph, into the text representation.
Experimental results on two public datasets and two self-constructed datasets show that our approach significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-12-20T15:26:19Z) - Revisiting the Roles of "Text" in Text Games [102.22750109468652]
This paper investigates the roles of text in the face of different reinforcement learning challenges.
We propose a simple scheme to extract relevant contextual information into an approximate state hash.
Such a lightweight plug-in achieves competitive performance with state-of-the-art text agents.
arXiv Detail & Related papers (2022-10-15T21:52:39Z) - Dynamic Prompt Learning via Policy Gradient for Semi-structured
Mathematical Reasoning [150.17907456113537]
We present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 grade-level problems that require mathematical reasoning.
We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting.
We propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data.
arXiv Detail & Related papers (2022-09-29T08:01:04Z) - Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning
for Solving Math Word Problems [14.144577791030853]
We investigate how a neural network understands patterns only from semantics.
We propose a contrastive learning approach, where the neural network perceives the divergence of patterns.
Our method greatly improves the performance in monolingual and multilingual settings.
arXiv Detail & Related papers (2021-10-16T04:03:47Z) - Toward the Understanding of Deep Text Matching Models for Information
Retrieval [72.72380690535766]
This paper aims at testing whether existing deep text matching methods satisfy some fundamental heuristics of information retrieval.
Specifically, four constraints are examined in our study: the term frequency constraint, the term discrimination constraint, the length normalization constraints, and the TF-length constraint.
Experimental results on LETOR 4.0 and MS MARCO show that all the investigated deep text matching methods satisfy the above constraints with high probability.
arXiv Detail & Related papers (2021-08-16T13:33:15Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - SMART: A Situation Model for Algebra Story Problems via Attributed
Grammar [74.1315776256292]
We introduce the concept of a situation model, which originates from psychology studies to represent the mental states of humans in problem-solving.
We show that the proposed model outperforms all previous neural solvers by a large margin while preserving much better interpretability.
arXiv Detail & Related papers (2020-12-27T21:03:40Z) - Geometry matters: Exploring language examples at the decision boundary [2.7249290070320034]
BERT, CNN and fastText are susceptible to word substitutions in high-difficulty examples.
On YelpReviewPolarity we observe a correlation coefficient of -0.4 between resilience to perturbations and the difficulty score.
Our approach is simple, architecture agnostic and can be used to study the fragilities of text classification models.
arXiv Detail & Related papers (2020-10-14T16:26:13Z) - How Far are We from Effective Context Modeling? An Exploratory Study on
Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)