'John ate 5 apples' != 'John ate some apples': Self-Supervised
Paraphrase Quality Detection for Algebraic Word Problems
- URL: http://arxiv.org/abs/2206.08263v1
- Date: Thu, 16 Jun 2022 16:01:59 GMT
- Title: 'John ate 5 apples' != 'John ate some apples': Self-Supervised
Paraphrase Quality Detection for Algebraic Word Problems
- Authors: Rishabh Gupta, Venktesh V, Mukesh Mohania, Vikram Goyal
- Abstract summary: This paper introduces the novel task of scoring paraphrases for Algebraic Word Problems (AWP).
We propose ParaQD, a self-supervised paraphrase quality detection method using novel data augmentations.
Our method outperforms existing state-of-the-art self-supervised methods by up to 32% while also demonstrating impressive zero-shot performance.
- Score: 5.682665111938764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces the novel task of scoring paraphrases for Algebraic
Word Problems (AWP) and presents a self-supervised method for doing so. In the
current online pedagogical setting, paraphrasing these problems is helpful for
academicians to generate multiple syntactically diverse questions for
assessments. It also helps induce variation to ensure that the student has
understood the problem instead of just memorizing it or using unfair means to
solve it. The current state-of-the-art paraphrase generation models often
cannot effectively paraphrase word problems, losing a critical piece of
information (such as numbers or units) which renders the question unsolvable.
There is a need for paraphrase scoring methods in the context of AWP to enable
the training of good paraphrasers. Thus, we propose ParaQD, a self-supervised
paraphrase quality detection method using novel data augmentations that can
learn latent representations to separate a high-quality paraphrase of an
algebraic question from a poor one by a wide margin. Through extensive
experimentation, we demonstrate that our method outperforms existing
state-of-the-art self-supervised methods by up to 32% while also demonstrating
impressive zero-shot performance.
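The abstract notes that a paraphrase which drops a number or unit renders an algebraic word problem unsolvable, and that ParaQD learns to separate such poor paraphrases from good ones using data augmentations. As a purely illustrative sketch (the function names, the regex, and the "some"-substitution rule are assumptions, not the paper's actual augmentation operators), one family of negative augmentations can be mimicked like this:

```python
import re

# Toy negative augmentation: replace every numeral with a vague quantifier,
# destroying the information needed to solve the problem (a low-quality
# paraphrase by construction). This is an illustration, not ParaQD itself.

def drop_numbers(question: str) -> str:
    """Replace each numeral with 'some', losing the key quantities."""
    return re.sub(r"\b\d+(\.\d+)?\b", "some", question)

def has_numbers(question: str) -> bool:
    """A crude solvability proxy: an algebraic word problem needs its numbers."""
    return bool(re.search(r"\b\d+(\.\d+)?\b", question))

original = "John ate 5 apples and then bought 3 more. How many does he have?"
negative = drop_numbers(original)

print(negative)
print(has_numbers(original), has_numbers(negative))  # -> True False
```

A quality-detection model can then be trained contrastively, pushing representations of such synthetic negatives away from the original question while keeping valid rewordings close.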
Related papers
- Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models.
arXiv Detail & Related papers (2023-10-03T00:57:26Z)
- Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement [50.62461749446111]
Self-Polish (SP) is a novel method that facilitates the model's reasoning by guiding it to progressively refine the given problems to be more comprehensible and solvable.
SP is orthogonal to all other answer/reasoning-side prompting methods such as CoT, allowing for seamless integration with state-of-the-art techniques for further improvement.
arXiv Detail & Related papers (2023-05-23T19:58:30Z)
- Coherence and Diversity through Noise: Self-Supervised Paraphrase Generation via Structure-Aware Denoising [5.682665111938764]
We propose SCANING, an unsupervised framework for paraphrasing via controlled noise injection.
We focus on the novel task of paraphrasing algebraic word problems having practical applications in online pedagogy.
We demonstrate SCANING considerably improves performance in terms of both semantic preservation and producing diverse paraphrases.
arXiv Detail & Related papers (2023-02-06T13:50:57Z)
- Automatic Generation of Socratic Subquestions for Teaching Math Word Problems [16.97827669744673]
We explore the ability of large language models (LMs) in generating sequential questions for guiding math word problem-solving.
On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions.
Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance.
arXiv Detail & Related papers (2022-11-23T10:40:22Z)
- Learning to Selectively Learn for Weakly-supervised Paraphrase Generation [81.65399115750054]
We propose a novel approach to generate high-quality paraphrases with weak supervision data.
Specifically, we tackle the weakly-supervised paraphrase generation problem by obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion.
We demonstrate that our approach achieves significant improvements over existing unsupervised approaches and is even comparable in performance with supervised state-of-the-art methods.
arXiv Detail & Related papers (2021-09-25T23:31:13Z)
- Phrase Retrieval Learns Passage Retrieval, Too [77.57208968326422]
We study whether phrase retrieval can serve as the basis for coarse-level retrieval including passages and documents.
We show that a dense phrase-retrieval system, without any retraining, already achieves better passage retrieval accuracy.
We also show that phrase filtering and vector quantization can reduce the size of our index by 4-10x.
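The 4-10x figure above is consistent with simple vector-compression arithmetic. As a generic, hedged illustration (this is not the paper's filtering or quantization scheme), merely storing float32 phrase vectors as int8 codes already yields a 4x reduction:

```python
from array import array

# Toy scalar quantization: map floats in [-1, 1] to int8 codes, shrinking
# storage from 4 bytes to 1 byte per dimension. Coarser codebooks (e.g.
# product quantization) push the reduction further toward the 4-10x range.

def quantize_int8(vec, scale=127.0):
    """Clip to [-1, 1] and map each float to an int8 code."""
    return array("b", (int(round(max(-1.0, min(1.0, x)) * scale)) for x in vec))

def dequantize(codes, scale=127.0):
    """Approximate reconstruction of the original floats."""
    return [c / scale for c in codes]

vec = [0.5, -0.25, 1.0, 0.0]
codes = quantize_int8(vec)

float32_bytes = 4 * len(vec)               # 16 bytes as float32
int8_bytes = codes.itemsize * len(codes)   # 4 bytes as int8
print(float32_bytes // int8_bytes)         # -> 4 (a 4x smaller index)
```

The trade-off is a small reconstruction error per dimension, which retrieval systems tolerate because ranking depends on relative, not exact, similarities.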
arXiv Detail & Related papers (2021-09-16T17:42:45Z)
- Deep learning for sentence clustering in essay grading support [1.7259867886009057]
We introduce two datasets of undergraduate student essays in Finnish, manually annotated for salient arguments on the sentence level.
We evaluate several deep-learning embedding methods for their suitability to sentence clustering in support of essay grading.
arXiv Detail & Related papers (2021-04-23T12:32:51Z)
- Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation [75.1682163844354]
We address the issue of missing modalities that arises in the Visual Question Answer-Difference prediction task.
We introduce a model, the "Big" Teacher, that takes the image/question/answer triplet as its input and outperforms the baseline.
arXiv Detail & Related papers (2021-04-13T06:41:11Z)
- My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism [71.34160809068996]
Recent work shows that automated scoring systems are vulnerable even to common-sense adversarial samples.
We utilize recent advances in interpretability to find the extent to which features such as coherence, content and relevance are important for automated scoring mechanisms.
We also find that since the models are not semantically grounded with world knowledge and common sense, adding false facts such as "the world is flat" actually increases the score instead of decreasing it.
arXiv Detail & Related papers (2020-12-27T06:19:20Z)
- Automated Utterance Generation [5.220940151628735]
Using relevant utterances as features in question answering has been shown to improve both the precision and recall of a conversational assistant's answer retrieval.
We propose an utterance generation system which 1) uses extractive summarization to extract important sentences from the description, 2) uses multiple paraphrasing techniques to generate a diverse set of paraphrases of the title and summary sentences, and 3) selects good candidate paraphrases with the help of a novel candidate selection algorithm.
arXiv Detail & Related papers (2020-04-07T15:35:54Z)
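The three-stage pipeline in the last entry above can be mimicked end to end with toy stand-ins. Assuming a length-based extractor, a hypothetical synonym table, and Jaccard-overlap selection (none of which are the paper's actual components), a minimal sketch looks like:

```python
import re

# Illustrative-only sketch of the three stages: extract, paraphrase, select.
# Every heuristic here is a toy stand-in for the paper's actual algorithms.

SYNONYMS = {"reset": "recover", "password": "passphrase"}  # hypothetical table

def summarize(description):
    """Stage 1: crude extractive summarization (pick the longest sentence)."""
    sentences = [s.strip() for s in re.split(r"(?<=\.)\s+", description) if s.strip()]
    return max(sentences, key=len)

def paraphrases(sentence):
    """Stage 2: generate variants by simple synonym substitution."""
    out = []
    for old, new in SYNONYMS.items():
        variant = re.sub(rf"\b{old}\b", new, sentence)
        if variant != sentence:
            out.append(variant)
    return out

def select_best(candidates, original):
    """Stage 3: prefer the candidate sharing the fewest surface words."""
    def jaccard(a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb)
    return min(candidates, key=lambda c: jaccard(c, original))

desc = "Click the link. Enter your email to reset your password securely."
summary = summarize(desc)
print(select_best(paraphrases(summary), summary))  # one of the synonym variants
```

In the real system each stage would be a learned component; the point of the sketch is only the data flow: description -> key sentences -> diverse candidates -> filtered paraphrases.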
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.