Automatic Short Math Answer Grading via In-context Meta-learning
- URL: http://arxiv.org/abs/2205.15219v1
- Date: Mon, 30 May 2022 16:26:02 GMT
- Title: Automatic Short Math Answer Grading via In-context Meta-learning
- Authors: Mengxue Zhang, Sami Baral, Neil Heffernan, Andrew Lan
- Abstract summary: We study the problem of automatic short answer grading for students' responses to math questions.
First, we use MathBERT, a variant of the popular language model BERT adapted to mathematical content, as our base model.
Second, we use an in-context learning approach that provides scoring examples as input to the language model.
- Score: 2.0263791972068628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic short answer grading is an important research direction in the
exploration of how to use artificial intelligence (AI)-based tools to improve
education. Current state-of-the-art approaches use neural language models to
create vectorized representations of students' responses, followed by
classifiers to predict the score. However, these approaches have several key
limitations, including i) they use pre-trained language models that are not
well-adapted to educational subject domains and/or student-generated text and
ii) they almost always train one model per question, ignoring the linkage
across questions and resulting in a significant model storage problem due to the
size of advanced language models. In this paper, we study the problem of
automatic short answer grading for students' responses to math questions and
propose a novel framework for this task. First, we use MathBERT, a variant of
the popular language model BERT adapted to mathematical content, as our base
model and fine-tune it for the downstream task of student response grading.
Second, we use an in-context learning approach that provides scoring examples
as input to the language model to provide additional context information and
promote generalization to previously unseen questions. We evaluate our
framework on a real-world dataset of student responses to open-ended math
questions and show that our framework (often significantly) outperforms
existing approaches, especially for new questions that are not seen during
training.
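To make the in-context setup concrete, here is a minimal sketch, not the authors' released code, of packing scoring examples into the input of a fine-tuned encoder classifier. The base checkpoint, the 0-4 score range, and the separator format are illustrative assumptions; in the paper the base model is MathBERT.

```python
# Minimal sketch of in-context scoring: scored example responses are packed
# into the input alongside the target response, so one fine-tuned model can
# grade responses to many questions. Checkpoint, score range, and separator
# format are assumptions, not the authors' implementation.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # the paper fine-tunes MathBERT instead
NUM_SCORES = 5                    # assuming integer scores 0-4

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Classification head is randomly initialized here; it is trained during fine-tuning.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_SCORES)

def build_input(question: str, examples: list, response: str) -> str:
    """Concatenate the question, a few (response, score) examples, and the target response."""
    context = " ".join(f"example: {r} score: {s}" for r, s in examples)
    return f"question: {question} {context} response: {response}"

def predict_score(question, examples, response):
    text = build_input(question, examples, response)
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

# Usage: in-context examples come from the same question when available,
# or from other questions when grading a previously unseen one.
score = predict_score(
    "Explain why 1/2 + 1/3 = 5/6.",
    [("Rewrite both fractions over a common denominator of 6, then add.", 4),
     ("You just add the tops and the bottoms.", 1)],
    "3/6 + 2/6 = 5/6 because both fractions share the denominator 6.",
)
print(score)
```

Because the scoring examples travel inside each input, a single model can grade responses across questions, including questions unseen during training, instead of one model being stored per question.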
Related papers
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
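As a rough, hypothetical sketch of the in-context classification setup this paper studies (the prompt template, labels, and `complete` stub are illustrative, not taken from the paper):

```python
# Few-shot (in-context) text classification via prompting.
# `complete` is a placeholder for whatever LLM completion call is available.
def build_prompt(examples, labels, text):
    lines = [f"Classify each text as one of: {', '.join(labels)}."]
    for ex_text, ex_label in examples:
        lines.append(f"Text: {ex_text}\nLabel: {ex_label}")
    lines.append(f"Text: {text}\nLabel:")
    return "\n\n".join(lines)

def classify(text, examples, labels, complete):
    prompt = build_prompt(examples, labels, text)
    return complete(prompt).strip().split()[0]

labels = ["positive", "negative"]
examples = [("Loved every minute of it.", "positive"),
            ("A total waste of time.", "negative")]
# Dummy completion function so the sketch runs end to end.
fake_complete = lambda prompt: " positive"
print(classify("Surprisingly good!", examples, labels, fake_complete))
```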
- Autonomous Data Selection with Language Models for Mathematical Texts [13.789739307267952]
We introduce a novel strategy that leverages base language models for autonomous data selection.
Our approach uses meta-prompted language models as zero-shot verifiers to evaluate and select high-quality mathematical content autonomously.
Our method achieves a twofold increase in pretraining token efficiency compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-12T13:09:21Z)
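A hedged illustration of the meta-prompting idea (the prompt wording, threshold, and `score_with_lm` placeholder are assumptions, not the paper's actual verifier):

```python
# LLM-based data selection sketch: a meta-prompt asks a base model whether a
# passage is high-quality mathematical content; only endorsed passages are kept.
META_PROMPT = (
    "You are a strict reviewer of pretraining data. Does the following "
    "passage contain clear, correct mathematical reasoning? Answer YES or NO.\n\n"
    "Passage:\n{passage}\nAnswer:"
)

def select_documents(passages, score_with_lm, threshold=0.5):
    """Keep passages whose estimated P(YES) exceeds the threshold."""
    kept = []
    for passage in passages:
        p_yes = score_with_lm(META_PROMPT.format(passage=passage))
        if p_yes >= threshold:
            kept.append(passage)
    return kept

# Dummy scorer so the sketch runs: favors passages containing an equation.
dummy_scorer = lambda prompt: 0.9 if "=" in prompt else 0.2
docs = ["Let x satisfy 2x + 1 = 7, so x = 3.", "Buy cheap watches online!!!"]
print(select_documents(docs, dummy_scorer))
```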
- Answer Candidate Type Selection: Text-to-Text Language Model for Closed Book Question Answering Meets Knowledge Graphs [62.20354845651949]
We present a novel approach that works on top of a pre-trained Text-to-Text QA system to address this issue.
Our simple yet effective method filters and re-ranks generated candidates based on their types, derived from the Wikidata "instance_of" property.
arXiv Detail & Related papers (2023-10-10T20:49:43Z)
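A minimal, hypothetical sketch of type-based candidate filtering; the entity-to-type mapping is mocked rather than queried from Wikidata, and the paper's re-ranking step is omitted:

```python
# Keep generated answer candidates only if their known "instance of" (P31)
# types match the type expected for the question. The mapping below is a
# toy stand-in for types derived from Wikidata.
ENTITY_TYPES = {  # hypothetical pre-fetched P31 values
    "Paris": {"city"},
    "Seine": {"river"},
    "Emmanuel Macron": {"human"},
}

def filter_candidates(candidates, expected_type):
    """Drop candidates whose known types do not include the expected type."""
    keep = [c for c in candidates if expected_type in ENTITY_TYPES.get(c, set())]
    return keep or candidates  # fall back to the original list if everything is dropped

print(filter_candidates(["Seine", "Paris"], expected_type="city"))  # -> ['Paris']
```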
- Automating question generation from educational text [1.9325905076281444]
The use of question-based activities (QBAs) is widespread in education, forming an integral part of the learning and assessment process.
We design and evaluate an automated question generation tool for formative and summative assessment in schools.
arXiv Detail & Related papers (2023-09-26T15:18:44Z)
- Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses [5.936682548344234]
This paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples.
We also design a response-aware mechanism for mining the implicit pattern information between contexts and responses, so that the generated replies are more diverse and closer to human replies.
arXiv Detail & Related papers (2023-09-06T08:11:39Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
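The bag-of-words claim can be probed with a simple word-shuffling test; the sketch below is in the spirit of the paper's analysis but is not its code, and `score_essay` stands in for any trained autoscoring model:

```python
# Overstability probe: if a scorer behaves like a bag-of-words model,
# shuffling word order should barely change its predicted score.
import random

def overstability_probe(essay, score_essay, trials=5, seed=0):
    """Return the original score and scores for word-shuffled copies."""
    rng = random.Random(seed)
    words = essay.split()
    original = score_essay(essay)
    shuffled_scores = []
    for _ in range(trials):
        perm = words[:]
        rng.shuffle(perm)
        shuffled_scores.append(score_essay(" ".join(perm)))
    return original, shuffled_scores

# Dummy scorer that only counts long "content" words -- a caricature of the
# order-insensitive behaviour the paper reports.
dummy = lambda text: sum(len(w) > 6 for w in text.split())
print(overstability_probe("The experiment demonstrates photosynthesis clearly", dummy))
```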
- Cooperative Learning of Zero-Shot Machine Reading Comprehension [9.868221447090855]
We propose a cooperative, self-play learning model for question generation and answering.
We can train question generation and answering models on any textual corpus without annotation.
Our model outperforms state-of-the-art pretrained language models on standard question answering benchmarks.
arXiv Detail & Related papers (2021-03-12T18:22:28Z)
- SMART: A Situation Model for Algebra Story Problems via Attributed Grammar [74.1315776256292]
We introduce the concept of a situation model, which originates from studies in psychology and represents the mental states of humans in problem-solving.
We show that the proposed model outperforms all previous neural solvers by a large margin while preserving much better interpretability.
arXiv Detail & Related papers (2020-12-27T21:03:40Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
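Schematically, retrieve-then-generate can be sketched as below; `embed` and `generate` are placeholders, and the actual RAG model learns the retriever jointly and marginalizes over passages rather than simply concatenating them:

```python
# Retrieval-augmented generation sketch: fetch the top-k passages for a query
# (non-parametric memory) and condition a generator (parametric memory) on them.
import numpy as np

def retrieve(query, passages, embed, k=2):
    q = embed(query)
    scores = [float(np.dot(q, embed(p))) for p in passages]
    top = np.argsort(scores)[::-1][:k]
    return [passages[i] for i in top]

def rag_answer(query, passages, embed, generate, k=2):
    context = " ".join(retrieve(query, passages, embed, k))
    return generate(f"context: {context} question: {query}")

# Toy stand-ins so the sketch runs end to end.
toy_embed = lambda text: np.array([text.count("sun"), text.count("moon"), len(text) % 7])
toy_generate = lambda prompt: prompt[:60] + "..."
docs = ["The sun is a star.", "The moon orbits the earth.", "Cats purr."]
print(rag_answer("What is the sun?", docs, toy_embed, toy_generate))
```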
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
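REALM's central quantity is the likelihood marginalized over retrieved documents, p(y|x) = sum_z p(z|x) p(y|x,z), which is what gives the retriever a training signal; a toy sketch (all functions below are placeholders, not the REALM implementation) is:

```python
# Toy sketch of marginalizing a prediction over documents returned by a
# latent retriever, as in retrieval-augmented pre-training.
import math

def marginal_log_likelihood(x, y, documents, retriever_logit, predictor_prob):
    """log p(y|x) with the retrieval variable z summed out."""
    logits = [retriever_logit(x, z) for z in documents]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]          # softmax over documents
    norm = sum(weights)
    p = sum((w / norm) * predictor_prob(y, x, z)
            for w, z in zip(weights, documents))
    return math.log(p)

# Toy stand-ins: retrieval score is word overlap; prediction is easier when
# the masked word appears in the retrieved document.
overlap = lambda x, z: float(len(set(x.split()) & set(z.split())))
predict = lambda y, x, z: 0.9 if y in z else 0.1
docs = ["the pyramids are in egypt", "paris is in france"]
print(marginal_log_likelihood("the [MASK] are in egypt", "pyramids", docs, overlap, predict))
```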