Related papers: Natural Language Proof Checking in Introduction to Proof Classes -- First Experiences with Diproche

Natural Language Proof Checking in Introduction to Proof Classes -- First Experiences with Diproche

URL: http://arxiv.org/abs/2202.08131v1
Date: Tue, 8 Feb 2022 00:07:57 GMT
Title: Natural Language Proof Checking in Introduction to Proof Classes -- First Experiences with Diproche
Authors: Merlin Carl (Europa-Universit\"at Flensburg), Hinrich Lorenzen (Europa-Universit\"at Flensburg), Michael Schmitz (Europa-Universit\"at Flensburg)
Abstract summary: We present and analyze the employment of the Diproche system in a one-semester mathematics beginners lecture. The system is used to check the students' solution attempts to proving exercises in Boolean set theory and elementary number theory.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present and analyze the employment of the Diproche system, a natural language proof checker, within a one-semester mathematics beginners lecture with 228 participants. The system is used to check the students' solution attempts to proving exercises in Boolean set theory and elementary number theory and to give them immediate feedback. The benefits of the employment of the system are assessed via a questionnaire at the end of the semester and via analyzing the solution attempts of a subgroup of the students. Based on our results we develop approaches for future improvements.

Related papers

NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors [0.12499537119440242]
This paper presents our system for Track 1, Mistake Identification in the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors.<n>The task involves evaluating whether a tutor's response correctly identifies a mistake in a student's reasoning.<n>Our system retrieves semantically similar examples, constructs structured prompts, and uses schema-guided parsing produceable predictions.
arXiv Detail & Related papers (2025-06-12T12:11:56Z)
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems [0.0]
This study focuses on improving the performance of lightweight Large Language Models (LLMs) in mathematical reasoning tasks. We introduce a novel method for measuring mathematical logic similarity and design an automatic screening mechanism. By employing carefully crafted positive and negative example prompts, we guide the model towards adopting sound reasoning logic.
arXiv Detail & Related papers (2024-08-29T08:26:42Z)
Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof. Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback [71.95402654982095]
We propose Math-Minos, a natural language feedback-enhanced verifier. Our experiments reveal that a small set of natural language feedback can significantly boost the performance of the verifier.
arXiv Detail & Related papers (2024-06-20T06:42:27Z)
The potential of large language models for improving probability learning: A study on ChatGPT3.5 and first-year computer engineering students [0.565395466029518]
ChatGPT is a large-scale language model that can solve probability problems. ChatGPT is used in solving probability problems typically presented in computer engineering exams. The model's ability to deliver high-quality explanations and illustrate solutions in any programming language suggests that large language models have the potential to serve as learning assistants.
arXiv Detail & Related papers (2023-10-09T12:54:58Z)
On Exams with the Isabelle Proof Assistant [0.0]
We present an approach for testing student learning outcomes in a course on automated reasoning using the Isabelle proof assistant. The approach allows us to test both general understanding of formal proofs in various logical proof systems and understanding of proofs in the higher-order logic of Isabelle/HOL.
arXiv Detail & Related papers (2023-03-10T11:37:09Z)
Lila: A Unified Benchmark for Mathematical Reasoning [59.97570380432861]
LILA is a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions. We construct our benchmark by extending 20 datasets benchmark by collecting task instructions and solutions in the form of Python programs. We introduce BHASKARA, a general-purpose mathematical reasoning model trained on LILA.
arXiv Detail & Related papers (2022-10-31T17:41:26Z)
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding [74.12405417718054]
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model(PLM) Unlike other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement. We design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.
arXiv Detail & Related papers (2022-06-13T17:03:52Z)
ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification. A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors. Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
AR-LSAT: Investigating Analytical Reasoning of Text [57.1542673852013]
We study the challenge of analytical reasoning of text and introduce a new dataset consisting of questions from the Law School Admission Test from 1991 to 2016. We analyze what knowledge understanding and reasoning abilities are required to do well on this task.
arXiv Detail & Related papers (2021-04-14T02:53:32Z)
Effective Feedback for Introductory CS Theory: A JFLAP Extension and Student Persistence [4.40401067183266]
The main goal of our research is to help students learn abstract computational models. The most common pedagogical tool for interacting with these models is the Java Formal Languages and Automata Package (JFLAP) We developed a JFLAP server extension, which accepts homework submissions from students, evaluates the submission as correct or incorrect, and provides a witness string when the submission is incorrect.
arXiv Detail & Related papers (2020-12-02T21:39:01Z)
Using Automated Theorem Provers for Mistake Diagnosis in the Didactics of Mathematics [0.0]
The Diproche system is specifically adapted to the context of exercises for beginner's students similar to the Naproche system by Koepke, Schr"oder, Cramer and others. We briefly describe the concept of such an Anti-ATP' and explain the basic techniques used in its implementation.
arXiv Detail & Related papers (2020-02-12T16:36:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.