ProofBuddy: A Proof Assistant for Learning and Monitoring
- URL: http://arxiv.org/abs/2308.06970v1
- Date: Mon, 14 Aug 2023 07:08:55 GMT
- Title: ProofBuddy: A Proof Assistant for Learning and Monitoring
- Authors: Nadine Karsten (Technische Universität Berlin), Frederik Krogsdal Jacobsen (Technical University of Denmark), Kim Jana Eiken (Technische Universität Berlin), Uwe Nestmann (Technische Universität Berlin), Jørgen Villadsen (Technical University of Denmark)
- Abstract summary: Proof competence, i.e. the ability to write and check (mathematical) proofs, is an important skill in Computer Science.
The main issues are the correct use of formal language and the ascertainment of whether proofs, especially the students' own, are complete and correct.
Many authors have suggested using proof assistants to assist in teaching proof competence, but the efficacy of the approach is unclear.
We have performed a preliminary usability study of ProofBuddy at the Technical University of Denmark.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proof competence, i.e. the ability to write and check (mathematical) proofs,
is an important skill in Computer Science, but for many students it represents
a difficult challenge. The main issues are the correct use of formal language
and the ascertainment of whether proofs, especially the students' own, are
complete and correct. Many authors have suggested using proof assistants to
assist in teaching proof competence, but the efficacy of the approach is
unclear. To improve the state of affairs, we introduce ProofBuddy: a web-based
tool using the Isabelle proof assistant which enables researchers to conduct
studies of the efficacy of approaches to using proof assistants in education by
collecting fine-grained data about the way students interact with proof
assistants. We have performed a preliminary usability study of ProofBuddy at
the Technical University of Denmark.
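The abstract emphasises that ProofBuddy collects fine-grained data about the way students interact with the proof assistant, but does not spell out a data format. As a purely hypothetical sketch (the event names and fields below are assumptions, not ProofBuddy's actual schema), such data can be pictured as a stream of timestamped interaction events:

```python
# Hypothetical sketch of fine-grained interaction logging for a
# ProofBuddy-style tool; event names and fields are assumptions,
# not the actual ProofBuddy schema.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any
import json


@dataclass
class InteractionEvent:
    """One timestamped student action, e.g. an edit or a proof check."""
    student_id: str
    exercise_id: str
    kind: str                                   # e.g. "edit", "check", "error"
    payload: dict[str, Any] = field(default_factory=dict)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class InteractionLog:
    """Append-only event log that researchers could later export for analysis."""

    def __init__(self) -> None:
        self.events: list[InteractionEvent] = []

    def record(self, event: InteractionEvent) -> None:
        self.events.append(event)

    def to_json(self) -> str:
        return json.dumps([asdict(e) for e in self.events], indent=2)


if __name__ == "__main__":
    log = InteractionLog()
    log.record(InteractionEvent("s42", "induction-1", "edit",
                                {"line": 3, "text": "then show ?case by simp"}))
    log.record(InteractionEvent("s42", "induction-1", "check",
                                {"result": "failed", "message": "Failed to finish proof"}))
    print(log.to_json())
```

Serialising such a log (here as JSON) would let researchers replay, step by step, how a student edited and checked a proof attempt, which is the kind of fine-grained analysis the abstract describes.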
Related papers
- Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof; a toy illustration follows this entry.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
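Lean-STaR's prompts and training data are not reproduced in this listing; the Lean snippet below is only a toy, hypothetical illustration of the general idea of writing an informal "thought" before each step of a formal proof.

```lean
-- Toy illustration only (not Lean-STaR's data or prompt format):
-- an informal "thought" written as a comment before each tactic step.
theorem conj_swap (p q : Prop) (h : p ∧ q) : q ∧ p := by
  -- Thought: the goal is a conjunction, so build it from its two halves.
  constructor
  -- Thought: the first half, q, is the right component of hypothesis h.
  · exact h.right
  -- Thought: the second half, p, is the left component of hypothesis h.
  · exact h.left
```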
- Autograding Mathematical Induction Proofs with Natural Language Processing [0.12289361708127876]
We present a set of training methods and models capable of autograding freeform mathematical proofs.
The models are trained using proof data collected from four different proof by induction problems.
We recruit human graders to grade the same proofs as the training data, and find that the best grading model is also more accurate than most human graders.
arXiv Detail & Related papers (2024-06-11T15:30:26Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- On Exams with the Isabelle Proof Assistant [0.0]
We present an approach for testing student learning outcomes in a course on automated reasoning using the Isabelle proof assistant.
The approach allows us to test both general understanding of formal proofs in various logical proof systems and understanding of proofs in the higher-order logic of Isabelle/HOL.
arXiv Detail & Related papers (2023-03-10T11:37:09Z)
- Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs [30.57062828812679]
We introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches.
We show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs.
arXiv Detail & Related papers (2022-10-21T22:37:22Z)
- Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present NLProofS (Natural Language Proof Search), a novel stepwise method.
NLProofS learns to generate relevant proof steps conditioned on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- Natural Language Specifications in Proof Assistants [0.0]
This paper argues that it is possible to build support for natural language specifications within existing proof assistants.
arXiv Detail & Related papers (2022-05-16T17:05:45Z)
- Formal Mathematics Statement Curriculum Learning [64.45821687940946]
We show that at the same compute budget, expert iteration, by which we mean proof search interleaved with learning, dramatically outperforms proof search alone.
We also observe that when applied to a collection of formal statements of sufficiently varied difficulty, expert iteration is capable of finding and solving a curriculum of increasingly difficult problems.
arXiv Detail & Related papers (2022-02-03T00:17:00Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Proof Blocks: Autogradable Scaffolding Activities for Learning to Write Proofs [0.4588028371034407]
Proof Blocks is a software tool which enables students to write proofs by dragging and dropping prewritten proof lines into the correct order.
These proofs can be graded completely automatically, enabling students to receive rapid feedback on how they are doing with their proofs.
When constructing a problem, the instructor specifies the dependency graph of the lines of the proof, so that any correct arrangement of the lines can receive full credit; a sketch of such an ordering check follows this entry.
arXiv Detail & Related papers (2021-06-07T17:03:58Z)
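Proof Blocks (above) grades a student's arrangement of prewritten lines against an instructor-specified dependency graph. As a rough, hypothetical illustration (not the Proof Blocks implementation), the check amounts to verifying that the arrangement is a valid topological order of that graph:

```python
# Illustrative check (hypothetical; not the Proof Blocks implementation):
# an arrangement of proof lines is accepted when every line appears after
# all of the lines it depends on, i.e. it is a topological order of the
# instructor's dependency graph.

def is_valid_arrangement(order: list[str], deps: dict[str, set[str]]) -> bool:
    """Return True if `order` places each line after all of its prerequisites."""
    position = {line: i for i, line in enumerate(order)}
    if set(position) != set(deps):          # must use exactly the given lines
        return False
    return all(
        position[prereq] < position[line]
        for line, prereqs in deps.items()
        for prereq in prereqs
    )


if __name__ == "__main__":
    # Toy dependency graph: line "C" needs "A" and "B"; line "B" needs "A".
    deps = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
    print(is_valid_arrangement(["A", "B", "C"], deps))  # True
    print(is_valid_arrangement(["B", "A", "C"], deps))  # False: "B" precedes "A"
```

Because only the dependency graph is fixed, any topological order of it earns full credit, which is the flexibility described in the entry above.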
- Conversations with Documents. An Exploration of Document-Centered Assistance [55.60379539074692]
Document-centered assistance, for example, to help an individual quickly review a document, has seen less significant progress.
We present a survey to understand the space of document-centered assistance and the capabilities people expect in this scenario.
We present a set of initial machine learned models that show that (a) we can accurately detect document-centered questions, and (b) we can build reasonably accurate models for answering such questions.
arXiv Detail & Related papers (2020-01-27T17:10:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.