Towards Automatic Transformations of Coq Proof Scripts
- URL: http://arxiv.org/abs/2401.11897v1
- Date: Mon, 22 Jan 2024 12:48:34 GMT
- Title: Towards Automatic Transformations of Coq Proof Scripts
- Authors: Nicolas Magaud (Lab. ICube, CNRS, Université de Strasbourg, France)
- Abstract summary: We present a framework to carry out a posteriori script transformations.
These transformations are meant to be applied as an automated post-processing step, once the proof has been completed.
We apply our tool to various Coq proof scripts, including some from the GeoCoq library.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proof assistants like Coq are increasingly popular to help mathematicians
carry out proofs of the results they conjecture. However, formal proofs remain
highly technical and are especially difficult to reuse. In this paper, we
present a framework to carry out a posteriori script transformations. These
transformations are meant to be applied as an automated post-processing step,
once the proof has been completed. As an example, we present a transformation
which takes an arbitrarily large proof script and produces an equivalent
single-line proof script, which can be executed by Coq in a single step.
Other applications, such as fully expanding a proof script (for debugging
purposes), removing all named hypotheses, etc. could be developed within this
framework. We apply our tool to various Coq proof scripts, including some from
the GeoCoq library.
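Illustration (not taken from the paper): the abstract does not show what the generated single-line script looks like, so the sketch below is a hand-written approximation. The lemma names, the tactics, and the use of the standard ";" and "[ ... | ... ]" combinators are our assumptions; the tool's actual output may differ.

    (* Original, step-by-step proof script: commands are sent to Coq one at a time. *)
    Lemma my_add_0_r : forall n : nat, n + 0 = n.
    Proof.
      intro n.
      induction n.
      - reflexivity.
      - simpl. rewrite IHn. reflexivity.
    Qed.

    (* Hand-written single-line equivalent (illustrative only): the whole proof
       is a single tactic expression, so Coq can check it in one step. *)
    Lemma my_add_0_r_oneline : forall n : nat, n + 0 = n.
    Proof.
      intro n; induction n; [ reflexivity | simpl; rewrite IHn; reflexivity ].
    Qed.

Note that chaining with ";" applies the next tactic to every generated subgoal, which is why the one-liner needs the goal selector "[ ... | ... ]" to handle the two induction cases differently.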
Related papers
- Cobblestone: Iterative Automation for Formal Verification [11.445689801392657]
Formal verification using proof assistants, such as Coq, is an effective way of improving software quality, but it is expensive.
Recent research has used machine learning to automatically synthesize proofs, reducing verification effort, but these tools are able to prove only a fraction of the desired software properties.
We introduce Cobblestone, a new proof-synthesis approach that improves on the state of the art by taking advantage of partial progress in proof synthesis attempts.
arXiv Detail & Related papers (2024-10-25T19:25:00Z)
- MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs [80.96119560172224]
Large language models (LLMs) can solve arithmetic word problems with high accuracy, but little is known about how well they generalize to problems that are more complex than the ones on which they have been trained.
We present a framework for evaluating LLMs on problems with arbitrarily complex arithmetic proofs, called MathGAP.
arXiv Detail & Related papers (2024-10-17T12:48:14Z)
- CoqPyt: Proof Navigation in Python in the Era of LLMs [5.029445580644576]
This paper presents CoqPyt, a Python tool for interacting with the Coq proof assistant.
CoqPyt improves on other Coq-related tools by providing novel features, such as the extraction of rich premise data.
arXiv Detail & Related papers (2024-05-07T12:50:28Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- Baldur: Whole-Proof Generation and Repair with Large Language Models [8.100054850290507]
We use large language models, trained on natural language text and code and fine-tuned on proofs, to generate whole proofs for theorems at once.
We combine this proof generation model with a fine-tuned repair model to repair generated proofs, further increasing proving power.
We implement our method in a prototype, Baldur, and evaluate it on a benchmark of 6,336 Isabelle/HOL theorems and their proofs.
arXiv Detail & Related papers (2023-03-08T22:00:15Z)
- Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs [30.57062828812679]
We introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches.
We show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs.
arXiv Detail & Related papers (2022-10-21T22:37:22Z)
- Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present NLProofS (Natural Language Proof Search), a novel stepwise method.
NLProofS learns to generate relevant proof steps conditioned on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- multiPRover: Generating Multiple Proofs for Improved Interpretability in Rule Reasoning [73.09791959325204]
We focus on a type of linguistic formal reasoning where the goal is to reason over explicit knowledge in the form of natural language facts and rules.
A recent work, named PRover, performs such reasoning by answering a question and also generating a proof graph that explains the answer.
In our work, we address a new and challenging problem of generating multiple proof graphs for reasoning over natural language rule-bases.
arXiv Detail & Related papers (2021-06-02T17:58:35Z)
- Zero-shot Fact Verification by Claim Generation [85.27523983027471]
We develop QACG, a framework for training a robust fact verification model.
We use automatically generated claims that can be supported, refuted, or unverifiable given evidence from Wikipedia.
In a zero-shot scenario, QACG improves a RoBERTa model's F1 from 50% to 77%, equivalent in performance to 2K+ manually-curated examples.
arXiv Detail & Related papers (2021-05-31T03:13:52Z)
- The Coq Proof Script Visualiser (coq-psv) [0.0]
This tool supports both education and review processes as all proof steps can be visualised.
In contrast to the usual approach of visualising proofs as hypertext or markdown documents, the generated files can be easily printed.
arXiv Detail & Related papers (2021-01-15T08:52:31Z)
- PRover: Proof Generation for Interpretable Reasoning over Rules [81.40404921232192]
We propose a transformer-based model that answers binary questions over rule-bases and generates the corresponding proofs.
Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm.
We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation.
arXiv Detail & Related papers (2020-10-06T15:47:53Z)