Tactic Learning and Proving for the Coq Proof Assistant
- URL: http://arxiv.org/abs/2003.09140v1
- Date: Fri, 20 Mar 2020 08:22:30 GMT
- Title: Tactic Learning and Proving for the Coq Proof Assistant
- Authors: Lasse Blaauwbroek, Josef Urban, and Herman Geuvers
- Abstract summary: Our system predicts appropriate tactics and finds proofs in the form of tactic scripts.
The performance of the system is evaluated on the Coq Standard Library.
When combined with the CoqHammer system, the two systems together prove 56.7% of the library's lemmas.
- Score: 0.5735035463793007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a system that utilizes machine learning for tactic proof search in
the Coq Proof Assistant. In a similar vein as the TacticToe project for HOL4,
our system predicts appropriate tactics and finds proofs in the form of tactic
scripts. To do this, it learns from previous tactic scripts and how they are
applied to proof states. The performance of the system is evaluated on the Coq
Standard Library. Currently, our predictor can identify the correct tactic to
be applied to a proof state 23.4% of the time. Our proof searcher can fully
automatically prove 39.3% of the lemmas. When combined with the CoqHammer
system, the two systems together prove 56.7% of the library's lemmas.
Related papers
- Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis [50.020850767257095]
We propose ProofAug, a novel theorem proving method that enjoys superior sample efficiency.
Our method is validated on the miniF2F-test benchmark using the open-source deepseek-math-7b-base model and the Isabelle proof assistant.
arXiv Detail & Related papers (2025-01-30T12:37:06Z) - Proof Recommendation System for the HOL4 Theorem Prover [0.0]
We introduce a proof recommender system for the HOL4 theorem prover.
Our tool is built upon a transformer-based model designed specifically to provide proof assistance in HOL4.
arXiv Detail & Related papers (2024-12-31T17:13:59Z) - Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification [11.115201117143929]
We present Rango, a fully automated synthesis proof tool for Coq.
Rango identifies relevant premises and also similar proofs from the current project and uses them during synthesis.
Our evaluation shows that Rango adding relevant to its context leads to a 47% increase in the number of theorems proven.
arXiv Detail & Related papers (2024-12-18T17:08:42Z) - QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning [8.116854714039452]
QEDCartographer is an automated proof-synthesis tool that combines supervised and reinforcement learning.
We evaluate QEDCartographer using the CoqGym benchmark of 68.5K theorems from 124 open-source Coq projects.
Our work demonstrates that reinforcement learning is a fruitful research direction for improving proof-synthesis tools' search mechanisms.
arXiv Detail & Related papers (2024-08-17T16:06:14Z) - Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z) - CoqPyt: Proof Navigation in Python in the Era of LLMs [5.029445580644576]
This paper presents CoqPyt, a Python tool for interacting with the Coq proof assistant.
CoqPyt improves on other Coq-related tools by providing novel features, such as the extraction of rich premise data.
arXiv Detail & Related papers (2024-05-07T12:50:28Z) - MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z) - Progressive-Hint Prompting Improves Reasoning in Large Language Models [63.98629132836499]
This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP)
It enables automatic multiple interactions between users and Large Language Models (LLMs) by using previously generated answers as hints to progressively guide toward the correct answers.
We conducted extensive and comprehensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient.
arXiv Detail & Related papers (2023-04-19T16:29:48Z) - Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present a novel stepwise method NLProofS (Natural Language Proof Search)
NLProofS learns to generate relevant steps conditioning on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z) - Online Machine Learning Techniques for Coq: A Comparison [2.9412498294532856]
We present a comparison of several online machine learning techniques for tactical learning and proving in the Coq proof assistant.
This work builds on Tactician, a plugin for Coq that learns from proofs written by the user to synthesize new proofs.
arXiv Detail & Related papers (2021-04-12T05:12:35Z) - PRover: Proof Generation for Interpretable Reasoning over Rules [81.40404921232192]
We propose a transformer-based model that answers binary questions over rule-bases and generates the corresponding proofs.
Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm.
We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation.
arXiv Detail & Related papers (2020-10-06T15:47:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.