Tactic Learning and Proving for the Coq Proof Assistant
- URL: http://arxiv.org/abs/2003.09140v1
- Date: Fri, 20 Mar 2020 08:22:30 GMT
- Title: Tactic Learning and Proving for the Coq Proof Assistant
- Authors: Lasse Blaauwbroek, Josef Urban, and Herman Geuvers
- Abstract summary: Our system predicts appropriate tactics and finds proofs in the form of tactic scripts.
The performance of the system is evaluated on the Coq Standard Library.
When combined with the CoqHammer system, the two systems together prove 56.7% of the library's lemmas.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a system that utilizes machine learning for tactic proof search in
the Coq Proof Assistant. In a similar vein as the TacticToe project for HOL4,
our system predicts appropriate tactics and finds proofs in the form of tactic
scripts. To do this, it learns from previous tactic scripts and how they are
applied to proof states. The performance of the system is evaluated on the Coq
Standard Library. Currently, our predictor can identify the correct tactic to
be applied to a proof state 23.4% of the time. Our proof searcher can fully
automatically prove 39.3% of the lemmas. When combined with the CoqHammer
system, the two systems together prove 56.7% of the library's lemmas.
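The abstract describes predicting a tactic for a proof state by learning from how tactics were applied to proof states in earlier scripts. As a rough illustration only (not the paper's actual predictor), the core idea can be sketched as nearest-neighbor lookup over a database of (proof state, tactic) pairs; the tokenizer, similarity measure, and example states below are all invented for the sketch:

```python
from collections import Counter

def featurize(proof_state):
    """Bag-of-symbols features for a proof state (toy tokenizer, illustration only)."""
    return Counter(proof_state.replace("(", " ").replace(")", " ").split())

def similarity(a, b):
    """Jaccard-style overlap between two bag-of-symbols feature multisets."""
    inter = sum((a & b).values())
    union = sum((a | b).values())
    return inter / union if union else 0.0

def predict_tactic(database, proof_state):
    """Return the tactic recorded for the most similar past proof state."""
    feats = featurize(proof_state)
    best = max(database, key=lambda entry: similarity(feats, featurize(entry[0])))
    return best[1]

# Hypothetical database of (proof state, tactic) pairs harvested from past scripts.
past = [
    ("forall n : nat, n + 0 = n", "induction n"),
    ("True", "trivial"),
    ("A /\\ B", "split"),
]

print(predict_tactic(past, "forall m : nat, m + 0 = m"))  # → induction n
```

A proof searcher can then chain such predictions: apply the predicted tactic, obtain new proof states, and recurse until the goal is closed, yielding a tactic script.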
Related papers
- Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically [29.908878832382523]
This paper focuses on improving LLMs' ability to write proofs in formal languages that permit automated proof verification/evaluation.
We work in a more natural setup where the lemmas that are directly relevant to the theorem are not given to the theorem prover at test time.
We design an RL-based training algorithm that encourages the model to decompose a theorem into lemmas, prove the lemmas, and then prove the theorem by using the lemmas.
arXiv Detail & Related papers (2024-11-04T05:57:40Z)
- QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning [8.116854714039452]
QEDCartographer is an automated proof-synthesis tool that combines supervised and reinforcement learning.
We evaluate QEDCartographer using the CoqGym benchmark of 68.5K theorems from 124 open-source Coq projects.
Our work demonstrates that reinforcement learning is a fruitful research direction for improving proof-synthesis tools' search mechanisms.
arXiv Detail & Related papers (2024-08-17T16:06:14Z)
- Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
- CoqPyt: Proof Navigation in Python in the Era of LLMs [5.029445580644576]
This paper presents CoqPyt, a Python tool for interacting with the Coq proof assistant.
CoqPyt improves on other Coq-related tools by providing novel features, such as the extraction of rich premise data.
arXiv Detail & Related papers (2024-05-07T12:50:28Z)
- Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving [41.23045212775232]
We demonstrate the benefit of training models that additionally learn from failed search paths.
Facing the lack of such trial-and-error data in existing open-source theorem-proving datasets, we curate a dataset on intuitionistic propositional logic theorems.
We compare our model trained on relatively short trial-and-error information (TrialMaster) with models trained only on the correct paths and discover that the former solves more unseen theorems with lower trial searches.
arXiv Detail & Related papers (2024-04-10T23:01:45Z)
- MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data [85.50740598523818]
MUSTARD is a framework that masters uniform synthesis of theorem and proof data of high quality and diversity.
We present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points.
We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data.
arXiv Detail & Related papers (2024-02-14T05:57:58Z)
- Progressive-Hint Prompting Improves Reasoning in Large Language Models [63.98629132836499]
This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP).
It enables automatic multiple interactions between users and Large Language Models (LLMs) by using previously generated answers as hints to progressively guide toward the correct answers.
We conducted extensive and comprehensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient.
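The PHP loop described above (feeding previously generated answers back as hints) can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; `stub_model` stands in for a real LLM call, and the hint phrasing and stopping rule are assumptions for the sketch:

```python
def progressive_hint_solve(model, question, max_rounds=5):
    """Re-query the model, appending all previous answers as hints,
    until two consecutive rounds agree on the same answer."""
    hints = []
    answer = None
    for _ in range(max_rounds):
        prompt = question
        if hints:
            prompt += " (Hint: the answer is near " + ", ".join(hints) + ")"
        new_answer = model(prompt)
        if new_answer == answer:  # answer stabilized across two rounds
            return answer
        hints.append(new_answer)
        answer = new_answer
    return answer

# Stub standing in for an LLM: refines its answer once a hint is present.
def stub_model(prompt):
    return "42" if "Hint" in prompt else "40"

print(progressive_hint_solve(stub_model, "What is 6 * 7?"))  # → 42
```

The design point is that each round's prompt grows to include every prior answer, so the model converges toward a self-consistent final answer rather than answering from scratch each time.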
arXiv Detail & Related papers (2023-04-19T16:29:48Z)
- Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present a novel stepwise method, NLProofS (Natural Language Proof Search).
NLProofS learns to generate relevant steps conditioning on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- Online Machine Learning Techniques for Coq: A Comparison [2.9412498294532856]
We present a comparison of several online machine learning techniques for tactical learning and proving in the Coq proof assistant.
This work builds on Tactician, a plugin for Coq that learns from proofs written by the user to synthesize new proofs.
arXiv Detail & Related papers (2021-04-12T05:12:35Z)
- PRover: Proof Generation for Interpretable Reasoning over Rules [81.40404921232192]
We propose a transformer-based model that answers binary questions over rule-bases and generates the corresponding proofs.
Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm.
We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation.
arXiv Detail & Related papers (2020-10-06T15:47:53Z)
- Learning to Prove Theorems by Learning to Generate Theorems [71.46963489866596]
We learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover.
Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover.
arXiv Detail & Related papers (2020-02-17T16:06:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.