TacticZero: Learning to Prove Theorems from Scratch with Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2102.09756v1
- Date: Fri, 19 Feb 2021 06:08:39 GMT
- Authors: Minchao Wu, Michael Norrish, Christian Walder, Amir Dezfouli
- Score: 6.764610878007278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel approach to interactive theorem-proving (ITP) using deep
reinforcement learning. Unlike previous work, our framework is able to prove
theorems both end-to-end and from scratch (i.e., without relying on example
proofs from human experts). We formulate the process of ITP as a Markov
decision process (MDP) in which each state represents a set of potential
derivation paths. The agent learns to select promising derivations as well as
appropriate tactics within each derivation using deep policy gradients. This
structure allows us to introduce a novel backtracking mechanism which enables
the agent to efficiently discard (predicted) dead-end derivations and restart
the derivation from promising alternatives. Experimental results show that the
framework achieves performance comparable to that of approaches that rely on
human experts, and that it can also prove theorems it has never seen during
training. We further examine the role of each component of the framework
through ablation studies.
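The abstract's formulation — states as sets of potential derivation paths, a policy-gradient agent choosing derivations and tactics, and backtracking away from predicted dead ends — can be sketched with a toy prover. Everything below is a hypothetical stand-in (the paper works in a real ITP with deep policy networks; here a goal is just an integer, a "tactic" subtracts from it, and the policy is a softmax table), intended only to illustrate how a set of alternative fringes enables restarting from a promising derivation after a dead end:

```python
import math
import random

random.seed(0)

# Toy "prover": a goal is a non-negative integer n, and a tactic t rewrites
# n -> n - t. The goal is proved at 0; overshooting below 0 is a dead end.
# All names here are illustrative, not from the paper.
TACTICS = [1, 2, 3]

def apply_tactic(goal, tactic):
    """Return the rewritten goal, or None for a dead-end derivation."""
    new_goal = goal - tactic
    return new_goal if new_goal >= 0 else None

# Tabular softmax policy: one weight per (goal, tactic) pair. The paper uses
# deep policy gradients; a table keeps this sketch dependency-free.
weights = {}

def policy_probs(goal):
    scores = [weights.get((goal, t), 0.0) for t in TACTICS]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_tactic(goal):
    r, acc = random.random(), 0.0
    for t, p in zip(TACTICS, policy_probs(goal)):
        acc += p
        if r < acc:
            return t
    return TACTICS[-1]

def prove(goal, max_steps=30, lr=0.5):
    """One REINFORCE episode. `fringes` holds alternative partial
    derivations; a dead end discards the current derivation and the search
    restarts from a remaining alternative (backtracking)."""
    fringes = [(goal, [])]              # (current goal, (state, tactic) path)
    for _ in range(max_steps):
        if not fringes:
            return False
        g, path = fringes.pop()         # select the most recent fringe
        t = sample_tactic(g)
        new_goal = apply_tactic(g, t)
        if new_goal is None:
            continue                    # dead end: backtrack to alternatives
        path = path + [(g, t)]
        if new_goal == 0:
            # Proof found: policy-gradient update along the successful path.
            for s, a in path:
                probs = policy_probs(s)
                for i, tac in enumerate(TACTICS):
                    grad = (1.0 if tac == a else 0.0) - probs[i]
                    weights[(s, tac)] = weights.get((s, tac), 0.0) + lr * grad
            return True
        fringes.append((g, path[:-1]))  # keep the old fringe as an alternative
        fringes.append((new_goal, path))
    return False

successes = sum(prove(7) for _ in range(200))
```

The stack of fringes gives depth-first search with backtracking: a dead end silently drops the failed derivation, and the next `pop` resumes from the most recent surviving alternative, mirroring (in miniature) the restart mechanism the abstract describes.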
Related papers
- Learning Rules Explaining Interactive Theorem Proving Tactic Prediction [5.229806149125529]
We represent the problem as an Inductive Logic Programming (ILP) task.
Using the ILP representation we enriched the feature space by encoding additional, computationally expensive properties.
We use this enriched feature space to learn rules explaining when a tactic is applicable to a given proof state.
arXiv Detail & Related papers (2024-11-02T09:18:33Z)
- Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
- Proving Theorems Recursively [80.42431358105482]
We propose POETRY, which proves theorems in a level-by-level manner.
Unlike previous step-by-step methods, POETRY searches for a sketch of the proof at each level.
We observe a substantial increase in the maximum proof length found by POETRY, from 10 to 26.
arXiv Detail & Related papers (2024-05-23T10:35:08Z)
- A Survey on Deep Learning for Theorem Proving [16.28502772608166]
Theorem proving is a fundamental aspect of mathematics, spanning from informal reasoning in natural language to rigorous derivations in formal systems.
Deep learning, especially the emergence of large language models, has sparked a notable surge of research exploring these techniques to enhance the process of theorem proving.
arXiv Detail & Related papers (2024-04-15T17:07:55Z)
- Automated Completion of Statements and Proofs in Synthetic Geometry: an Approach based on Constraint Solving [0.0]
We propose a framework for completing incomplete conjectures and incomplete proofs.
The framework can turn a conjecture with missing assumptions into a proper theorem.
Also, the proposed framework can help in completing a proof sketch into a human-readable and machine-checkable proof.
arXiv Detail & Related papers (2024-01-22T12:49:08Z)
- Towards a General Framework for Continual Learning with Pre-training [55.88910947643436]
We present a general framework for continual learning of sequentially arrived tasks with the use of pre-training.
We decompose its objective into three hierarchical components, including within-task prediction, task-identity inference, and task-adaptive prediction.
We propose an innovative approach to explicitly optimize these components with parameter-efficient fine-tuning (PEFT) techniques and representation statistics.
arXiv Detail & Related papers (2023-10-21T02:03:38Z)
- Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice.
HiDe-Prompt is an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics.
Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z)
- Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
- Learning to Guide a Saturation-Based Theorem Prover [9.228237801323042]
TRAIL is a deep learning-based approach to theorem proving that characterizes core elements of saturation-based theorem proving within a neural framework.
To the best of our knowledge, TRAIL is the first reinforcement learning-based approach to exceed the performance of a state-of-the-art traditional theorem prover.
arXiv Detail & Related papers (2021-06-07T18:35:57Z)
- Learning to Prove from Synthetic Theorems [41.74768503409581]
A major challenge in applying machine learning to automated theorem proving is the scarcity of training data.
We propose an approach that relies on training with synthetic theorems, generated from a set of axioms.
We show that such theorems can be used to train an automated prover and that the learned prover transfers successfully to human-generated theorems.
arXiv Detail & Related papers (2020-06-19T17:48:09Z)
- Learning to Prove Theorems by Learning to Generate Theorems [71.46963489866596]
We learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover.
Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover.
arXiv Detail & Related papers (2020-02-17T16:06:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.