LEGO-Prover: Neural Theorem Proving with Growing Libraries
- URL: http://arxiv.org/abs/2310.00656v3
- Date: Fri, 27 Oct 2023 12:44:32 GMT
- Title: LEGO-Prover: Neural Theorem Proving with Growing Libraries
- Authors: Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu,
Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo
Li, Heng Liao, Xiaodan Liang
- Abstract summary: We present LEGO-Prover, which employs a growing skill library containing verified lemmas as skills to augment the capability of LLMs used in theorem proving.
By constructing the proof modularly, LEGO-Prover enables LLMs to utilize existing skills retrieved from the library and to create new skills during the proving process.
Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%.
- Score: 86.1191481712352
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of large language models (LLMs), the task of theorem
proving still remains one of the hardest reasoning tasks that is far from being
fully solved. Prior methods using language models have demonstrated promising
results, but they still struggle to prove even middle school level theorems.
One common limitation of these methods is that they assume a fixed theorem
library during the whole theorem proving process. However, as we all know,
creating new useful theorems or even new theories is not only helpful but
crucial and necessary for advancing mathematics and proving harder and deeper
results. In this work, we present LEGO-Prover, which employs a growing skill
library containing verified lemmas as skills to augment the capability of LLMs
used in theorem proving. By constructing the proof modularly, LEGO-Prover
enables LLMs to utilize existing skills retrieved from the library and to
create new skills during the proving process. These skills are further evolved
(by prompting an LLM) to enrich the library on another scale. Modular and
reusable skills are constantly added to the library to enable tackling
increasingly intricate mathematical problems. Moreover, the learned library
further bridges the gap between human proofs and formal proofs by making it
easier to impute missing steps. LEGO-Prover advances the state-of-the-art pass
rate on miniF2F-valid (48.0% to 57.0%) and miniF2F-test (45.5% to 47.1%).
During the proving process, LEGO-Prover also manages to generate over 20,000
skills (theorems/lemmas) and adds them to the growing library. Our ablation
study indicates that these newly added skills are indeed helpful for proving
theorems, resulting in an improvement from a success rate of 47.1% to 50.4%. We
also release our code and all the generated skills.
Related papers
- TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts [26.98890165420689]
Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning.
One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs.
This paper proposes **Theorelama**, an end-to-end framework to train a general-purpose LLM to become a Lean4 expert.
arXiv Detail & Related papers (2024-07-03T15:36:18Z) - ATG: Benchmarking Automated Theorem Generation for Generative Language Models [83.93978859348313]
Humans can develop new theorems to explore broader and more complex mathematical results.
Current generative language models (LMs) have achieved significant improvement in automatically proving theorems.
This paper proposes an Automated Theorem Generation benchmark that evaluates whether an agent can automatically generate valuable (and possibly brand new) theorems.
arXiv Detail & Related papers (2024-05-05T02:06:37Z) - Towards Large Language Models as Copilots for Theorem Proving in Lean [81.94024084598598]
We introduce Lean Copilot, a framework for running Lean inference in large language models.
We build tools for suggesting proof steps, completing intermediate proof goals, and selecting relevant premises.
Experimental results demonstrate the effectiveness of our method in assisting humans and theorem proving process.
arXiv Detail & Related papers (2024-04-18T22:54:08Z) - Enhancing Neural Theorem Proving through Data Augmentation and Dynamic
Sampling Method [1.8130068086063336]
We introduce DS-Prover, a novel dynamic sampling method for theorem proving.
We augment the training dataset by decomposing simplification and rewrite tactics with multiple premises into tactics with single premises.
We achieve a state-of-the-art performance (Pass@1) of 14.2% on the ProofNet dataset and a performance of 29.8% on MiniF2F.
arXiv Detail & Related papers (2023-12-20T09:55:21Z) - Democratizing Reasoning Ability: Tailored Learning from Large Language
Model [97.4921006089966]
We propose a tailored learning approach to distill such reasoning ability to smaller LMs.
We exploit the potential of LLM as a reasoning teacher by building an interactive multi-round learning paradigm.
To exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from self-made mistakes.
arXiv Detail & Related papers (2023-10-20T07:50:10Z) - LeanDojo: Theorem Proving with Retrieval-Augmented Language Models [72.54339382005732]
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean.
Existing methods are difficult to reproduce or build on, due to private code, data, and compute requirements.
This paper introduces LeanDojo: an open-source Lean toolkit consisting of toolkits, data, models.
We develop ReProver: an LLM-based prover augmented with retrieval for selecting premises from a vast math library.
arXiv Detail & Related papers (2023-06-27T17:05:32Z) - Learning to Prove Theorems by Learning to Generate Theorems [71.46963489866596]
We learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover.
Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover.
arXiv Detail & Related papers (2020-02-17T16:06:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.