Related papers: Generative Language Modeling for Automated Theorem Proving

Generative Language Modeling for Automated Theorem Proving

URL: http://arxiv.org/abs/2009.03393v1
Date: Mon, 7 Sep 2020 19:50:10 GMT
Title: Generative Language Modeling for Automated Theorem Proving
Authors: Stanislas Polu, Ilya Sutskever
Abstract summary: This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance.
Score: 94.01137612934842
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance. GPT-f found new short proofs that were accepted into the main Metamath library, which is to our knowledge, the first time a deep-learning based system has contributed proofs that were adopted by a formal mathematics community.

Related papers

LemmaHead: RAG Assisted Proof Generation Using Large Language Models [0.0]
We develop LemmaHead, a knowledge base that supplements queries to the model with relevant mathematical context. To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving.
arXiv Detail & Related papers (2025-01-27T05:46:06Z)
Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof. Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry. One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z)
math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories [10.416375584563728]
This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts. We envision an automated process, called emphmath-PVS, to extract and formalize mathematical theorems from research papers.
arXiv Detail & Related papers (2023-10-25T23:54:04Z)
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models [68.65075559137608]
We propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proofs but also evaluates a generative LM's reasoning ability on formulas. We gather trigonometric expressions and their reduced forms from the web, annotate the simplification process manually, and translate it into the Lean formal language system. We develop an automatic generator based on Lean-Gym to create dataset splits of varying difficulties and distributions in order to thoroughly analyze the model's generalization ability.
arXiv Detail & Related papers (2023-10-16T08:42:39Z)
A New Approach Towards Autoformalization [7.275550401145199]
Autoformalization is the task of translating natural language mathematics into a formal language that can be verified by a program. Research paper mathematics requires large amounts of background and context. We propose an avenue towards tackling autoformalization for research-level mathematics, by breaking the task into easier and more approachable subtasks.
arXiv Detail & Related papers (2023-10-12T00:50:24Z)
NaturalProver: Grounded Mathematical Proof Generation with Language Models [84.2064569475095]
Theorem proving in natural mathematical language plays a central role in mathematical advances and education. We develop NaturalProver, a language model that generates proofs by conditioning on background references. NaturalProver is capable of proving some theorems that require short (2-6 step) proofs, and providing next-step suggestions that are rated as correct and useful over 40% of the time.
arXiv Detail & Related papers (2022-05-25T17:01:18Z)
Autoformalization with Large Language Models [22.86710743804944]
A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. We show large language models provide new prospects towards this goal. Our methodology results in a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from $29.6%$ to $35.2%$.
arXiv Detail & Related papers (2022-05-25T09:53:30Z)
Proof Artifact Co-training for Theorem Proving with Language Models [4.934817254755007]
PACT (bf Proof bf Artifact bf Co-bf Training) is a general methodology for extracting self-supervised data from kernel-level proof terms for co-training. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves theorem proving success rate on a held-out suite of test theorems from 32% to 48%.
arXiv Detail & Related papers (2021-02-11T18:59:24Z)
Learning to Prove Theorems by Learning to Generate Theorems [71.46963489866596]
We learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover. Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover.
arXiv Detail & Related papers (2020-02-17T16:06:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.