Related papers: Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4

Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4

URL: http://arxiv.org/abs/2409.05977v2
Date: Thu, 31 Oct 2024 16:01:59 GMT
Title: Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4
Authors: Xichen Tang,
Abstract summary: Using computerized formal languages like Lean 4 to prove mathematical theorems has a significant impact on mathematical formalization. This paper introduces the basic structure and tactics in general, determine how AI can assist the mathematical formalization process to improve its performance.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Using computerized verifiable formal languages like Lean 4 to prove mathematical theorems has a significant impact on mathematical formalization. Lean 4 offers prominent potential for advancing mathematical reasoning. However, existing efforts are limited to mathematical formalization languages in substantial online corpora and are dedicated to keeping pace with rapidly evolving languages. To bridge the gap between the traditional and computerized proof, my approach to formalizing theorem proving involves generating formal steps and complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. The method is to introduce the basic structure and tactics in general, determine how AI can assist the mathematical formalization process to improve its performance, and give examples of solving problems in Lean 4 comparing to NL, mainly in IMO, and a sample theorem proving in abstract algebra.

Related papers

Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification [56.218970738892764]
Chain-of-Thought prompting has become the de facto method to elicit reasoning capabilities from large language models (LLMs)<n>To mitigate hallucinations in CoT that are notoriously difficult to detect, current methods operate as opaque boxes and do not provide checkable evidence for their judgments, possibly limiting their effectiveness.<n>We propose a retrospective, step-aware formal verification framework $Safe$. Rather than assigning arbitrary scores, we strive to articulate mathematical claims in formal mathematical language Lean 4 at each reasoning step and provide formal proofs to identify hallucinations.
arXiv Detail & Related papers (2025-06-05T03:16:08Z)
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models [17.919212265668783]
We present FormalMATH, a large-scale Lean4 benchmark comprising 5,560 formally verified problems spanning from high-school Olympiad challenges to undergraduate-level theorems.<n>We introduce a novel human-in-the-loop autoformalization pipeline that integrates statement autoformalization, semantic verification, and negation-based disproof filtering strategies.<n>Our evaluation of state-of-the-art LLM-based theorem provers reveals significant limitations.
arXiv Detail & Related papers (2025-05-05T15:37:00Z)
LemmaHead: RAG Assisted Proof Generation Using Large Language Models [0.0]
We develop LemmaHead, a knowledge base that supplements queries to the model with relevant mathematical context. To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving.
arXiv Detail & Related papers (2025-01-27T05:46:06Z)
Formal Mathematical Reasoning: A New Frontier in AI [60.26950681543385]
We advocate for formal mathematical reasoning and argue that it is indispensable for advancing AI4Math to the next level. We summarize existing progress, discuss open challenges, and envision critical milestones to measure future success.
arXiv Detail & Related papers (2024-12-20T17:19:24Z)
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation [71.32761934724867]
This work proposes Alchemy, a framework for data synthesis that constructs formal theorems through symbolic mutation. For each candidate theorem in Mathlib, we identify all invocable theorems that can be used to rewrite or apply to it. As a result, our method increases the number of theorems in Mathlib by an order of magnitude, from 110k to 6M.
arXiv Detail & Related papers (2024-10-21T08:04:21Z)
LeanAgent: Lifelong Learning for Formal Theorem Proving [85.39415834798385]
We present LeanAgent, a novel lifelong learning framework for formal theorem proving. LeanAgent continuously generalizes to and improves on ever-expanding mathematical knowledge. It successfully proves 155 theorems previously unproved formally by humans across 23 diverse Lean repositories.
arXiv Detail & Related papers (2024-10-08T17:11:24Z)
Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof. Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry. One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z)
EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages [0.0]
Formal mathematics is the discipline of translating mathematics into a programming language. This paper introduces an evolutionary framework for the first systematic quantitative analysis of the differential machine learnability of five formal math corpora.
arXiv Detail & Related papers (2024-02-12T19:10:11Z)
math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories [10.416375584563728]
This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts. We envision an automated process, called emphmath-PVS, to extract and formalize mathematical theorems from research papers.
arXiv Detail & Related papers (2023-10-25T23:54:04Z)
A New Approach Towards Autoformalization [7.275550401145199]
Autoformalization is the task of translating natural language mathematics into a formal language that can be verified by a program. Research paper mathematics requires large amounts of background and context. We propose an avenue towards tackling autoformalization for research-level mathematics, by breaking the task into easier and more approachable subtasks.
arXiv Detail & Related papers (2023-10-12T00:50:24Z)
Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs [5.045988012508899]
Autoformalization seeks to address this by translating proofs written in natural language into a formal representation that is computer-verifiable via interactive theorem provers. We introduce a semantic parsing approach, based on the Universal Transformer architecture, that translates elementary mathematical proofs into an equivalent formalization in the language of the Coq interactive theorem prover.
arXiv Detail & Related papers (2023-01-05T17:56:00Z)
NaturalProofs: Mathematical Theorem Proving in Natural Language [132.99913141409968]
We develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs. NaturalProofs unifies broad coverage, deep coverage, and low-resource mathematical sources. We benchmark strong neural methods on mathematical reference retrieval and generation tasks.
arXiv Detail & Related papers (2021-03-24T03:14:48Z)
Generative Language Modeling for Automated Theorem Proving [94.01137612934842]
This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance.
arXiv Detail & Related papers (2020-09-07T19:50:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.