math-PVS: A Large Language Model Framework to Map Scientific
Publications to PVS Theories
- URL: http://arxiv.org/abs/2310.17064v1
- Date: Wed, 25 Oct 2023 23:54:04 GMT
- Title: math-PVS: A Large Language Model Framework to Map Scientific
Publications to PVS Theories
- Authors: Hassen Saidi, Susmit Jha, Tuhin Sahai
- Abstract summary: This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts.
We envision an automated process, called emphmath-PVS, to extract and formalize mathematical theorems from research papers.
- Score: 10.416375584563728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As artificial intelligence (AI) gains greater adoption in a wide variety of
applications, it has immense potential to contribute to mathematical discovery,
by guiding conjecture generation, constructing counterexamples, assisting in
formalizing mathematics, and discovering connections between different
mathematical areas, to name a few.
While prior work has leveraged computers for exhaustive mathematical proof
search, recent efforts based on large language models (LLMs) aspire to position
computing platforms as co-contributors in the mathematical research process.
Despite their current limitations in logic and mathematical tasks, there is
growing interest in melding theorem proving systems with foundation models.
This work investigates the applicability of LLMs in formalizing advanced
mathematical concepts and proposes a framework that can critically review and
check mathematical reasoning in research papers. Given the noted reasoning
shortcomings of LLMs, our approach synergizes the capabilities of proof
assistants, specifically PVS, with LLMs, enabling a bridge between textual
descriptions in academic papers and formal specifications in PVS. By harnessing
the PVS environment, coupled with data ingestion and conversion mechanisms, we
envision an automated process, called \emph{math-PVS}, to extract and formalize
mathematical theorems from research papers, offering an innovative tool for
academic review and discovery.
Related papers
- Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning [27.562284768743694]
Large language models (LLMs) can prove mathematical theorems formally by generating proof steps within a proof system.
We introduce a neuro-symbolic tactic generator that synergizes the mathematical intuition learned by LLMs with domain-specific insights encoded by symbolic methods.
We evaluate our framework on 161 challenging inequalities from multiple mathematics competitions.
arXiv Detail & Related papers (2025-02-19T15:54:21Z) - One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs [57.48325300739872]
Leveraging mathematical Large Language Models for proof generation is a fundamental topic in LLMs research.
We argue that the ability of current LLMs to prove statements largely depends on whether they have encountered the relevant proof process during training.
Inspired by the pedagogical method of "proof by counterexamples" commonly used in human mathematics education, our work aims to enhance LLMs' ability to conduct mathematical reasoning and proof through counterexamples.
arXiv Detail & Related papers (2025-02-12T02:01:10Z) - LemmaHead: RAG Assisted Proof Generation Using Large Language Models [0.0]
We develop LemmaHead, a knowledge base that supplements queries to the model with relevant mathematical context.
To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving.
arXiv Detail & Related papers (2025-01-27T05:46:06Z) - Large Language Models for Mathematical Analysis [3.7325315394927023]
This work addresses critical gaps in mathematical reasoning and contributes to advancing trustworthy AI.
We developed the DEMI-MathAnalysis dataset, comprising proof-based problems from mathematical analysis topics.
We also designed a guiding framework to rigorously enhance LLMs' ability to solve these problems.
arXiv Detail & Related papers (2024-12-28T20:37:55Z) - Formal Mathematical Reasoning: A New Frontier in AI [60.26950681543385]
We advocate for formal mathematical reasoning and argue that it is indispensable for advancing AI4Math to the next level.
We summarize existing progress, discuss open challenges, and envision critical milestones to measure future success.
arXiv Detail & Related papers (2024-12-20T17:19:24Z) - LeanAgent: Lifelong Learning for Formal Theorem Proving [85.39415834798385]
We present LeanAgent, a novel lifelong learning framework for formal theorem proving.
LeanAgent continuously generalizes to and improves on ever-expanding mathematical knowledge.
It successfully proves 155 theorems previously unproved formally by humans across 23 diverse Lean repositories.
arXiv Detail & Related papers (2024-10-08T17:11:24Z) - MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark [82.64129627675123]
MathBench is a new benchmark that rigorously assesses the mathematical capabilities of large language models.
MathBench spans a wide range of mathematical disciplines, offering a detailed evaluation of both theoretical understanding and practical problem-solving skills.
arXiv Detail & Related papers (2024-05-20T17:52:29Z) - A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z) - Generative Language Modeling for Automated Theorem Proving [94.01137612934842]
This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans might be addressable via generation from language models.
We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance.
arXiv Detail & Related papers (2020-09-07T19:50:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.