Related papers: LemmaHead: RAG Assisted Proof Generation Using Large Language Models

LemmaHead: RAG Assisted Proof Generation Using Large Language Models

URL: http://arxiv.org/abs/2501.15797v4
Date: Mon, 10 Feb 2025 05:31:43 GMT
Title: LemmaHead: RAG Assisted Proof Generation Using Large Language Models
Authors: Tianbo Yang, Mingqi Yan, Hongyi Zhao, Tianshuo Yang,
Abstract summary: We develop LemmaHead, a knowledge base that supplements queries to the model with relevant mathematical context.<n>To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developing the logic necessary to solve mathematical problems or write mathematical proofs is one of the more difficult objectives for large language models (LLMS). Currently, the most popular methods in literature consists of fine-tuning the model on written mathematical content such as academic publications and textbooks, so that the model can learn to emulate the style of mathematical writing. In this project, we explore the effectiveness of using retrieval augmented generation (RAG) to address gaps in the mathematical reasoning of LLMs. We develop LemmaHead, a RAG knowledge base that supplements queries to the model with relevant mathematical context, with particular focus on context from published textbooks. To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving via generating proofs to a given mathematical claim in the Lean formal language.

Related papers

MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task [49.355810887265925]
We introduce MathFimer, a novel framework for mathematical reasoning step expansion. We develop a specialized model, MathFimer-7B, on our carefully curated NuminaMath-FIM dataset. We then apply these models to enhance existing mathematical reasoning datasets by inserting detailed intermediate steps into their solution chains.
arXiv Detail & Related papers (2025-02-17T11:22:24Z)
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs [57.48325300739872]
Leveraging mathematical Large Language Models for proof generation is a fundamental topic in LLMs research. We argue that the ability of current LLMs to prove statements largely depends on whether they have encountered the relevant proof process during training. Inspired by the pedagogical method of "proof by counterexamples" commonly used in human mathematics education, our work aims to enhance LLMs' ability to conduct mathematical reasoning and proof through counterexamples.
arXiv Detail & Related papers (2025-02-12T02:01:10Z)
Large Language Models for Mathematical Analysis [3.7325315394927023]
This work addresses critical gaps in mathematical reasoning and contributes to advancing trustworthy AI.<n>We developed the DEMI-MathAnalysis dataset, comprising proof-based problems from mathematical analysis topics.<n>We also designed a guiding framework to rigorously enhance LLMs' ability to solve these problems.
arXiv Detail & Related papers (2024-12-28T20:37:55Z)
LeanAgent: Lifelong Learning for Formal Theorem Proving [85.39415834798385]
We present LeanAgent, a novel lifelong learning framework for formal theorem proving. LeanAgent continuously generalizes to and improves on ever-expanding mathematical knowledge. It successfully proves 155 theorems previously unproved formally by humans across 23 diverse Lean repositories.
arXiv Detail & Related papers (2024-10-08T17:11:24Z)
math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories [10.416375584563728]
This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts. We envision an automated process, called emphmath-PVS, to extract and formalize mathematical theorems from research papers.
arXiv Detail & Related papers (2023-10-25T23:54:04Z)
A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z)
NaturalProofs: Mathematical Theorem Proving in Natural Language [132.99913141409968]
We develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs. NaturalProofs unifies broad coverage, deep coverage, and low-resource mathematical sources. We benchmark strong neural methods on mathematical reference retrieval and generation tasks.
arXiv Detail & Related papers (2021-03-24T03:14:48Z)
Generative Language Modeling for Automated Theorem Proving [94.01137612934842]
This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance.
arXiv Detail & Related papers (2020-09-07T19:50:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.