Advancing Mathematical Research via Human-AI Interactive Theorem Proving
- URL: http://arxiv.org/abs/2512.09443v2
- Date: Thu, 11 Dec 2025 13:10:50 GMT
- Title: Advancing Mathematical Research via Human-AI Interactive Theorem Proving
- Authors: Chenyi Li, Zhijian Lai, Dong An, Jiang Hu, Zaiwen Wen
- Abstract summary: We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm.
- Score: 16.40852561664514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.
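As a concrete illustration of the retraction-based gradient method mentioned in the abstract, the following is a minimal sketch (not taken from the paper) of Riemannian gradient descent on the unit sphere with a normalization retraction; the objective, test matrix, step size, and function names are illustrative assumptions.

```python
# Hedged sketch: retraction-based gradient descent on the unit sphere S^{n-1},
# minimizing the Rayleigh quotient f(x) = x^T A x. This is NOT the paper's
# algorithm; all names and parameters are illustrative assumptions.
import numpy as np

def riemannian_grad(A, x):
    """Project the Euclidean gradient 2*A*x onto the tangent space at x."""
    g = 2 * A @ x
    return g - (x @ g) * x  # (I - x x^T) g for unit-norm x

def retract(x, v):
    """Normalization retraction: step along v, then pull back onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

def retraction_gradient_method(A, x0, step=0.05, iters=500):
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        x = retract(x, -step * riemannian_grad(A, x))
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((8, 8))
    A = (M + M.T) / 2  # symmetric test matrix
    x = retraction_gradient_method(A, rng.standard_normal(8))
    # The Rayleigh quotient should approach the smallest eigenvalue of A.
    print(x @ A @ x, np.linalg.eigvalsh(A)[0])
```

With a sufficiently small step size, such a scheme converges to a first-order stationary point on the sphere; the paper's case study instead concerns Grover-compatible retractions and invariant subspaces, which this toy normalization retraction does not attempt to model.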
Related papers
- EternalMath: A Living Benchmark of Frontier Mathematics that Evolves with Human Discovery [23.517907682810932]
We propose a fully automated, theorem-grounded pipeline for evaluating frontier mathematical reasoning. The pipeline transforms recent peer-reviewed mathematical literature into executable and verifiable reasoning tasks. Applying this pipeline yields EternalMath, an evolving evaluation suite derived from contemporary research papers.
arXiv Detail & Related papers (2026-01-04T06:40:25Z) - Where to Search: Measure the Prior-Structured Search Space of LLM Agents [0.8249180979158818]
This paper proposes a compact formal theory that describes and measures LLM-assisted iterative search guided by domain priors. We represent an agent as a fuzzy relation operator on inputs and outputs to capture feasible transitions. We provide the simplest testable inferences and validate them via two instantiations.
arXiv Detail & Related papers (2025-10-16T16:18:37Z) - The Computational Complexity of Circuit Discovery for Inner Interpretability [0.30723404270319693]
We study circuit discovery with classical and parameterized computational complexity theory. Our findings reveal a challenging complexity landscape. This framework allows us to understand the scope and limits of interpretability queries.
arXiv Detail & Related papers (2024-10-10T15:22:48Z) - Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models [56.32800938317095]
Existing verifiers are sub-optimal for tree search techniques at test time. We propose token-supervised value models (TVMs). TVMs assign each token a probability that reflects the likelihood of reaching the correct final answer.
arXiv Detail & Related papers (2024-07-12T13:16:50Z) - math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories [10.416375584563728]
This work investigates the applicability of large language models (LLMs) in formalizing advanced mathematical concepts.
We envision an automated process, called math-PVS, to extract and formalize mathematical theorems from research papers.
arXiv Detail & Related papers (2023-10-25T23:54:04Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z) - Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded by the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.