SITA: A Framework for Structure-to-Instance Theorem Autoformalization
- URL: http://arxiv.org/abs/2511.10356v1
- Date: Fri, 14 Nov 2025 01:47:08 GMT
- Title: SITA: A Framework for Structure-to-Instance Theorem Autoformalization
- Authors: Chenyi Li, Wanli Ma, Zichen Wang, Zaiwen Wen,
- Abstract summary: We develop a framework for structure-to-instance theorem autoformalization (SITA)<n>SITA bridges the gap between abstract mathematical theories and their concrete applications in Lean proof assistant.<n>We generate corresponding Lean definitions and instance declarations, integrate them using Lean's typeclass mechanism, and construct verified theorems by checking structural assumptions.
- Score: 20.941850622174236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While large language models (LLMs) have shown progress in mathematical reasoning, they still face challenges in formalizing theorems that arise from instantiating abstract structures in concrete settings. With the goal of auto-formalizing mathematical results at the research level, we develop a framework for structure-to-instance theorem autoformalization (SITA), which systematically bridges the gap between abstract mathematical theories and their concrete applications in Lean proof assistant. Formalized abstract structures are treated as modular templates that contain definitions, assumptions, operations, and theorems. These templates serve as reusable guides for the formalization of concrete instances. Given a specific instantiation, we generate corresponding Lean definitions and instance declarations, integrate them using Lean's typeclass mechanism, and construct verified theorems by checking structural assumptions. We incorporate LLM-based generation with feedback-guided refinement to ensure both automation and formal correctness. Experiments on a dataset of optimization problems demonstrate that SITA effectively formalizes diverse instances grounded in abstract structures.
Related papers
- Structure-R1: Dynamically Leveraging Structural Knowledge in LLM Reasoning through Reinforcement Learning [29.722512436773638]
We propose textscStructure-R1, a framework that transforms retrieved content into structured representations optimized for reasoning.<n>We show that textscStructure-R1 consistently achieves competitive performance with a 7B-scale backbone model.<n>Our theoretical analysis demonstrates how structured representations enhance reasoning by improving information density and contextual clarity.
arXiv Detail & Related papers (2025-10-16T23:19:28Z) - Implicit Reasoning in Large Language Models: A Comprehensive Survey [67.53966514728383]
Large Language Models (LLMs) have demonstrated strong generalization across a wide range of tasks.<n>Recent studies have shifted attention from explicit chain-of-thought prompting toward implicit reasoning.<n>This survey introduces a taxonomy centered on execution paradigms, shifting the focus from representational forms to computational strategies.
arXiv Detail & Related papers (2025-09-02T14:16:02Z) - DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning [67.93945726549289]
DeepTheorem is a comprehensive informal theorem-proving framework exploiting natural language to enhance mathematical reasoning.<n>DeepTheorem includes a large-scale benchmark dataset consisting of 121K high-quality IMO-level informal theorems and proofs.<n>We devise a novel reinforcement learning strategy (RL-Zero) explicitly tailored to informal theorem proving, leveraging the verified theorem variants to incentivize robust mathematical inference.
arXiv Detail & Related papers (2025-05-29T17:59:39Z) - Steering LLMs for Formal Theorem Proving [1.083373217040891]
We develop an inference-time intervention that identifies linear directions in residual activations associated with informal reasoning traces.<n>This mechanism also yields interpretable information about how reasoning is internally encoded in the activation space of Large Language Models.<n>Our contributions are twofold: (1) a novel activation-based intervention for guiding proof synthesis in LLMs; and (2) demonstration that this intervention improves performance under two decoding strategies.
arXiv Detail & Related papers (2025-02-21T15:04:48Z) - Exploring the Role of Reasoning Structures for Constructing Proofs in Multi-Step Natural Language Reasoning with Large Language Models [30.09120709652445]
This paper is centred around a focused study: whether the current state-of-the-art generalist LLMs can leverage the structures in a few examples to better construct the proof structures with textitin-context learning.
arXiv Detail & Related papers (2024-10-11T00:45:50Z) - Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry.
One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize.
We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z) - Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure [66.33623392497599]
We show that a structure called template-content structure (T-C structure) can reduce the possible space from exponential level to linear level.
We demonstrate that models can achieve task composition, further reducing the space needed to learn from linear to logarithmic.
arXiv Detail & Related papers (2023-10-09T06:57:45Z) - Query Structure Modeling for Inductive Logical Reasoning Over Knowledge
Graphs [67.043747188954]
We propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs.
It encodes linearized query structures and entities using pre-trained language models to find answers.
We conduct experiments on two inductive logical reasoning datasets and three transductive datasets.
arXiv Detail & Related papers (2023-05-23T01:25:29Z) - Unifying Structure Reasoning and Language Model Pre-training for Complex
Reasoning [26.811507121199323]
This paper proposes a unified learning framework that combines explicit structure reasoning and language pre-training to endow PLMs with the structure reasoning skill.
It first identifies several elementary structures within contexts to construct structured queries and performs step-by-step reasoning along the queries to identify the answer entity.
Experimental results on four datasets demonstrate that the proposed model achieves significant improvements in complex reasoning tasks involving diverse structures.
arXiv Detail & Related papers (2023-01-21T08:18:11Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.