Related papers: Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning

URL: http://arxiv.org/abs/2510.01069v1
Date: Wed, 01 Oct 2025 16:06:40 GMT
Title: Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning
Authors: Elija Perrier
Abstract summary: Chain-of-Thought (CoT) prompting enhances the reasoning capabilities of large language models. We propose a novel theoretical lens for this problem grounded in the Curry-Howard correspondence. We operationalise this analogy, presenting methods to extract and map the informal, natural language steps of CoT into a formal, typed proof structure.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Chain-of-Thought (CoT) prompting enhances the reasoning capabilities of large language models, the faithfulness of the generated rationales remains an open problem for model interpretability. We propose a novel theoretical lens for this problem grounded in the Curry-Howard correspondence, which posits a direct relationship between formal proofs and computer programs. Under this paradigm, a faithful reasoning trace is analogous to a well-typed program, where each intermediate step corresponds to a typed logical inference. We operationalise this analogy, presenting methods to extract and map the informal, natural language steps of CoT into a formal, typed proof structure. Successfully converting a CoT trace into a well-typed proof serves as a strong, verifiable certificate of its computational faithfulness, moving beyond heuristic interpretability towards formal verification. Our framework provides a methodology to transform plausible narrative explanations into formally verifiable programs, offering a path towards building more reliable and trustworthy AI systems.
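
To make the proofs-as-programs analogy in the abstract concrete, here is a minimal sketch of what checking a reasoning trace for well-typedness could look like. It is not the paper's extraction method: the `Step` record, the two rule names, and `check_trace` are hypothetical names chosen for illustration, and the toy logic supports only premises and modus ponens. Each step claims a proposition (its "type"), and the trace is accepted only if every claim follows from earlier steps by a valid rule.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Propositions-as-types: an atom is a string; an implication is the
# tuple ("->", antecedent, consequent). This encoding is an assumption
# made for the sketch, not a format taken from the paper.
Prop = Union[str, Tuple[str, "Prop", "Prop"]]


def implies(antecedent: Prop, consequent: Prop) -> Prop:
    """Build the implication type `antecedent -> consequent`."""
    return ("->", antecedent, consequent)


@dataclass(frozen=True)
class Step:
    """One reasoning step (hypothetical record for illustration)."""
    rule: str         # "premise" or "modus_ponens"
    uses: tuple       # indices of earlier steps this step depends on
    concludes: Prop   # the proposition this step claims to establish


def check_trace(steps) -> bool:
    """Return True if every step is a valid typed inference."""
    derived = []  # propositions established so far, by step index
    for i, step in enumerate(steps):
        # A step may only refer to steps that come strictly before it.
        if any(j < 0 or j >= i for j in step.uses):
            return False
        if step.rule == "premise":
            ok = len(step.uses) == 0
        elif step.rule == "modus_ponens":
            ok = False
            if len(step.uses) == 2:
                imp = derived[step.uses[0]]
                arg = derived[step.uses[1]]
                # Function application: from A -> B and A, conclude B.
                ok = (
                    isinstance(imp, tuple)
                    and len(imp) == 3
                    and imp[0] == "->"
                    and imp[1] == arg
                    and imp[2] == step.concludes
                )
        else:
            ok = False  # unknown rule: reject rather than guess
        if not ok:
            return False
        derived.append(step.concludes)
    return True


good_trace = [
    Step("premise", (), "rain"),
    Step("premise", (), implies("rain", "wet")),
    Step("modus_ponens", (1, 0), "wet"),
]
bad_trace = [
    Step("premise", (), "rain"),
    Step("premise", (), implies("rain", "wet")),
    Step("modus_ponens", (1, 0), "dry"),  # conclusion does not follow
]

print(check_trace(good_trace))
print(check_trace(bad_trace))
```

In this toy setting, the accepted trace plays the role of the "well-typed program" and the rejected one corresponds to a step whose stated conclusion is not licensed by its inputs. Mapping real natural-language CoT steps into such a structure is the hard part the paper addresses and is not attempted here.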

Related papers

Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification [49.506412445511934]
Large Language Models (LLMs) show remarkable capabilities, yet their next-token prediction creates logical inconsistencies and reward hacking. We introduce a formal logic verification-guided framework that dynamically interleaves formal symbolic verification with the natural language generation process. We operationalize this framework via a novel two-stage training pipeline that synergizes formal logic verification-guided supervised fine-tuning and policy optimization.
arXiv Detail & Related papers (2026-01-30T07:01:25Z)
Translating Informal Proofs into Formal Proofs Using a Chain of States [20.0011959667642]
We address the problem of translating informal mathematical proofs expressed in natural language into formal proofs in Lean4 under a constrained computational budget. We first extract a Chain of States (CoS), a sequence of intermediate formal proof states aligned with the logical structure of the informal argument. We then generate tactics to transition between adjacent states in the CoS, thereby constructing the full formal proof.
arXiv Detail & Related papers (2025-12-11T06:08:34Z)
BRIDGE: Building Representations In Domain Guided Program Verification [67.36686119518441]
BRIDGE decomposes verification into three interconnected domains: Code, Specifications, and Proofs. We show that this approach substantially improves both accuracy and efficiency beyond standard error feedback methods.
arXiv Detail & Related papers (2025-11-26T06:39:19Z)
Are Language Models Efficient Reasoners? A Perspective from Logic Programming [109.47572890883248]
Modern language models (LMs) exhibit strong deductive reasoning capabilities, yet standard evaluations emphasize correctness while overlooking a key aspect of human-like reasoning: efficiency. We propose a framework for assessing LM reasoning efficiency through the lens of logic programming.
arXiv Detail & Related papers (2025-10-29T15:30:31Z)
ProofFlow: A Dependency Graph Approach to Faithful Proof Autoformalization [10.021930888250546]
Current approaches fail to preserve semantic meaning and logical structure of the original human-written argument. We introduce ProofFlow, a novel pipeline that treats structural fidelity as a primary objective. We present a new benchmark of 184 undergraduate-level problems, manually annotated with step-by-step solutions. We also introduce ProofScore, a new composite metric to evaluate syntactic correctness, semantic faithfulness, and structural fidelity.
arXiv Detail & Related papers (2025-10-13T10:20:11Z)
StepProof: Step-by-step verification of natural language mathematical proofs [16.150265021594088]
We propose StepProof, a novel autoformalization method for granular, step-by-step verification. StepProof breaks down complete proofs into multiple verifiable subproofs, enabling sentence-level verification. We show that StepProof significantly improves proof success rates and efficiency compared to traditional methods.
arXiv Detail & Related papers (2025-06-12T10:31:23Z)
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification [56.218970738892764]
Chain-of-Thought prompting has become the de facto method to elicit reasoning capabilities from large language models (LLMs). To mitigate hallucinations in CoT that are notoriously difficult to detect, current methods operate as opaque boxes and do not provide checkable evidence for their judgments, possibly limiting their effectiveness. We propose a retrospective, step-aware formal verification framework, Safe. Rather than assigning arbitrary scores, we strive to articulate mathematical claims in the formal mathematical language Lean 4 at each reasoning step and provide formal proofs to identify hallucinations.
arXiv Detail & Related papers (2025-06-05T03:16:08Z)
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework [2.9334627971166336]
This paper investigates the logical reasoning capabilities of large language models (LLMs). A trained LLM receives as input a set of assumptions and a goal, and produces as output a proof that formally derives the goal from the assumptions. A critical obstacle for training is the scarcity of real-world proofs.
arXiv Detail & Related papers (2025-04-28T19:25:29Z)
Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof. Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs [5.045988012508899]
Autoformalization seeks to address this by translating proofs written in natural language into a formal representation that is computer-verifiable via interactive theorem provers. We introduce a semantic parsing approach, based on the Universal Transformer architecture, that translates elementary mathematical proofs into an equivalent formalization in the language of the Coq interactive theorem prover.
arXiv Detail & Related papers (2023-01-05T17:56:00Z)
Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals. It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation. It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.