SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation
- URL: http://arxiv.org/abs/2510.25975v1
- Date: Wed, 29 Oct 2025 21:17:57 GMT
- Title: SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation
- Authors: Sina Bagheri Nezhad, Yao Li, Ameeta Agrawal
- Abstract summary: We introduce SymCode, a neurosymbolic framework that reframes mathematical problem-solving as a task of verifiable code generation. We evaluate SymCode on challenging benchmarks, including MATH-500 and OlympiadBench, demonstrating significant accuracy improvements.
- Score: 5.88623604115872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) often struggle with complex mathematical reasoning, where prose-based generation leads to unverified and arithmetically unsound solutions. Current prompting strategies like Chain of Thought still operate within this unreliable medium, lacking a mechanism for deterministic verification. To address these limitations, we introduce SymCode, a neurosymbolic framework that reframes mathematical problem-solving as a task of verifiable code generation using the SymPy library. We evaluate SymCode on challenging benchmarks, including MATH-500 and OlympiadBench, demonstrating significant accuracy improvements of up to 13.6 percentage points over baselines. Our analysis shows that SymCode is not only more token-efficient but also fundamentally shifts model failures from opaque logical fallacies towards transparent, programmatic errors. By grounding LLM reasoning in a deterministic symbolic engine, SymCode represents a key step towards more accurate and trustworthy AI in formal domains.
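To make the idea of "verifiable code generation using the SymPy library" concrete, the following is a minimal sketch of the kind of program such an approach might produce for a simple algebra problem. It is not taken from the paper: the problem, the function name `solve_and_verify`, and the substitution-based check are all illustrative assumptions.

```python
import sympy as sp


def solve_and_verify():
    """Solve x^2 - 5x + 6 = 0 symbolically, then check each root by substitution.

    Hypothetical example: the problem and this helper are assumptions made
    for illustration, not the SymCode authors' prompts or pipeline.
    """
    x = sp.symbols("x", real=True)
    equation = sp.Eq(x**2 - 5 * x + 6, 0)

    # Step 1: let the symbolic engine compute candidate solutions exactly.
    candidates = sp.solve(equation, x)

    # Step 2: deterministic verification. A candidate is kept only if
    # substituting it back makes the left and right sides agree exactly.
    verified = [
        root
        for root in candidates
        if sp.simplify(equation.lhs.subs(x, root) - equation.rhs) == 0
    ]
    return sorted(verified)


if __name__ == "__main__":
    print(solve_and_verify())  # prints [2, 3]
```

If the generated program were wrong (for example, a mistyped coefficient or an undefined symbol), the failure would surface as an exception or a failed check rather than as a plausible-looking but incorrect line of prose, which is the shift in failure modes the abstract describes.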
Related papers
- Imandra CodeLogician: Neuro-Symbolic Reasoning for Precise Analysis of Software Logic [23.59512682324697]
Large Language Models (LLMs) have shown strong performance on code understanding tasks. LLMs lack the ability to perform precise, exhaustive mathematical reasoning about program behavior. We present CodeLogician, a neurosymbolic agent for precise analysis of software logic, integrated with ImandraX.
arXiv Detail & Related papers (2026-01-17T00:16:41Z)
- Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning [65.20602712957725]
Caco is a novel framework that automates the synthesis of high-quality, verifiable, and diverse instruction-CoT reasoning data. Our work establishes a paradigm for building self-sustaining, trustworthy reasoning systems without human intervention.
arXiv Detail & Related papers (2025-10-05T07:59:24Z)
- Taming Imperfect Process Verifiers: A Sampling Perspective on Backtracking [54.43083499412643]
Test-time algorithms that combine the generative power of language models with process verifiers offer a promising lever for eliciting new reasoning capabilities. We introduce a new process-guided test-time sampling algorithm, VGB, which uses theoretically grounded backtracking to achieve provably better robustness to verifier errors.
arXiv Detail & Related papers (2025-10-03T16:21:14Z)
- SciML Agents: Write the Solver, Not the Solution [69.5021018644143]
We introduce two new datasets: a diagnostic dataset of adversarial "misleading" problems; and a large-scale benchmark of 1,000 diverse ODE tasks. We evaluate open- and closed-source LLM models along two axes: (i) unguided versus guided prompting with domain-specific knowledge; and (ii) off-the-shelf versus fine-tuned variants. Preliminary results indicate that careful prompting and fine-tuning can yield a specialized LLM agent capable of reliably solving simple ODE problems.
arXiv Detail & Related papers (2025-09-12T02:53:57Z)
- Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models [7.658134651527103]
Worst-case symbolic constraints analysis requires inferring the symbolic constraints that characterise worst-case program executions. We show that even state-of-the-art large language models (LLMs) struggle when applied directly on this task. We propose WARP, an innovative neurosymbolic approach that computes worst-case constraints on smaller concrete input sizes.
arXiv Detail & Related papers (2025-06-09T19:33:30Z)
- Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation [0.3495246564946556]
Large Language Models (LLMs) have achieved remarkable success in tasks requiring complex reasoning. Do these models truly reason, or do they merely exploit shallow statistical patterns? We introduce Chain-of-Code Collapse, where we investigate the robustness of reasoning LLMs by introducing a suite of semantically faithful yet adversarially structured prompt perturbations.
arXiv Detail & Related papers (2025-06-08T02:43:46Z)
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing. CTM outperforms conventional reasoning models and tool-augmented baselines in terms of accuracy, interpretability, and generalizability.
arXiv Detail & Related papers (2025-06-03T09:11:15Z)
- SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning [30.938876549335067]
This paper presents SymRTLO, a novel neuron-symbolic RTL optimization framework. A symbolic module is proposed for analyzing and optimizing finite state machine (FSM) logic. Experiments on the RTL-Rewriter benchmark with Synopsys Design Compiler and Yosys show that SymRTLO improves power, performance, and area (PPA) by up to 43.9%, 62.5%, and 51.1%, respectively.
arXiv Detail & Related papers (2025-04-14T16:15:55Z)
- Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations [3.5604294978773265]
Large language models (LLMs) continue to face challenges in reliably solving reasoning tasks. This paper introduces a novel neurosymbolic method that improves LLM reasoning by encoding hidden states into neurosymbolic vectors.
arXiv Detail & Related papers (2025-01-31T20:29:51Z)
- Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability [53.51560766150442]
Critical tokens are elements within reasoning trajectories that significantly influence incorrect outcomes. We present a novel framework for identifying these tokens through rollout sampling. We show that identifying and replacing critical tokens significantly improves model accuracy.
arXiv Detail & Related papers (2024-11-29T18:58:22Z)
- Linear Temporal Logic Modulo Theories over Finite Traces (Extended Version) [72.38188258853155]
This paper studies Linear Temporal Logic over Finite Traces (LTLf) where proposition letters are replaced with first-order formulas interpreted over arbitrary theories, in the spirit of Satisfiability Modulo Theories. The resulting logic, called LTLf Modulo Theories (LTLfMT), is semi-decidable.
arXiv Detail & Related papers (2022-04-28T17:57:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.