Instantiation-based Formalization of Logical Reasoning Tasks using Language Models and Logical Solvers
- URL: http://arxiv.org/abs/2501.16961v2
- Date: Thu, 01 May 2025 10:16:10 GMT
- Title: Instantiation-based Formalization of Logical Reasoning Tasks using Language Models and Logical Solvers
- Authors: Mohammad Raza, Natasa Milic-Frayling
- Abstract summary: We introduce Semantic Self-Verification (SSV), a novel approach to accurately formulate the reasoning problem from natural language to the formal language of the solver. SSV uses a consistency-based approach to produce strong abstract formalizations of problems using concrete instantiations that are generated by the model and verified by the solver. We propose such *near-certain reasoning* as a new approach to reduce the need for manual verification in many cases, taking us closer to more dependable and autonomous AI reasoning systems.
- Score: 4.897782942277061
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robustness of reasoning remains a significant challenge for large language models, and addressing it is essential for the practical applicability of AI-driven reasoning systems. We introduce Semantic Self-Verification (SSV), a novel approach that addresses the key challenge in combining language models with the rigor of logical solvers: accurately translating the reasoning problem from natural language into the formal language of the solver. SSV uses a consistency-based approach to produce strong abstract formalizations of problems using concrete instantiations that are generated by the model and verified by the solver. In addition to significantly advancing overall reasoning accuracy over the state of the art, a key novelty of this approach is a verification feature with near-perfect precision over a significant coverage of cases, as we demonstrate on open reasoning benchmarks. We propose such *near-certain reasoning* as a new approach to reduce the need for manual verification in many cases, taking us closer to more dependable and autonomous AI reasoning systems.
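The core mechanism of SSV lends itself to a compact illustration. Below is a minimal sketch, assuming Z3 as the logical solver; `llm_formalize` and `llm_instantiate` are hypothetical stand-ins for the language-model calls (none of these names come from the paper), and their toy return values only serve to make the consistency check runnable end to end.

```python
# Minimal sketch of the SSV consistency check, assuming the Z3 SMT solver
# as the logical backend (pip install z3-solver). The llm_* helpers are
# hypothetical stand-ins for language-model calls; they return hand-written
# toy values here so the example runs end to end.
from z3 import Int, Solver, sat

x = Int("x")

def llm_formalize(problem: str):
    # Hypothetical: the model translates the problem into solver constraints.
    # Toy abstract formalization of "x is a positive number below ten".
    return [x > 0, x < 10]

def llm_instantiate(problem: str):
    # Hypothetical: the model proposes concrete instantiations with known
    # outcomes, each a pair (extra_constraints, expected_satisfiable).
    return [([x == 5], True),    # 5 should be admitted
            ([x == 12], False)]  # 12 should be ruled out

def semantically_verified(problem: str) -> bool:
    formalization = llm_formalize(problem)
    for extra, expected_sat in llm_instantiate(problem):
        solver = Solver()
        solver.add(*formalization)
        solver.add(*extra)
        # Accept the abstract formalization only if the solver's verdict
        # agrees with the expected outcome on every concrete instantiation.
        if (solver.check() == sat) != expected_sat:
            return False
    return True

print(semantically_verified("x is a positive number below ten"))  # True
```

In this reading, a formalization is accepted only when the solver's verdict matches the model's expectation on every concrete case, which is what drives the near-perfect precision of the verification.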
Related papers
- Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1) [66.51642638034822]
Reasoning is central to human intelligence, enabling structured problem-solving across diverse tasks.
Recent advances in large language models (LLMs) have greatly enhanced their reasoning abilities in arithmetic, commonsense, and symbolic domains.
This paper offers a concise yet insightful overview of reasoning techniques in both textual and multimodal LLMs.
arXiv Detail & Related papers (2025-04-04T04:04:56Z)
- Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique [66.94905631175209]
We propose a novel inference-time scaling approach: stepwise natural language self-critique (PANEL).
It employs self-generated natural language critiques as feedback to guide the step-level search process.
This approach bypasses the need for task-specific verifiers and the associated training overhead.
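A rough sketch of this loop, with `llm` as a hypothetical text-completion callable and an illustrative 0-10 scoring rubric that is not the paper's actual prompt:

```python
# Sketch of step-level search guided by natural-language self-critique,
# in the spirit of PANEL. `llm` is a hypothetical prompt -> text function;
# the prompts and rubric below are illustrative assumptions.
def solve_with_self_critique(question: str, llm, max_steps: int = 8,
                             candidates: int = 4) -> str:
    partial = ""
    for _ in range(max_steps):
        # Sample several candidate next steps.
        options = [llm(f"Question: {question}\nSteps so far:\n{partial}Next step:")
                   for _ in range(candidates)]

        # Self-critique each candidate in natural language, extract a
        # numeric rating, and keep the candidate rated most favorably.
        def score(step: str) -> int:
            critique = llm(f"Critique this reasoning step and rate it 0-10: {step}")
            ratings = [int(tok) for tok in critique.split() if tok.isdigit()]
            return ratings[-1] if ratings else 0

        partial += max(options, key=score) + "\n"
        if "ANSWER:" in partial:
            break
    return partial
```

No task-specific verifier is trained here: the same model that proposes steps also critiques them, which is what removes the training overhead the summary mentions.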
arXiv Detail & Related papers (2025-03-21T17:59:55Z)
- Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning [19.477062052536887]
We propose the Logical-Semantic Integration Model (LSIM), a supervised framework that bridges semantic and logical coherence.
LSIM comprises three components: reinforcement learning that predicts a structured fact-rule chain for each question, a trainable Deep Structured Semantic Model (DSSM) that retrieves the most relevant candidate questions, and in-answer learning that generates the final answer.
Our experiments on a real-world legal QA dataset, validated through both automated metrics and human evaluation, demonstrate that LSIM significantly enhances accuracy and reliability compared to existing methods.
arXiv Detail & Related papers (2025-02-11T19:33:07Z)
- Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation [24.081573908824353]
First-order logic (FOL) reasoning is pivotal for intelligent systems.
Existing benchmarks often rely on extensive human annotation or handcrafted templates.
We propose a novel framework called ProverGen that synergizes the generative strengths of Large Language Models with the rigor and precision of symbolic provers.
arXiv Detail & Related papers (2025-02-10T15:31:54Z)
- JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models [51.99046112135311]
We introduce JustLogic, a synthetically generated deductive reasoning benchmark for rigorous evaluation of Large Language Models.
JustLogic is highly complex, capable of generating a diverse range of linguistic patterns, vocabulary, and argument structures.
Our experimental results reveal that most state-of-the-art (SOTA) LLMs perform significantly worse than the human average.
arXiv Detail & Related papers (2025-01-24T15:49:10Z)
- Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning [1.3003982724617653]
Large Language Models (LLMs) have revolutionized natural language processing, yet they struggle with inconsistent reasoning.
This research introduces Proof of Thought, a framework that enhances the reliability and transparency of LLM outputs.
Key contributions include a robust type system with sort management for enhanced logical integrity, and an explicit representation of rules that clearly distinguishes factual from inferential knowledge.
arXiv Detail & Related papers (2024-09-25T18:35:45Z)
- A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning.
Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z)
- Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z)
- Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models.
arXiv Detail & Related papers (2023-10-03T00:57:26Z)
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and a "chain-of-thought" knowledge-distillation fine-tuning technique to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
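The mechanism is simple enough to sketch; only the re-reading idea comes from the paper, and the exact prompt wording below is an assumption:

```python
# Sketch of Re2-style prompt construction: present the input twice so the
# model re-reads the question before reasoning. Wording is illustrative.
def re2_prompt(question: str) -> str:
    return (f"{question}\n"
            f"Read the question again: {question}\n"
            f"Let's think step by step.")

print(re2_prompt("If Alice has 3 apples and buys 2 more, how many does she have?"))
```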
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
- Modeling and Reasoning in Event Calculus using Goal-Directed Constraint Answer Set Programming [8.677108656718824]
Event Calculus (EC) is a family of formalisms that model commonsense reasoning with a sound, logical basis.
Previous attempts to mechanize reasoning using EC faced difficulties in the treatment of the continuous change in dense domains.
We show how EC scenarios can be naturally and directly encoded in s(CASP) and how it enables deductive and abductive reasoning tasks.
arXiv Detail & Related papers (2021-06-28T10:43:25Z)