Related papers: A Syllogistic Probe: Tracing the Evolution of Logic Reasoning in Large Language Models

URL: http://arxiv.org/abs/2601.17426v1
Date: Sat, 24 Jan 2026 11:51:52 GMT
Title: A Syllogistic Probe: Tracing the Evolution of Logic Reasoning in Large Language Models
Authors: Zhengqing Zang, Yuqi Ding, Yanmei Gu, Changkai Song, Zhengkai Yang, Guoping Du, Junbo Zhao, Haobo Wang
Abstract summary: We explore whether large language models (LLMs) exhibit a similar evolution in the underlying logical framework. Using existential import as a probe, we evaluate syllogism under traditional and modern logic.
Score: 17.118221176971982
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human logic has gradually shifted from intuition-driven inference to rigorous formal systems. Motivated by recent advances in large language models (LLMs), we explore whether LLMs exhibit a similar evolution in the underlying logical framework. Using existential import as a probe, we evaluate syllogism under traditional and modern logic. Through extensive experiments testing SOTA LLMs on a new syllogism dataset, we report several findings: (i) Model size scaling promotes the shift toward modern logic; (ii) Thinking serves as an efficient accelerator beyond parameter scaling; (iii) the Base model plays a crucial role in determining how easily and stably this shift can emerge. Beyond these core factors, we conduct additional experiments for in-depth analysis of properties of current LLMs on syllogistic reasoning.
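To make the role of existential import concrete, here is a minimal sketch of how the same syllogism can be valid under the traditional reading and invalid under the modern one. It is not the paper's dataset or evaluation code. The names `holds`, `is_valid`, `darapti`, and `barbara` are hypothetical, and the sketch assumes the common formalization in which the traditional reading requires every term (S, M, P) to be non-empty, while the modern reading allows empty terms so that "All X are Y" is vacuously true when nothing is X.

```python
from itertools import product

# A "region" is a membership triple (in_S, in_M, in_P).
# A model is fully described by which of the 8 regions are inhabited.
REGIONS = list(product([False, True], repeat=3))
IDX = {"S": 0, "M": 1, "P": 2}


def holds(form, x, y, model):
    """Evaluate a categorical statement (A/E/I/O) about terms x and y."""
    xs = [r for r in model if r[IDX[x]]]  # inhabited regions inside x
    if form == "A":  # All x are y
        return all(r[IDX[y]] for r in xs)
    if form == "E":  # No x are y
        return not any(r[IDX[y]] for r in xs)
    if form == "I":  # Some x are y
        return any(r[IDX[y]] for r in xs)
    if form == "O":  # Some x are not y
        return any(not r[IDX[y]] for r in xs)
    raise ValueError(f"unknown form: {form}")


def is_valid(premises, conclusion, existential_import):
    """Search all 2**8 models for a counterexample to the syllogism."""
    for mask in product([False, True], repeat=len(REGIONS)):
        model = [r for r, keep in zip(REGIONS, mask) if keep]
        if existential_import:
            # Assumed traditional reading: skip models with an empty term.
            if not all(any(r[i] for r in model) for i in range(3)):
                continue
        premises_true = all(holds(*p, model) for p in premises)
        if premises_true and not holds(*conclusion, model):
            return False  # counterexample found
    return True


# Darapti: All M are P; All M are S; therefore Some S are P.
darapti = ([("A", "M", "P"), ("A", "M", "S")], ("I", "S", "P"))
# Barbara: All M are P; All S are M; therefore All S are P.
barbara = ([("A", "M", "P"), ("A", "S", "M")], ("A", "S", "P"))

for name, (prem, concl) in {"Darapti": darapti, "Barbara": barbara}.items():
    traditional = is_valid(prem, concl, existential_import=True)
    modern = is_valid(prem, concl, existential_import=False)
    print(f"{name}: traditional={traditional}, modern={modern}")
```

Darapti separates the two frameworks because its conclusion asserts existence while its premises are both universal; Barbara does not, since it has a universal conclusion. Enumerating inhabited regions rather than concrete domains keeps the search exhaustive for three terms.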

Related papers

Teaching Small Language Models to Learn Logic through Meta-Learning [4.923078123348596]
Small models (1.5B-7B) fine-tuned with meta-learning demonstrate strong gains in generalization. These meta-learned models outperform GPT-4o and o3-mini on our syllogistic reasoning task.
arXiv Detail & Related papers (2025-05-20T13:00:48Z)
A Survey of Scaling in Large Language Model Reasoning [62.92861523305361]
We provide a comprehensive examination of scaling in large language model (LLM) reasoning. We analyze scaling in reasoning steps that improves multi-step inference and logical consistency. We discuss scaling in training-enabled reasoning, focusing on optimization through iterative model improvement.
arXiv Detail & Related papers (2025-04-02T23:51:27Z)
Cognitive Activation and Chaotic Dynamics in Large Language Models: A Quasi-Lyapunov Analysis of Reasoning Mechanisms [6.375329734462518]
This paper proposes the "Cognitive Activation" theory, revealing the essence of Large Language Models' reasoning mechanisms. Experiments show that the model's information accumulation follows a nonlinear exponential law, and the Multilayer Perceptron (MLP) accounts for a higher proportion in the final output. This research provides a chaos theory framework for the interpretability of LLMs' reasoning and reveals potential pathways for balancing creativity and reliability in model design.
arXiv Detail & Related papers (2025-03-15T08:15:10Z)
LogiDynamics: Unraveling the Dynamics of Inductive, Abductive and Deductive Logical Inferences in LLM Reasoning [74.0242521818214]
This paper systematically investigates the comparative dynamics of inductive (System 1) versus abductive/deductive (System 2) inference in large language models (LLMs). We utilize a controlled analogical reasoning environment, varying modality (textual, visual, symbolic), difficulty, and task format (MCQ / free-text). Our analysis reveals System 2 pipelines generally excel, particularly in visual/symbolic modalities and harder tasks, while System 1 is competitive for textual and easier problems.
arXiv Detail & Related papers (2025-02-16T15:54:53Z)
Logical Reasoning in Large Language Models: A Survey [17.06712393613964]
This survey synthesizes recent advancements in logical reasoning in large language models (LLMs). It outlines the scope of logical reasoning in LLMs, its theoretical foundations, and the benchmarks used to evaluate reasoning proficiency. The review concludes with future directions, emphasizing the need for further exploration to strengthen logical reasoning in AI systems.
arXiv Detail & Related papers (2025-02-13T09:19:14Z)
JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models [51.99046112135311]
We introduce JustLogic, a synthetically generated deductive reasoning benchmark for rigorous evaluation of Large Language Models (LLMs). JustLogic is highly complex, capable of generating a diverse range of linguistic patterns, vocabulary, and argument structures. Our experimental results reveal that (i) state-of-the-art (SOTA) reasoning LLMs perform on par or better than the human average but significantly worse than the human ceiling.
arXiv Detail & Related papers (2025-01-24T15:49:10Z)
CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark. In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship. We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
Exploring Self-supervised Logic-enhanced Training for Large Language Models [59.227222647741094]
In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training. We devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter size ranging from 3 billion to 13 billion. The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM.
arXiv Detail & Related papers (2023-05-23T06:13:10Z)
LogiGAN: Learning Logical Reasoning via Adversarial Pre-training [58.11043285534766]
We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models. Inspired by the facilitation effect of reflective thinking in human learning, we simulate the learning-thinking process with an adversarial Generator-Verifier architecture. Both base and large size language models pre-trained with LogiGAN demonstrate obvious performance improvement on 12 datasets.
arXiv Detail & Related papers (2022-05-18T08:46:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.