MuSLR: Multimodal Symbolic Logical Reasoning
- URL: http://arxiv.org/abs/2509.25851v1
- Date: Tue, 30 Sep 2025 06:42:20 GMT
- Title: MuSLR: Multimodal Symbolic Logical Reasoning
- Authors: Jundong Xu, Hao Fei, Yuhui Zhang, Liangming Pan, Qijun Huang, Qian Liu, Preslav Nakov, Min-Yen Kan, William Yang Wang, Mong-Li Lee, Wynne Hsu
- Abstract summary: Multimodal symbolic logical reasoning is critical in high-stakes applications such as autonomous driving and medical diagnosis. We introduce MuSLR, the first benchmark for multimodal symbolic logical reasoning grounded in formal logical rules. We propose LogiCAM, a modular framework that applies formal logical rules to multimodal inputs, boosting GPT-4.1's Chain-of-Thought performance by 14.13%.
- Score: 133.85551954182105
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multimodal symbolic logical reasoning, which aims to deduce new facts from multimodal input via formal logic, is critical in high-stakes applications such as autonomous driving and medical diagnosis, as its rigorous, deterministic reasoning helps prevent serious consequences. To evaluate such capabilities of current state-of-the-art vision language models (VLMs), we introduce the first benchmark MuSLR for multimodal symbolic logical reasoning grounded in formal logical rules. MuSLR comprises 1,093 instances across 7 domains, including 35 atomic symbolic logic and 976 logical combinations, with reasoning depths ranging from 2 to 9. We evaluate 7 state-of-the-art VLMs on MuSLR and find that they all struggle with multimodal symbolic reasoning, with the best model, GPT-4.1, achieving only 46.8%. Thus, we propose LogiCAM, a modular framework that applies formal logical rules to multimodal inputs, boosting GPT-4.1's Chain-of-Thought performance by 14.13%, and delivering even larger gains on complex logics such as first-order logic. We also conduct a comprehensive error analysis, showing that around 70% of failures stem from logical misalignment between modalities, offering key insights to guide future improvements. All data and code are publicly available at https://llm-symbol.github.io/MuSLR.
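To make the task concrete, here is a minimal, hypothetical sketch (not the LogiCAM implementation) of what applying a formal rule to facts from two modalities can look like: facts assumed to be extracted from an image and an accompanying text are combined by forward chaining over a Horn-clause rule, so the new fact is deduced rather than guessed.

```python
# Toy illustration only (not the LogiCAM code): symbolic facts are assumed to
# come from two modalities, and a formal rule derives a new fact by forward
# chaining over Horn clauses.

# Facts hypothetically extracted from an image and an accompanying report.
facts = {
    ("RedLight", "scene"),            # visual: the traffic light is red
    ("VehicleApproaching", "scene"),  # textual: a vehicle approaches the intersection
}

# Formal rule (propositionalized first-order rule):
# RedLight(x) AND VehicleApproaching(x) -> MustStop(x)
rules = [
    ({("RedLight", "scene"), ("VehicleApproaching", "scene")},
     ("MustStop", "scene")),
]

# Forward chaining: apply rules until no new facts are derived.
changed = True
while changed:
    changed = False
    for body, head in rules:
        if body <= facts and head not in facts:
            facts.add(head)
            changed = True

print(("MustStop", "scene") in facts)  # True: deduced deterministically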
Related papers
- LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification [24.91906506651266]
We introduce LogicGraph, the first benchmark aimed at systematically evaluating multi-path logical reasoning. The generation pipeline yields solver-verified, formally specified problems involving high-depth, multi-path reasoning. We also propose a reference-free evaluation framework to rigorously assess model performance in both convergent and divergent regimes.
arXiv Detail & Related papers (2026-02-24T16:04:26Z) - ChaosBench-Logic: A Benchmark for Logical and Symbolic Reasoning on Chaotic Dynamical Systems [0.0]
Large language models (LLMs) excel at natural language tasks but remain brittle in domains requiring precise logical and symbolic reasoning. Chaotic dynamical systems provide an especially demanding test because chaos is deterministic yet often misinterpreted as randomness or complexity. We introduce ChaosBench-Logic, a benchmark that evaluates LLM reasoning across 30 diverse dynamical systems.
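As background for why this is a good stress test (this example is generic and not drawn from the benchmark), the logistic map is a standard chaotic system: it is fully deterministic, yet trajectories from nearly identical seeds separate quickly, which is exactly the determinism-versus-randomness distinction at issue.

```python
# Standard textbook example (not from ChaosBench-Logic): the logistic map is
# fully deterministic, yet nearby initial conditions diverge rapidly, which is
# why chaotic behaviour is often mistaken for randomness.

def logistic_trajectory(x0, r=4.0, steps=30):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000)
b = logistic_trajectory(0.200001)  # perturb the seed by 1e-6

# Rerunning with the same seed reproduces a exactly (determinism), but the two
# perturbed trajectories separate within a few dozen iterations.
print(abs(a[-1] - b[-1]))
```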
arXiv Detail & Related papers (2026-01-05T10:36:40Z) - Training LLMs with LogicReward for Faithful and Rigorous Reasoning [75.30425553246177]
We propose LogicReward, a reward system that guides model training by enforcing step-level logical correctness with a theorem prover. An 8B model trained on data constructed with LogicReward surpasses GPT-4o and o4-mini by 11.6% and 2% on natural language inference and logical reasoning tasks.
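The abstract does not spell out how prover verdicts become a training signal; a minimal sketch, assuming a per-step verifier callable stands in for the theorem prover, might aggregate step validity into a scalar reward like this:

```python
# Minimal sketch of a step-level reward signal. The verifier callable is a
# hypothetical stand-in for the theorem prover used by LogicReward; the
# abstract does not specify the actual aggregation.

from typing import Callable, List

def step_level_reward(steps: List[str],
                      is_valid_step: Callable[[str], bool]) -> float:
    """Fraction of reasoning steps the verifier accepts (1.0 = fully valid chain)."""
    if not steps:
        return 0.0
    return sum(is_valid_step(s) for s in steps) / len(steps)

# Hypothetical usage with a stub verifier that only accepts the first step.
chain = ["All A are B. x is A. Therefore x is B.",   # valid (modus ponens)
         "x is B. Therefore x is A."]                # invalid (affirming the consequent)
print(step_level_reward(chain, lambda s: "Therefore x is B." in s))  # 0.5
```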
arXiv Detail & Related papers (2025-12-20T03:43:02Z) - LOGicalThought: Logic-Based Ontological Grounding of LLMs for High-Assurance Reasoning [33.30049437667383]
High-assurance reasoning requires conclusions that are accurate, verifiable, and grounded in evidence. This paper proposes a novel neurosymbolically-grounded architecture called LOGicalThought. It uses an advanced logical language and reasoner in conjunction with an LLM to construct a dual symbolic graph context and logic-based context.
arXiv Detail & Related papers (2025-10-02T00:06:23Z) - From Ambiguity to Verdict: A Semiotic-Grounded Multi-Perspective Agent for LLM Logical Reasoning [16.381034926435074]
LogicAgent is a semiotic-square-guided framework designed to jointly address logical complexity and semantic complexity. To overcome the semantic simplicity and low logical complexity of existing datasets, we introduce RepublicQA, a benchmark that reaches college-level difficulty. Experiments demonstrate that LogicAgent achieves state-of-the-art performance on RepublicQA, with a 6.25% average gain over strong baselines.
arXiv Detail & Related papers (2025-09-29T13:31:22Z) - Logic Unseen: Revealing the Logical Blindspots of Vision-Language Models [58.456656119178064]
Vision-Language Models (VLMs) have emerged as foundational for multimodal intelligence. However, their capacity for logical understanding remains significantly underexplored. We introduce LogicBench, a benchmark with over 50,000 vision-language pairs across 9 logical categories and 4 diverse scenarios. We propose LogicCLIP, a training framework designed to boost VLMs' logical sensitivity.
arXiv Detail & Related papers (2025-08-15T08:40:13Z) - Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus [13.276829763453433]
Large language models (LLMs) are capable of solving a wide range of tasks, yet they have struggled with reasoning. We propose Additional Logic Training (ALT), which aims to enhance LLMs' reasoning capabilities with program-generated logical reasoning samples.
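As a rough illustration of what "program-generated logical reasoning samples" can mean (the abstract does not specify ALT's actual generator), a toy script might emit transitive syllogisms over nonsense terms:

```python
# Illustrative sketch only: one way to program-generate logical reasoning
# samples. This is not ALT's actual generation procedure.
import random

ENTITIES = ["wumpus", "zorp", "quillet"]
PROPERTIES = ["blue", "heavy", "luminous"]

def make_syllogism_sample(rng: random.Random) -> dict:
    a, b = rng.sample(ENTITIES, 2)
    p = rng.choice(PROPERTIES)
    return {
        "premises": [f"Every {a} is a {b}.", f"Every {b} is {p}."],
        "conclusion": f"Every {a} is {p}.",
        "label": "entailed",  # valid by transitivity of the universal statements
    }

print(make_syllogism_sample(random.Random(0)))
```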
arXiv Detail & Related papers (2024-11-19T13:31:53Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
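For intuition, the symbolic half of such a pipeline can be pictured as follows; this is a hedged, propositional stand-in for the first-order prover paired with the language model, not LINC's actual code:

```python
# Sketch of the symbolic side of a neurosymbolic pipeline like LINC: the
# language model would translate premises into formulas, and a prover checks
# entailment. A brute-force propositional checker stands in for the
# first-order prover here (the tooling is an assumption for illustration).
from itertools import product

def entails(premises, conclusion, atoms):
    """True iff every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

# Premises: rain -> wet, rain.  Conclusion: wet.  (Propositionalized modus ponens.)
premises = [lambda e: (not e["rain"]) or e["wet"], lambda e: e["rain"]]
print(entails(premises, lambda e: e["wet"], ["rain", "wet"]))  # True
```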
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning [101.26814728062065]
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems.
This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving.
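A minimal sketch of the solver side of this kind of pipeline, under the assumption that the LLM has already translated a puzzle into explicit constraints (the concrete solvers Logic-LM integrates are not named in this summary):

```python
# Hedged illustration of the "symbolic solver" half of a Logic-LM-style
# pipeline: once constraints have been extracted, a solver (here, naive
# exhaustive search) finds consistent answers deterministically.
from itertools import permutations

people = ["Alice", "Bob", "Carol"]

# Constraints an LLM might have extracted: Alice is not first; Bob is after Carol.
def satisfies(order):
    return order.index("Alice") != 0 and order.index("Bob") > order.index("Carol")

solutions = [order for order in permutations(people) if satisfies(order)]
print(solutions)  # [('Carol', 'Alice', 'Bob'), ('Carol', 'Bob', 'Alice')]
```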
arXiv Detail & Related papers (2023-05-20T22:25:38Z) - Discourse-Aware Graph Networks for Textual Logical Reasoning [142.0097357999134]
Passage-level logical relations represent entailment or contradiction between propositional units (e.g., a concluding sentence).
We propose logic structural-constraint modeling to solve logical reasoning QA and introduce discourse-aware graph networks (DAGNs).
The networks first construct logic graphs leveraging in-line discourse connectives and generic logic theories, then learn logic representations by end-to-end evolving the logic relations with an edge-reasoning mechanism and updating the graph features; a small sketch of this graph-construction idea follows below.
arXiv Detail & Related papers (2022-07-04T14:38:49Z)
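Below is the small sketch of the logic-graph construction idea referenced above: nodes are clause-like units and edges carry the in-line discourse connectives that link them. The representation is assumed for illustration, not taken from the DAGN code.

```python
# Assumed representation (not the DAGN implementation): a logic graph whose
# nodes are clause-like units and whose edges are labelled with the in-line
# discourse connectives that link them.
connective_edges = [
    ("the tickets sold out", "because", "the venue is small"),
    ("the venue is small", "therefore", "fans should book early"),
]

graph = {}
for src, connective, dst in connective_edges:
    graph.setdefault(src, []).append((connective, dst))

# A downstream model would learn representations by propagating information
# along these connective-typed edges (the edge-reasoning step in the abstract).
for node, edges in graph.items():
    for connective, neighbour in edges:
        print(f"{node} --{connective}--> {neighbour}")
```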