Goal-Driven Reasoning in DatalogMTL with Magic Sets
        - URL: http://arxiv.org/abs/2412.07259v3
 - Date: Thu, 27 Feb 2025 14:13:20 GMT
 - Title: Goal-Driven Reasoning in DatalogMTL with Magic Sets
 - Authors: Shaoyu Wang, Kaiyue Zhao, Dongliang Wei, Przemysław Andrzej Wałęga, Dingmin Wang, Hongming Cai, Pan Hu
 - Abstract summary: DatalogMTL is a powerful rule-based language for temporal reasoning. We introduce a new reasoning method for DatalogMTL which exploits the magic sets technique. We show that the proposed approach significantly and consistently outperformed state-of-the-art reasoning techniques.
 - Score: 4.885086628404422
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   DatalogMTL is a powerful rule-based language for temporal reasoning. Due to its high expressive power and flexible modeling capabilities, it is suitable for a wide range of applications, including tasks from industrial and financial sectors. However, due to its high computational complexity, practical reasoning in DatalogMTL is highly challenging. To address this difficulty, we introduce a new reasoning method for DatalogMTL which exploits the magic sets technique -- a rewriting approach developed for (non-temporal) Datalog to simulate top-down evaluation with bottom-up reasoning. We have implemented this approach and evaluated it on publicly available benchmarks, showing that the proposed approach significantly and consistently outperformed state-of-the-art reasoning techniques. 
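The magic sets idea mentioned in the abstract can be illustrated on plain (non-temporal) Datalog: the program is rewritten with "magic" predicates that record which constants the query actually demands, so bottom-up evaluation only derives facts relevant to the goal. The sketch below is illustrative only and not taken from the paper; the toy transitive-closure program, the `edge` facts, and the `magic` set are all assumptions for the example.

```python
# A minimal sketch of the magic sets technique on plain Datalog,
# for a goal-directed query path(a, ?) over transitive closure.
# Program: path(X,Y) :- edge(X,Y).  path(X,Z) :- edge(X,Y), path(Y,Z).
# All names and facts below are illustrative, not from the paper.

EDGES = {("a", "b"), ("b", "c"), ("d", "e")}  # edge/2 facts

def magic_sets_path(start):
    # Magic predicate: first arguments of 'path' demanded by the query,
    # seeded with the query's bound argument.
    # Rewritten rule: magic_path(Y) :- magic_path(X), edge(X, Y).
    magic = {start}
    changed = True
    while changed:
        changed = False
        for (x, y) in EDGES:
            if x in magic and y not in magic:
                magic.add(y)
                changed = True
    # Bottom-up evaluation, restricted by the magic predicate:
    # path(X, Y) :- magic_path(X), edge(X, Y).
    # path(X, Z) :- magic_path(X), edge(X, Y), path(Y, Z).
    path = {(x, y) for (x, y) in EDGES if x in magic}
    changed = True
    while changed:
        changed = False
        for (x, y) in EDGES:
            if x in magic:
                for (y2, z) in list(path):
                    if y2 == y and (x, z) not in path:
                        path.add((x, z))
                        changed = True
    return {(x, y) for (x, y) in path if x == start}
```

Only constants reachable from the query's binding ever enter the computation, which is how the rewriting simulates top-down evaluation with bottom-up reasoning; the paper's contribution is lifting this rewriting to the temporal setting of DatalogMTL.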
 
       
      
        Related papers
        - Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications.
One core challenge of evaluation in the large language model (LLM) era is the generalization issue.
We propose Model Utilization Index (MUI), a mechanism interpretability enhanced metric that complements traditional performance scores.
arXiv  Detail & Related papers  (2025-04-10T04:09:47Z)
- JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models [51.99046112135311]
We introduce JustLogic, a synthetically generated deductive reasoning benchmark for rigorous evaluation of Large Language Models.
JustLogic is highly complex, capable of generating a diverse range of linguistic patterns, vocabulary, and argument structures.
Our experimental results reveal that most state-of-the-art (SOTA) LLMs perform significantly worse than the human average.
arXiv  Detail & Related papers  (2025-01-24T15:49:10Z)
- Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data [53.433309883370974]
This work explores the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance Large Language Models' reasoning capabilities.
Our experiments, conducted on two established natural language reasoning tasks, demonstrate that supervised fine-tuning with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.
arXiv  Detail & Related papers  (2024-09-19T03:39:09Z)
- Reliable Reasoning Beyond Natural Language [0.047888359248129786]
Large Language Models (LLMs) often exhibit limitations in their ability to reason reliably and flexibly.
We propose a neurosymbolic approach that prompts LLMs to extract and encode all relevant information from a problem statement as logical code statements.
We then use a logic programming language (Prolog) to conduct the iterative computations of explicit deductive reasoning.
arXiv  Detail & Related papers  (2024-07-16T04:34:18Z)
- MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time [51.5039731721706]
MindStar is a purely inference-based searching method for large language models.
It formulates reasoning tasks as searching problems and proposes two search ideas to identify the optimal reasoning paths.
It significantly enhances the reasoning abilities of open-source models, such as Llama-2-13B and Mistral-7B, and achieves comparable performance to GPT-3.5 and Grok-1.
arXiv  Detail & Related papers  (2024-05-25T15:07:33Z)
- Ontological Reasoning over Shy and Warded Datalog$+/-$ for Streaming-based Architectures (technical report) [6.689509223124273]
Datalog-based ontological reasoning systems adopt languages often grouped under the collective name Datalog$+/-$.
In this paper, we focus on two extremely promising, expressive, and tractable languages, namely Shy and Warded Datalog$+/-$.
We leverage their theoretical underpinnings to introduce novel reasoning techniques, technically, "chase variants", that are particularly fit for efficient reasoning in streaming-based architectures.
We then implement them in Vadalog, our reference streaming-based engine, to efficiently solve ontological reasoning tasks over real-world settings.
arXiv  Detail & Related papers  (2023-11-20T23:27:43Z)
- InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models [50.03163753638256]
Multi-modal Large Language Models (MLLMs) are increasingly prominent in the field of artificial intelligence.
Our benchmark comprises three key reasoning categories: deductive, abductive, and analogical reasoning.
We evaluate a selection of representative MLLMs using this rigorously developed open-ended multi-step elaborate reasoning benchmark.
arXiv  Detail & Related papers  (2023-11-20T07:06:31Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv  Detail & Related papers  (2023-10-24T17:59:20Z)
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess model performance.
arXiv  Detail & Related papers  (2023-10-02T01:00:50Z)
- Seminaive Materialisation in DatalogMTL [10.850687097496373]
DatalogMTL is an extension of Datalog with metric temporal operators.
We propose a materialisation-based procedure to minimise redundant computation.
Our experiments show that our optimised seminaive strategy for DatalogMTL is able to significantly reduce materialisation times.
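The seminaive strategy this entry refers to can be sketched on plain Datalog: each round joins the rules only against the facts derived in the previous round (the "delta"), rather than the whole materialisation, which avoids re-deriving known facts. The example below is a minimal illustrative sketch on transitive closure, not code from the paper; DatalogMTL's metric temporal operators add interval bookkeeping on top of this basic loop.

```python
# A minimal sketch of seminaive bottom-up materialisation for plain Datalog.
# Program: path(X,Y) :- edge(X,Y).  path(X,Z) :- edge(X,Y), path(Y,Z).
# The example program and fact names are illustrative only.

def seminaive_closure(edges):
    edges = set(edges)
    path = set(edges)    # path(X,Y) :- edge(X,Y).
    delta = set(edges)   # facts newly derived in the previous round
    while delta:
        new = set()
        # Seminaive step: join edge/2 only against delta, not all of 'path',
        # so each derivation uses at least one fact from the last round.
        for (x, y) in edges:
            for (y2, z) in delta:
                if y2 == y and (x, z) not in path:
                    new.add((x, z))
        path |= new
        delta = new      # next round only propagates genuinely new facts
    return path
```

The loop terminates when a round produces no new facts, at which point `path` is the full materialisation; the saving over naive evaluation is that old facts are never re-joined with each other.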
arXiv  Detail & Related papers  (2022-08-15T10:04:44Z)
- Linear Temporal Logic Modulo Theories over Finite Traces (Extended Version) [72.38188258853155]
This paper studies Linear Temporal Logic over Finite Traces (LTLf) in which proposition letters are replaced with first-order formulas interpreted over arbitrary theories.
The resulting logic, called LTLf Modulo Theories (LTLfMT), is semi-decidable.
arXiv  Detail & Related papers  (2022-04-28T17:57:33Z)
- MeTeoR: Practical Reasoning in Datalog with Metric Temporal Operators [12.145849273069627]
We present a novel approach for practical reasoning in DatalogMTL which combines materialisation (a.k.a. forward chaining) with automata-based techniques.
MeTeoR is a scalable system which enables reasoning over complex temporal rules and datasets involving millions of temporal facts.
arXiv  Detail & Related papers  (2022-01-12T17:46:18Z) 
        This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.