Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
- URL: http://arxiv.org/abs/2402.11442v3
- Date: Fri, 21 Jun 2024 02:36:55 GMT
- Title: Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
- Authors: Siyuan Wang, Zhongyu Wei, Yejin Choi, Xiang Ren,
- Abstract summary: Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic.
- Score: 87.34281749422756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially in compositional and structural complex rules with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and enhancing downstream reasoning. Through a multi-judger evaluation, our inference engine proves effective in generating accurate, complex and abstract conclusions and premises, and improve various commonsense reasoning tasks. Overall, our work sheds light on LLMs' limitations in grasping inferential rule and suggests ways to enhance their logical reasoning abilities~\footnote{Code and data are available at \url{https://github.com/SiyuanWangw/ULogic}.}.
Related papers
- LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models [52.03659714625452]
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks.
But, can they really "reason" over the natural language?
This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied.
arXiv Detail & Related papers (2024-04-23T21:08:49Z) - Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning technique to assess the performance of model.
arXiv Detail & Related papers (2023-10-02T01:00:50Z) - ChatRule: Mining Logical Rules with Large Language Models for Knowledge
Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs.
Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs.
To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z) - Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic [14.503982715625902]
We study a synthetic corpus based approach for language models (LMs) to acquire logical deductive reasoning ability.
We adopt a well-grounded set of deduction rules based on formal logic theory, which can derive any other deduction rules when combined in a multistep way.
We empirically verify that LMs trained on FLD corpora acquire more generalizable reasoning ability.
arXiv Detail & Related papers (2023-08-11T13:15:35Z) - Exploring Self-supervised Logic-enhanced Training for Large Language Models [59.227222647741094]
In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training.
We devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter size ranging from 3 billion to 13 billion.
The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM.
arXiv Detail & Related papers (2023-05-23T06:13:10Z) - RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs [91.71504177786792]
This paper studies learning logic rules for reasoning on knowledge graphs.
Logic rules provide interpretable explanations when used for prediction as well as being able to generalize to other tasks.
Existing methods either suffer from the problem of searching in a large search space or ineffective optimization due to sparse rewards.
arXiv Detail & Related papers (2020-10-08T14:47:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.