Empowering LLMs with Logical Reasoning: A Comprehensive Survey
- URL: http://arxiv.org/abs/2502.15652v2
- Date: Mon, 24 Feb 2025 19:01:38 GMT
- Title: Empowering LLMs with Logical Reasoning: A Comprehensive Survey
- Authors: Fengxiang Cheng, Haoxuan Li, Fenrong Liu, Robert van Rooij, Kun Zhang, Zhouchen Lin
- Abstract summary: Large language models (LLMs) have achieved remarkable successes on various natural language tasks. Recent studies have found that significant challenges remain in the logical reasoning abilities of LLMs. This paper summarizes and categorizes the main challenges into two aspects.
- Score: 49.91445266392609
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved remarkable successes on various natural language tasks. However, recent studies have found that significant challenges remain in the logical reasoning abilities of LLMs. This paper summarizes and categorizes the main challenges into two aspects: (1) Logical question answering: LLMs often fail to generate the correct answer to complex logical problems that require sophisticated deductive, inductive, or abductive reasoning over a given collection of premises and constraints. (2) Logical consistency: LLMs are prone to producing responses that contradict themselves across different questions. For example, a state-of-the-art Macaw question-answering LLM answers "Yes" to both "Is a magpie a bird?" and "Does a bird have wings?" but answers "No" to "Does a magpie have wings?". To facilitate this research direction, we comprehensively investigate the most cutting-edge methods and propose detailed taxonomies of these methods. Specifically, to accurately answer complex logic questions, previous methods can be categorized based on their reliance on external solvers, prompts, pretraining, and fine-tuning. To avoid logical contradictions, we discuss concepts and solutions for various kinds of logical consistency, including implication, negation, transitivity, and factuality consistency, as well as their composites. In addition, we review commonly used benchmark datasets and evaluation metrics, and discuss promising research directions, such as extensions to modal logic to account for uncertainty, and efficient algorithms satisfying multiple logical consistencies simultaneously.
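To make the consistency failure concrete, here is a minimal sketch (my illustration, not from the survey) that checks implication consistency over a set of yes/no answers; the `answers` dict stands in for a hypothetical QA model such as Macaw, and the single rule encodes the magpie example above.

```python
# Minimal sketch: checking implication consistency of yes/no answers.
# The `answers` dict is a hypothetical stub for a QA model's outputs.

answers = {
    "Is a magpie a bird?": True,
    "Does a bird have wings?": True,
    "Does a magpie have wings?": False,  # the inconsistent answer
}

# Implication rules: if all premises are answered Yes, the conclusion must be Yes.
rules = [
    (["Is a magpie a bird?", "Does a bird have wings?"], "Does a magpie have wings?"),
]

def check_implication_consistency(answers, rules):
    """Return every rule whose premises are all Yes but whose conclusion is No."""
    violations = []
    for premises, conclusion in rules:
        if all(answers[p] for p in premises) and not answers[conclusion]:
            violations.append((premises, conclusion))
    return violations

for premises, conclusion in check_implication_consistency(answers, rules):
    print(f"Inconsistent: {premises} imply Yes, but got No for {conclusion!r}")
```

The same pattern extends to negation or transitivity consistency by adding rule types; satisfying several of them simultaneously and efficiently is one of the open directions the survey names.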
Related papers
- Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment [54.62926010621013]
We introduce a novel task, code reasoning, to provide a new perspective on the reasoning abilities of large language models.
We summarize three meta-benchmarks based on established forms of logical reasoning, and instantiate these into eight specific benchmark tasks.
We present a new pathway exploration pipeline inspired by the intricate ways humans solve problems.
arXiv Detail & Related papers (2025-02-17T10:39:58Z)
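A rough sketch of the hypothesis-decomposition-and-amendment loop described above, under stated assumptions: `llm_propose` and `run_test` are hypothetical stubs, and the control flow is my reading of the idea rather than the authors' pipeline.

```python
# Sketch of a hypothesis-decomposition-and-amendment loop (hypothetical stubs).

def llm_propose(task: str, feedback: str) -> list[str]:
    """Hypothetical LLM call: decompose the task into testable sub-hypotheses."""
    return [f"hypothesis about {task} given: {feedback}"]

def run_test(hypothesis: str) -> bool:
    """Hypothetical check, e.g. executing the code on an input and comparing outputs."""
    return "given: " in hypothesis  # placeholder success criterion

def reason_about_code(task: str, max_rounds: int = 3) -> list[str]:
    feedback = "no feedback yet"
    for _ in range(max_rounds):
        hypotheses = llm_propose(task, feedback)
        failed = [h for h in hypotheses if not run_test(h)]
        if not failed:                      # all sub-hypotheses confirmed
            return hypotheses
        feedback = f"{len(failed)} sub-hypotheses failed; amend them"
    return hypotheses

print(reason_about_code("what does sort_twice(xs) return?"))
```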
- SR-FoT: A Syllogistic-Reasoning Framework of Thought for Large Language Models Tackling Knowledge-based Reasoning Tasks [42.392103712958445]
Large Language Models (LLMs) might not follow the correct reasoning paths.
We propose a multi-stage Syllogistic-Reasoning Framework of Thought (SR-FoT).
Our SR-FoT begins by interpreting the question and then uses the interpretation and the original question to propose a suitable major premise.
arXiv Detail & Related papers (2025-01-20T17:00:41Z)
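The staged structure of SR-FoT lends itself to a short sketch. The following is a minimal rendering under assumptions: `ask` is a hypothetical LLM call and the stage prompts are mine, not the paper's; only the interpret-then-major-premise ordering follows the summary above.

```python
# Sketch of a staged syllogistic pipeline in the spirit of SR-FoT
# (hypothetical `ask` stub; the prompt wording is illustrative only).

def ask(prompt: str) -> str:
    """Hypothetical LLM call; returns the model's text answer."""
    return f"<answer to: {prompt[:40]}...>"

def syllogistic_answer(question: str) -> str:
    interpretation = ask(f"Interpret what this question is really asking: {question}")
    major = ask(f"Given the question {question!r} and interpretation "
                f"{interpretation!r}, state a general rule (major premise).")
    minor = ask(f"State the specific fact about the question's subject "
                f"(minor premise) that falls under: {major}")
    return ask(f"From major premise {major!r} and minor premise {minor!r}, "
               f"draw the conclusion that answers: {question}")

print(syllogistic_answer("Does a magpie have wings?"))
```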
- Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying [0.3659498819753633]
State-of-the-art Large Language Models (LLMs) continue to struggle with logical and mathematical reasoning.
This paper makes use of the notion of critical questions from the literature on argumentation theory, focusing in particular on Toulmin's model of argumentation.
We show that employing these critical questions can improve the reasoning capabilities of LLMs.
arXiv Detail & Related papers (2024-12-19T18:51:30Z)
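One way to picture the approach: draft an answer, interrogate it with Toulmin-flavored critical questions, then revise. The sketch below assumes a hypothetical `ask` stub, and the question templates are my paraphrase of the Toulmin components (grounds, warrant, rebuttal), not the paper's prompts.

```python
# Sketch: probing an LLM's answer with Toulmin-style critical questions
# (hypothetical stubs; templates are illustrative, not the paper's).

CRITICAL_QUESTIONS = [
    "What evidence (grounds) supports the claim?",
    "What general rule (warrant) links the evidence to the claim?",
    "Are there known exceptions (rebuttals) to that warrant?",
]

def ask(prompt: str) -> str:
    """Hypothetical LLM call."""
    return f"<answer: {prompt[:40]}...>"

def answer_with_critical_questions(problem: str) -> str:
    draft = ask(f"Answer step by step: {problem}")
    critique = [ask(f"Regarding the answer {draft!r}: {q}") for q in CRITICAL_QUESTIONS]
    return ask(f"Revise the answer {draft!r} to the problem {problem!r} "
               f"in light of these critiques: {critique}")

print(answer_with_critical_questions("All birds have wings; a magpie is a bird. So?"))
```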
- Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities [31.728976421529577]
We investigate the contrast between abstract and contextualized logical problems across a comprehensive set of domains.
We focus on standard propositional logic, specifically propositional deductive and abductive logic reasoning.
Our experiments aim to provide insights into disentangling context in logical reasoning and the true reasoning capabilities of LLMs.
arXiv Detail & Related papers (2024-06-04T21:25:06Z)
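Since the entry above turns on the deductive/abductive distinction, a toy propositional example may help; the rules and facts here are my own illustration, not the paper's benchmark format.

```python
# Sketch contrasting propositional deduction and abduction (toy example).

rules = {("rain",): "wet_grass"}          # rain -> wet_grass
facts = {"rain"}

# Deduction: apply rules forward from known facts to certain conclusions.
deduced = {c for ps, c in rules.items() if all(p in facts for p in ps)}
print("deduction:", deduced)              # {'wet_grass'}

# Abduction: given an observation, hypothesize premises that would explain it.
observation = "wet_grass"
explanations = [ps for ps, c in rules.items() if c == observation]
print("abduction:", explanations)         # [('rain',)] -- a hypothesis, not a proof
```

Contextualized variants of such problems wrap the same structure in domain prose (weather, medicine, law), which is exactly the contrast the paper studies.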
- LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models [52.03659714625452]
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks.
But, can they really "reason" over the natural language?
This question has been receiving significant research attention, and many reasoning skills, such as commonsense, numerical, and qualitative reasoning, have been studied.
arXiv Detail & Related papers (2024-04-23T21:08:49Z)
- Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs [102.37496443389203]
'Logic-Query-of-Thoughts' (LGOT) is the first of its kind to combine knowledge graph reasoning and large language models.
Our experimental findings demonstrate substantial performance enhancements, with up to 20% improvement over ChatGPT.
arXiv Detail & Related papers (2024-03-17T17:01:45Z)
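A minimal sketch of the general knowledge-graph-plus-LLM recipe, assuming a toy KG and a hypothetical `llm_candidates` fallback; this illustrates the combination, not the actual LGOT algorithm.

```python
# Sketch: answering a multi-hop conjunctive logic query by consulting a
# knowledge graph first and falling back to an LLM when the KG is incomplete.

KG = {("magpie", "is_a"): {"bird"}, ("bird", "has_part"): {"wings"}}

def llm_candidates(entity: str, relation: str) -> set[str]:
    """Hypothetical LLM call used when the KG has no matching edge."""
    return {f"<llm guess for {relation}({entity})>"}

def answer_hop(entities: set[str], relation: str) -> set[str]:
    out = set()
    for e in entities:
        out |= KG.get((e, relation), set()) or llm_candidates(e, relation)
    return out

# Query: what parts does a magpie have, via is_a then has_part?
step1 = answer_hop({"magpie"}, "is_a")
step2 = answer_hop(step1, "has_part")
print(step2)  # {'wings'}
```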
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs [95.07757789781213]
Two lines of approaches have been adopted for complex reasoning with LLMs.
One line of work prompts LLMs with various reasoning structures, whose structured outputs can naturally be regarded as intermediate reasoning steps.
The other line of work adopts LLM-free declarative solvers to do the reasoning task, achieving higher reasoning accuracy but lacking interpretability due to the black-box nature of the solvers.
We present a simple extension to the latter line of work. Specifically, we showcase that the intermediate search logs generated by Prolog interpreters can be accessed and interpreted into human-readable reasoning steps.
arXiv Detail & Related papers (2023-11-16T11:26:21Z)
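To see how a declarative solver's search log can double as a human-readable proof, here is a toy forward-chaining engine of my own (not the paper's Prolog instrumentation) that records each derivation step as it fires.

```python
# Sketch: a tiny forward-chaining solver whose derivation log doubles as a
# human-readable proof, in the spirit of reading off solver search logs.

facts = {"bird(magpie)"}
rules = [(["bird(magpie)"], "has_wings(magpie)")]

def solve(goal: str):
    """Derive all consequences of `facts` under `rules`, logging each step."""
    log, derived = [], set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                log.append(f"{' and '.join(premises)} |- {conclusion}")
                changed = True
    return goal in derived, log

ok, proof = solve("has_wings(magpie)")
print(ok)                      # True
for step in proof:
    print("because", step)     # each line is a readable reasoning step
```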
- Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z)
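A sketch of how solver traces might be turned into instruction-tuning pairs in the spirit of LoGiPT; the record format and the `solver_trace` stub are assumptions of mine, not the paper's actual data schema.

```python
# Sketch: turning a solver's normally invisible reasoning steps into
# instruction-tuning examples (hypothetical format and stub).

def solver_trace(premises: list[str], goal: str) -> list[str]:
    """Stand-in for a deductive solver's intermediate reasoning steps."""
    return [f"apply {p}" for p in premises] + [f"conclude {goal}"]

def make_example(premises: list[str], goal: str) -> dict:
    steps = solver_trace(premises, goal)
    return {
        "instruction": f"Premises: {premises}. Question: does {goal} hold?",
        "output": "\n".join(steps) + "\nAnswer: yes",
    }

dataset = [make_example(["bird(magpie)", "bird(X) -> has_wings(X)"],
                        "has_wings(magpie)")]
print(dataset[0]["output"])
```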
- Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate [19.887103433032774]
Large language models (LLMs) have shown impressive performance in complex reasoning tasks.
This work explores testing LLMs' reasoning by engaging with them in a debate-like conversation.
We find that, despite their impressive performance, LLMs like ChatGPT cannot maintain their belief in the truth for a significant portion of examples.
arXiv Detail & Related papers (2023-05-22T15:47:31Z)
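A minimal sketch of the debate-style probe, assuming a hypothetical `ask` stub; the challenge wording and the crude flip detection are mine, not the paper's protocol.

```python
# Sketch of a debate-style probe: give the model a correct answer, then press
# it with counter-arguments and see whether it holds its position.

def ask(prompt: str) -> str:
    """Hypothetical LLM call."""
    return "I now think the answer is different."  # a model that caves

def debate_probe(question: str, correct: str, rounds: int = 2) -> bool:
    answer = correct
    for _ in range(rounds):
        reply = ask(f"You said {answer!r} to {question!r}. "
                    f"I disagree; are you sure? Defend or revise your answer.")
        if "different" in reply.lower():   # crude check for a flipped answer
            return False                   # failed to maintain its belief
        answer = reply
    return True

print(debate_probe("Is 17 prime?", "Yes"))  # False: the stub model caved
```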